New

New End User

Creates a new customer (end user).

Parameters:

  • user_id: str - The unique identifier for the user.

  • alias: Optional[str] - A human-friendly alias for the user.

  • blocked: bool - Flag to allow or disallow requests for this end-user. Default is False.

  • max_budget: Optional[float] - The maximum budget allocated to the user. Either 'max_budget' or 'budget_id' should be provided, not both.

  • budget_id: Optional[str] - The identifier for an existing budget allocated to the user. Either 'max_budget' or 'budget_id' should be provided, not both.

  • allowed_model_region: Optional[Union[Literal["eu"], Literal["us"]]] - Require all user requests to use models in this specific region.

  • default_model: Optional[str] - If no equivalent model exists in the allowed region, all requests default to this model.

  • metadata: Optional[dict] - Metadata for the customer; use it to store information about the customer. Example: metadata = {"data_training_opt_out": True}

  • budget_duration: Optional[str] - Budget is reset at the end of specified duration. If not set, budget is never reset. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").

  • tpm_limit: Optional[int] - [Not Implemented Yet] Specify tpm limit for a given customer (Tokens per minute)

  • rpm_limit: Optional[int] - [Not Implemented Yet] Specify rpm limit for a given customer (Requests per minute)

  • model_max_budget: Optional[dict] - [Not Implemented Yet] Specify max budget for a given model. Example: {"openai/gpt-4o-mini": {"max_budget": 100.0, "budget_duration": "1d"}}

  • max_parallel_requests: Optional[int] - [Not Implemented Yet] Specify max parallel requests for a given customer.

  • soft_budget: Optional[float] - [Not Implemented Yet] Sends alerts when the customer exceeds this budget; does not block requests.

The endpoint also allows specifying allowed regions and a default model (see allowed_model_region and default_model above).
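The two constraints stated in the parameter list (only one of max_budget or budget_id, and a budget_duration written as a number followed by s, m, h, or d) can be checked on the client before sending a request. The following is a minimal sketch, not the proxy's own validation: parse_duration and build_customer_payload are hypothetical helper names, and the accepted duration formats are assumed to be exactly the four listed above.

```python
import re
from datetime import timedelta
from typing import Any, Dict, Optional

# Unit suffixes taken from the budget_duration description above.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(value: str) -> timedelta:
    """Convert a string such as "30s", "30m", "30h" or "30d" to a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported budget_duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def build_customer_payload(
    user_id: str,
    max_budget: Optional[float] = None,
    budget_id: Optional[str] = None,
    budget_duration: Optional[str] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a /customer/new body and check it locally."""
    if max_budget is not None and budget_id is not None:
        raise ValueError("provide either max_budget or budget_id, not both")
    if budget_duration is not None:
        parse_duration(budget_duration)  # raises on an unrecognised format
    payload: Dict[str, Any] = {"user_id": user_id, **extra}
    optional_fields = (
        ("max_budget", max_budget),
        ("budget_id", budget_id),
        ("budget_duration", budget_duration),
    )
    for key, val in optional_fields:
        if val is not None:  # omit fields the caller did not set
            payload[key] = val
    return payload
```

For example, build_customer_payload("ishaan-jaff-3", budget_id="free_tier") returns a body with only user_id and budget_id, while passing both max_budget and budget_id raises ValueError before any request is made.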

Example curl:

curl --location 'http://0.0.0.0:4000/customer/new' \
    --header 'Authorization: Bearer sk-1234' \
    --header 'Content-Type: application/json' \
    --data '{
        "user_id": "ishaan-jaff-3",
        "allowed_model_region": "eu",
        "budget_id": "free_tier",
        "default_model": "azure/gpt-3.5-turbo-eu"
    }'

# Returns the end-user object.
# default_model: calls from this user fall back to this model when no equivalent model exists in the allowed region.

NOTE: This endpoint used to be called /end_user/new. The /end_user/XXX paths remain supported for compatibility.
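The same call can be issued from Python using only the standard library. This is a sketch under the assumptions of the curl example: the base URL and the sk-1234 key are the placeholder values shown there, and make_new_customer_request is a hypothetical helper name. The request is built but not sent, so the snippet runs without a proxy.

```python
import json
import urllib.request

# Values copied from the curl example; replace them with your own proxy URL and key.
BASE_URL = "http://0.0.0.0:4000"
API_KEY = "sk-1234"


def make_new_customer_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /customer/new request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/customer/new",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = make_new_customer_request(
    {"user_id": "ishaan-jaff-3", "budget_id": "free_tier"}
)
# To actually send it against a running proxy:
# with urllib.request.urlopen(request) as response:
#     customer = json.load(response)
```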

POST /customer/new

Authorization

Body:

  • budget_id (Budget Id)

  • max_budget (Max Budget) - Requests will fail if this budget (in USD) is exceeded.

  • soft_budget (Soft Budget) - Requests will NOT fail if this budget is exceeded, but an alert is fired.

  • max_parallel_requests (Max Parallel Requests) - Max concurrent requests allowed for this budget id.

  • tpm_limit (Tpm Limit) - Max tokens per minute allowed for this budget id.

  • rpm_limit (Rpm Limit) - Max requests per minute allowed for this budget id.

  • budget_duration (Budget Duration) - Max duration the budget should be set for (e.g. '1hr', '1d', '28d').

  • model_max_budget (Model Max Budget) - Max budget for each model (e.g. {'gpt-4o': {'max_budget': '0.0000001', 'budget_duration': '1d', 'tpm_limit': 1000, 'rpm_limit': 1000}})

  • user_id (User Id) - Required.

  • alias (Alias)

  • blocked (Blocked)

  • allowed_model_region (Allowed Model Region)

  • default_model (Default Model)
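The difference between max_budget and soft_budget described in the body fields can be illustrated with a small function. This is only a sketch of the documented semantics, not the proxy's implementation: budget_status is a hypothetical name, "exceeded" is assumed to mean strictly greater than, and soft_budget is marked as not yet implemented for customers in the parameter list.

```python
from typing import Optional


def budget_status(
    spend: float,
    max_budget: Optional[float] = None,
    soft_budget: Optional[float] = None,
) -> str:
    """Classify current spend (USD) against the hard and soft budgets."""
    if max_budget is not None and spend > max_budget:
        return "blocked"  # hard limit: requests fail
    if soft_budget is not None and spend > soft_budget:
        return "alert"  # soft limit: requests continue, an alert fires
    return "ok"
```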
Response

Successful Response

Body: any
Request
const response = await fetch('/customer/new', {
    method: 'POST',
    headers: {
      // Same bearer key as in the curl example; replace with your own.
      "Authorization": "Bearer sk-1234",
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      "user_id": "text"
    }),
});
const data = await response.json();
Response (validation error)
{
  "detail": [
    {
      "loc": [
        "text"
      ],
      "msg": "text",
      "type": "text"
    }
  ]
}
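A response body with this shape (a detail list whose entries carry loc, msg, and type) can be flattened into readable messages on the client. The sketch below assumes exactly the structure shown above; format_validation_errors is a hypothetical helper name, and the field values in the usage note are illustrative rather than real server output.

```python
from typing import Any, Dict, List


def format_validation_errors(body: Dict[str, Any]) -> List[str]:
    """Turn each entry of body["detail"] into a "location: message (type)" string."""
    messages = []
    for item in body.get("detail", []):
        # loc is a list of path segments; join them into a dotted path.
        location = ".".join(str(part) for part in item.get("loc", []))
        messages.append(f"{location}: {item.get('msg', '')} ({item.get('type', '')})")
    return messages
```

Given an entry with loc ["body", "user_id"], msg "field required", and type "value_error.missing", the function yields the single string "body.user_id: field required (value_error.missing)".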