New End User
Allow creating a new Customer
user_id: str - The unique identifier for the user.
alias: Optional[str] - A human-friendly alias for the user.
blocked: bool - Flag to allow or disallow requests for this end-user. Default is False.
max_budget: Optional[float] - The maximum budget allocated to the user. Either 'max_budget' or 'budget_id' should be provided, not both.
budget_id: Optional[str] - The identifier for an existing budget allocated to the user. Either 'max_budget' or 'budget_id' should be provided, not both.
allowed_model_region: Optional[Union[Literal["eu"], Literal["us"]]] - Require all user requests to use models in this specific region.
default_model: Optional[str] - If no equivalent model exists in the allowed region, all requests default to this model.
metadata: Optional[dict] - Metadata for the customer; use it to store information about the customer. Example: metadata = {"data_training_opt_out": True}
budget_duration: Optional[str] - Budget is reset at the end of specified duration. If not set, budget is never reset. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
tpm_limit: Optional[int] - [Not Implemented Yet] Specify tpm limit for a given customer (Tokens per minute)
rpm_limit: Optional[int] - [Not Implemented Yet] Specify rpm limit for a given customer (Requests per minute)
model_max_budget: Optional[dict] - [Not Implemented Yet] Specify max budget for a given model. Example: {"openai/gpt-4o-mini": {"max_budget": 100.0, "budget_duration": "1d"}}
max_parallel_requests: Optional[int] - [Not Implemented Yet] Specify max parallel requests for a given customer.
soft_budget: Optional[float] - [Not Implemented Yet] Get alerts when customer crosses given budget, doesn't block requests.
Allow specifying allowed regions
Allow specifying default model
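The constraints listed above (either 'max_budget' or 'budget_id' but not both, and a region of "eu" or "us") can be checked on the client before a request is sent. The following is a minimal sketch, not the server's own validation; the helper name build_customer_payload and the ALLOWED_REGIONS constant are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Optional

# Regions accepted by allowed_model_region, per the parameter list.
ALLOWED_REGIONS = {"eu", "us"}


def build_customer_payload(
    user_id: str,
    max_budget: Optional[float] = None,
    budget_id: Optional[str] = None,
    allowed_model_region: Optional[str] = None,
    default_model: Optional[str] = None,
    blocked: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a request body and check documented constraints."""
    # The parameter list says only one of these two may be provided.
    if max_budget is not None and budget_id is not None:
        raise ValueError("Provide either 'max_budget' or 'budget_id', not both.")
    if allowed_model_region is not None and allowed_model_region not in ALLOWED_REGIONS:
        raise ValueError("allowed_model_region must be 'eu' or 'us'.")

    payload: Dict[str, Any] = {"user_id": user_id, "blocked": blocked}
    optional_fields = {
        "max_budget": max_budget,
        "budget_id": budget_id,
        "allowed_model_region": allowed_model_region,
        "default_model": default_model,
    }
    # Send only the optional fields that were actually set.
    payload.update({k: v for k, v in optional_fields.items() if v is not None})
    return payload
```

Rejecting the conflicting combination locally gives a clearer error than waiting for the server to refuse the request.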
Example curl:
curl --location '' --header 'Authorization: Bearer sk-1234' --header 'Content-Type: application/json' --data '{
"user_id": "ishaan-jaff-3",
"allowed_model_region": "eu",
"budget_id": "free_tier",
"default_model": "azure/gpt-3.5-turbo-eu"
}'
With default_model set, requests from this user default to that model when no equivalent model exists in the allowed region.
Returns the end-user object.
NOTE: This endpoint used to be called /end_user/new. Compatibility with /end_user/XXX will be maintained for these endpoints.
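The same request can be prepared from Python using only the standard library. This is a sketch under stated assumptions: the endpoint URL is not given in this section, so the function takes it as an argument, and the address used in the usage lines (including the /customer/new path, inferred from the rename mentioned in the note) is a placeholder rather than a confirmed value. The function name make_new_customer_request is hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def make_new_customer_request(
    endpoint_url: str, api_key: str, payload: Dict[str, Any]
) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) the POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder URL and path: substitute the address of your own deployment.
request = make_new_customer_request(
    "http://localhost:4000/customer/new",
    "sk-1234",
    {"user_id": "ishaan-jaff-3", "budget_id": "free_tier"},
)
# Passing `request` to urllib.request.urlopen would send it and return the response.
```

Building the request separately from sending it makes the headers and body easy to inspect before anything goes over the network.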
max_budget - Requests will fail if this budget (in USD) is exceeded.
soft_budget - Requests will NOT fail if this is exceeded, but an alert will fire.
max_parallel_requests - Max concurrent requests allowed for this budget id.
tpm_limit - Max tokens per minute allowed for this budget id.
rpm_limit - Max requests per minute allowed for this budget id.
budget_duration - Max duration the budget should be set for (e.g. '1h', '1d', '28d')
model_max_budget - Max budget for each model (e.g. {'gpt-4o': {'max_budget': '0.0000001', 'budget_duration': '1d', 'tpm_limit': 1000, 'rpm_limit': 1000}})
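Duration strings such as those accepted by budget_duration follow the pattern described in the parameter list: an integer followed by a unit for seconds ("30s"), minutes ("30m"), hours ("30h"), or days ("30d"). The sketch below shows one way to convert such a string to seconds. It is an illustration only, not the server's parser; the function name duration_to_seconds is hypothetical, and any units beyond the four documented ones are deliberately rejected.

```python
import re

# Seconds per unit for the four documented suffixes.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_to_seconds(duration: str) -> int:
    """Hypothetical helper: convert '30s', '30m', '30h' or '30d' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", duration.strip())
    if match is None:
        raise ValueError(f"Unsupported duration format: {duration!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


print(duration_to_seconds("30m"))  # 1800
print(duration_to_seconds("1d"))   # 86400
```

Validating the string on the client catches malformed values before they are stored in a budget.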
Successful Response