Rate limit
Definition: A rate limit caps the number of requests or tokens a client can send to an AI API over a given period.
It protects the service from overload and shares capacity across users. When it is hit, the API returns a dedicated error; applications handle it with spaced retries and queues.