Throughput
Definition: Throughput measures how much work an AI system processes per unit of time, for example in tokens per second or requests per second.
It complements latency: a service can answer one user fast (low latency) while serving many parallel requests (high throughput). It depends on the model, hardware and batching.