TTFT (time to first token)
Definition: TTFT (time to first token) is the delay between sending a request and receiving the first token of the response.
It is a key measure of perceived responsiveness, especially with streaming, where the user sees text arrive progressively. It depends on prompt length (the model must process the whole prompt before it can emit the first token), on the model itself, and on service load.
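
Example: the definition can be turned into a small measurement. The sketch below is illustrative only. It assumes the response arrives as a Python iterable of tokens; `measure_ttft` and `fake_stream` are hypothetical names made up for this example, and `fake_stream` merely simulates a service with `time.sleep` rather than calling a real API.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_ttft(
    start_stream: Callable[[], Iterable[str]],
) -> Tuple[Optional[float], float, int]:
    """Return (ttft_seconds, total_seconds, token_count) for one streamed request.

    start_stream is a hypothetical callable that sends the request and
    returns an iterable yielding tokens as they arrive.
    """
    start = time.perf_counter()  # the clock starts when the request is sent
    ttft: Optional[float] = None
    count = 0
    for _token in start_stream():
        if ttft is None:
            # First token has arrived: this elapsed time is the TTFT.
            ttft = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    return ttft, total, count  # ttft stays None if no token ever arrived


def fake_stream(
    first_delay: float = 0.05,
    gap: float = 0.01,
    tokens: Tuple[str, ...] = ("Hello", ",", " world"),
) -> Iterator[str]:
    """Simulated stream (an assumption for the demo, not a real service)."""
    time.sleep(first_delay)  # stands in for queueing plus prompt processing
    for i, tok in enumerate(tokens):
        if i:
            time.sleep(gap)  # stands in for the time between later tokens
        yield tok


if __name__ == "__main__":
    ttft, total, count = measure_ttft(fake_stream)
    print(f"TTFT: {ttft:.3f}s, total: {total:.3f}s, tokens: {count}")
```

With a real service, the callable passed to `measure_ttft` would wrap whatever streaming call the client library provides. In this sketch the timer starts just before that call, so network latency is included in the measured TTFT.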