LiveFR

Context caching (prompt caching)

Definition: Context caching reuses an already-processed portion of the prompt (instructions, documents) instead of recomputing it on each call, cutting latency and cost.

It helps when the same large context recurs across requests, such as a long document or a stable system prompt. Cached content has a limited lifetime: an entry that is not reused before it expires must be processed again in full.
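Example: the idea can be illustrated with a toy in-memory cache. This is a minimal sketch, not any provider's actual API. The names ContextCache, get_or_process, and expensive_process are hypothetical, and the "processing" step is a stand-in for the model's costly work on the shared context.

```python
import hashlib
import time


class ContextCache:
    """Toy cache keyed by a hash of the shared prompt context (hypothetical)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = {}  # key -> (processed_state, expiry_time)
        self.hits = 0
        self.misses = 0

    def get_or_process(self, context, process):
        """Return the processed form of `context`, reusing it while still fresh."""
        key = hashlib.sha256(context.encode("utf-8")).hexdigest()
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Missing or expired: pay the full processing cost and store the result.
        self.misses += 1
        state = process(context)
        self._entries[key] = (state, now + self.ttl_seconds)
        return state


def expensive_process(context):
    # Stand-in for the costly step; a real system would run the model here.
    return context.split()


cache = ContextCache(ttl_seconds=300)
system_prompt = "You are a helpful assistant. Answer using the attached manual."
cache.get_or_process(system_prompt, expensive_process)  # first call: processed
cache.get_or_process(system_prompt, expensive_process)  # second call: reused
print(cache.hits, cache.misses)  # prints "1 1"
```

Only the second call skips the processing step. Once ttl_seconds have passed, the next call with the same context counts as a miss and is processed again.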
