Messages API, Tool Use & Architecture

20% of the exam

Messages API, tool use, streaming, error handling and production patterns.

The Messages API

  • Conversation = alternating list {role: user|assistant, content}. The system prompt is a separate parameter.
  • Key params: model, max_tokens (required), temperature, stop_sequences, system, tools.
  • Streaming (SSE) returns token-by-token for perceived latency.

Tool use

  • Declare tools (name, description, input_schema JSON Schema). Claude returns a tool_use block.
  • Loop: Claude → tool_use → your code runs it → you return tool_result → Claude continues.
  • tool_choice forces/frees usage; tools guarantee structured outputs.

Robustness

  • Handle 429 (rate limit) with exponential backoff + jitter; 529 (overloaded) with retry.
  • Timeouts, request-id logging, graceful degradation, idempotency and caching.

Practice — 10 questions

0/10 answered
  1. 1. Where do role/persistent rules go in the Messages API?
  2. 2. Correct tool-use cycle?
  3. 3. Guarantee reliable structured extraction?
  4. 4. The API returns 429s. Correct strategy?
  5. 5. Which parameter is required on every call?
  6. 6. Output is cut and stop_reason is 'max_tokens'. What to do?
  7. 7. Who is responsible for keeping conversation history?
  8. 8. Error 529 (overloaded). Appropriate reaction?
  9. 9. Can Claude request multiple tool calls in a single turn?
  10. 10. Best approach to reduce a chat's perceived latency?

← Back to the Academy · Mock exam →