Lesson 4 · 8 min
Prompt Caching
5-minute server-side cache that cuts tokens by 90% on hits.
Mark a content block as cacheable; subsequent calls that share that block read it for ~10% of input cost. Cache lives for 5 minutes. Architect prompts so static content is first and shared across calls — that's how you get hits.
Quick check
Prompt EngineeringSelect one
You're using JSON mode and want bulletproof structured output. What's the additional safety layer?
