Support Prompt Caching with Vertex
To my knowledge, prompt caching isn't supported when using Claude on Vertex, either via the Messages API or the SDKs. Is there any ETA on when that will be added?
We use Claude through Vertex AI too; keeping an eye on this.
That's a bummer for us; we might need to switch models if this isn't supported soon.
+1. Any updates on enabling prompt caching for Claude available through Vertex AI?
It is now supported, documented here: https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude-prompt-caching
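For anyone landing here later, a minimal sketch using the `anthropic` Python SDK's Vertex client (install with `pip install "anthropic[vertex]"`). The project ID, region, model version, and the cached text are placeholders, so substitute your own:

```python
from anthropic import AnthropicVertex

# Placeholders: your GCP project and a region that serves Claude models.
client = AnthropicVertex(
    project_id="my-gcp-project",
    region="us-east5",
)

# Placeholder: the large, reusable prefix you want cached
# (e.g. a long system prompt or shared codebase context).
LARGE_SHARED_CONTEXT = "..."

# Mark the reusable prefix with cache_control so subsequent requests
# that reuse the same prefix read it from the cache at a fraction of
# the normal input-token price.
response = client.messages.create(
    model="claude-3-5-sonnet-v2@20241022",  # placeholder model version
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a coding assistant.\n" + LARGE_SHARED_CONTEXT,
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Summarize the repo structure."}],
)

# cache_creation_input_tokens / cache_read_input_tokens in the usage
# block show whether the prefix was written to or read from the cache.
print(response.usage)
```

Note that the cached prefix has to meet a minimum token length (1024 tokens for Sonnet-class models) before caching kicks in, so a short system prompt alone won't show any cache reads.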
This is absolutely critical for things like agentic coding, where prompt caching yields roughly a 98% cost saving.
Prompt caching works really well with Claude models when calling Anthropic's API directly, so this will be a bit of a blocker for a lot of folks wanting to use Claude on GCP, or using Claude models there and wondering why they're paying 98% more.