Support Prompt Caching for Anthropic

Open MichaelDoyle opened this issue 1 year ago • 3 comments

This one is really interesting. Not sure of the best way to go about it (extending Genkit's message abstraction, middleware, etc.). Worth additional discussion.

See: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching

MichaelDoyle avatar Aug 24 '24 01:08 MichaelDoyle

@MichaelDoyle is this available through Anthropic on Vertex AI? If not, we can make this possible to implement in general, but the actual change will need to happen in the community plugin repo.

chrisraygill avatar Sep 05 '24 17:09 chrisraygill

Agreed - I suspect there is a brief discussion that needs to happen on "how" we want this to be supported in the core library, and then implementation details fall into the plugins.

When I get some free cycles I can look closer at Vertex. I suspect it might be supported in the Anthropic SDK but not Vertex AI SDK. That's how it appears to me based on the change log: https://github.com/anthropics/anthropic-sdk-typescript/compare/vertex-sdk-v0.4.0...vertex-sdk-v0.4.1

MichaelDoyle avatar Sep 06 '24 14:09 MichaelDoyle

This would be a huge cost saving, especially during development and evaluations. Highly recommend supporting this!

nalin avatar Mar 28 '25 14:03 nalin

We can't adopt Genkit because it lacks this. Used properly, prompt caching cuts Anthropic costs severalfold; without it, our bills would be 2-3 times higher.

We can write some code for this on our side if needed, but what is the proper way to do it?

auzhva avatar Oct 05 '25 16:10 auzhva
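
For context on what a workaround would have to produce: per the Anthropic docs linked above, caching is requested by attaching `cache_control: { type: "ephemeral" }` to a content block in the request. The following is only a rough sketch of a middleware-style transform at that level. The types and the `withCachedSystemPrompt` helper are hypothetical, declared locally for illustration; nothing here is a Genkit or plugin API, and how (or whether) a plugin exposes the raw request is an open question in this thread.

```typescript
// Simplified local types that mirror the shape of an Anthropic Messages API
// request. Illustrative assumptions only; not imported from any SDK or Genkit.
type CacheControl = { type: "ephemeral" };

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

interface MessagesRequest {
  model: string;
  max_tokens: number;
  system?: string | TextBlock[];
  messages: { role: "user" | "assistant"; content: string | TextBlock[] }[];
}

// Hypothetical helper: returns a copy of the request in which the last system
// block carries a cache breakpoint. The input request is not mutated.
function withCachedSystemPrompt(req: MessagesRequest): MessagesRequest {
  if (req.system === undefined) return req;

  // Normalize a plain-string system prompt into block form so it can be tagged.
  const blocks: TextBlock[] =
    typeof req.system === "string"
      ? [{ type: "text", text: req.system }]
      : req.system;
  if (blocks.length === 0) return req;

  const last = blocks.length - 1;
  const marked: TextBlock[] = blocks.map((block, i) =>
    i === last
      ? { ...block, cache_control: { type: "ephemeral" as const } }
      : { ...block }
  );
  return { ...req, system: marked };
}

// Usage: the model id and prompt text below are placeholders.
const request = withCachedSystemPrompt({
  model: "your-model-id",
  max_tokens: 1024,
  system: "Long, stable instructions that are reused across calls...",
  messages: [{ role: "user", content: "First question" }],
});
```

Marking only the last system block is a deliberate simplification: it targets the long, stable part of the prompt, which is where the development and evaluation savings mentioned above would come from. Tool definitions and message content can carry the same marker, and the linked docs describe the limits that apply.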