feat: prompt caching
Is your feature request related to a problem? Please describe.
The official Claude API supports prompt caching, but this Bedrock access gateway does not, meaning it can cost up to 90% more.
Describe the feature you'd like
Prompt caching - https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
Additional context
N/A
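For reference, a minimal sketch of what the requested feature looks like on the official Anthropic Messages API (per the docs linked above): a long shared prefix, here the system prompt, is marked with a `cache_control` block so repeated requests reuse the cached prefix instead of re-billing the full input tokens. The model ID and prompt text are illustrative placeholders.

```python
# Sketch of an Anthropic Messages API request using prompt caching.
# The "cache_control" marker on the system block designates the end of
# the cacheable prefix; subsequent requests with the same prefix hit
# the cache at a steep discount.
payload = {
    "model": "claude-3-5-sonnet-20241022",  # placeholder model ID
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are a helpful assistant. <large shared context here>",
            "cache_control": {"type": "ephemeral"},  # marks cacheable prefix
        }
    ],
    "messages": [{"role": "user", "content": "Summarize the context."}],
}
```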
Prompt caching in Bedrock is still in preview. May consider adding the support once it's GA.
any updates?
Seriously, this is a biggie. Prompt caching in Bedrock is GA. Without it, BAG is unusable (a 90% cost difference is a dealbreaker, c'mon). Can we get any info here? TIA.
@daixba Keep us posted chief ❤
GCP VertexAI with Anthropic models has worked with prompt caching for a long time (as have Anthropic, DeepSeek, OpenAI etc.), it's quite a bit faster than Bedrock at the moment as well so it's been a good alternative for me.
Honestly I don't know what's up with Bedrock as a product, it feels like AWS are throwing all their resources at trying to sell Amazon Q rather than concentrating on their core services.
Bumping this. This has a MAJOR impact on cost, and without it, using Bedrock via BAG is not affordable. Prompt caching should be fairly straightforward to implement for specific models like Nova and Claude: it requires only minor modifications to the requests to set the cache policy. Caching has been GA since May, and it's unclear why this is delayed. We have been in direct contact with our Amazon representatives and they are also unclear why this has been blocked for so long.
Implemented in https://github.com/aws-samples/bedrock-access-gateway/commit/b4800c54a05b15b6247662e8e66c3d80124b4f84
Awesome. Thank you!