bedrock-access-gateway icon indicating copy to clipboard operation
bedrock-access-gateway copied to clipboard

feat: prompt caching

Open sammcj opened this issue 1 year ago • 1 comments

Is your feature request related to a problem? Please describe.

The official Claude API supports prompt caching, but using this bedrock access gateway does not meaning it costs up to 90% more.

Describe the feature you'd like

Prompt caching - https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching

Additional context

N/A

sammcj avatar Sep 10 '24 23:09 sammcj

Prompt caching in Bedrock is still in preview. May consider adding the support once it's GA.

daixba avatar Dec 18 '24 01:12 daixba

any updates?

sxthunder avatar May 27 '25 14:05 sxthunder

Seriously this is a biggie. Prompt caching in bedrock is GA. This makes BAG unusable (90% cost difference is unusable, c'mon). Can we get any info here? TIA.

@daixba Keep us posted chief ❤

IliaZenkov avatar May 29 '25 22:05 IliaZenkov

GCP VertexAI with Anthropic models has worked with prompt caching for a long time (as have Anthropic, DeepSeek, OpenAI etc.), it's quite a bit faster than Bedrock at the moment as well so it's been a good alternative for me.

Honestly I don't know what's up with Bedrock as a product, it feels like AWS are throwing all their resources at trying to sell Amazon Q rather than concentrating on their core services.

sammcj avatar May 29 '25 22:05 sammcj

Bumping this. This has a MAJOR impact on cost and without this using Bedrock via BAG is not affordable. Prompt caching should be fairly straight forward to implement for specific models like Nova and Claude using minor modifications to the requests to set the cache policy. Caching has been available since May and unclear why this is delayed. We have been in direct contact with our Amazon representatives and they are also unclear why this is blocked for so long.

jamesottera avatar Oct 09 '25 23:10 jamesottera

implemented in https://github.com/aws-samples/bedrock-access-gateway/commit/b4800c54a05b15b6247662e8e66c3d80124b4f84

zxkane avatar Oct 15 '25 03:10 zxkane

Awesome. Thank you!

jamesottera avatar Oct 15 '25 05:10 jamesottera