spring-ai

feat(anthropic): add support for prompt caching

Open tzolov opened this issue 7 months ago • 1 comments

NOTE: This is a rebased version of the original https://github.com/spring-projects/spring-ai/pull/1413 PR by @Claudio-code.

@Claudio-code is the original author of this PR:

Implements Anthropic's prompt caching feature to improve token efficiency.

  • Adds cache control support in AnthropicApi and AnthropicChatModel
  • Creates AnthropicCacheType enum with EPHEMERAL cache type
  • Extends AbstractMessage and UserMessage to support cache parameters
  • Updates Usage tracking to include cache-related token metrics
  • Adds integration test to verify prompt caching functionality

This implementation follows Anthropic's prompt caching API (beta header prompt-caching-2024-07-31), which allows for more efficient token usage by caching frequently used prompts.
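As a rough illustration of the pieces listed above, here is a minimal sketch of a text content block tagged with cache control, plus usage counters that include the cache-related token fields. The class, enum, and method names (PromptCacheSketch, CacheType, textBlock, Usage) are hypothetical and are not the types added by this PR. The sketch assumes the "cache_control" object with type "ephemeral" on the wire, and that uncached input tokens are reported separately from cache writes and cache reads.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical names for illustration only; not the classes added by this PR.
final class PromptCacheSketch {

    // Mirrors the idea of a cache-type enum with a single EPHEMERAL value.
    enum CacheType {
        EPHEMERAL;

        // Lowercase value assumed for the "type" field of "cache_control".
        String wireValue() {
            return name().toLowerCase(Locale.ROOT);
        }
    }

    // Builds a text content block, optionally tagged as a cache breakpoint.
    static Map<String, Object> textBlock(String text, CacheType cacheType) {
        Map<String, Object> block = new LinkedHashMap<>();
        block.put("type", "text");
        block.put("text", text);
        if (cacheType != null) {
            block.put("cache_control", Map.of("type", cacheType.wireValue()));
        }
        return block;
    }

    // Usage counters, including the two cache-related token counts.
    record Usage(int inputTokens, int outputTokens,
                 int cacheCreationInputTokens, int cacheReadInputTokens) {

        // Prompt-side total: uncached input plus cache writes plus cache reads
        // (assumes the three counters do not overlap).
        int totalPromptTokens() {
            return inputTokens + cacheCreationInputTokens + cacheReadInputTokens;
        }
    }
}
```

Passing null as the cache type leaves the block untagged, so the same helper covers both cached and uncached content.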

tzolov, Apr 08 '25 09:04

Hello, it looks right to me. Thank you for updating the branch.

Claudio-code, Apr 24 '25 02:04

@tzolov @Claudio-code Can we merge this?

zhahaoyu, Jul 29 '25 05:07

This implementation has a bug: the cache control needs to be added only to the last message/tool in the list. You would need to make this change before it can be merged. See https://github.com/langchain4j/langchain4j/pull/3337

Claudio-code, Aug 07 '25 23:08

Without this fix, users will hit the cache limit sooner than they should, which causes an error from the Anthropic API.

Claudio-code, Aug 07 '25 23:08

@Claudio-code Thanks for the ping. I am reviewing the PR right now and trying to merge it. Does the bug you mentioned apply to user messages? The langchain4j code you pointed to adds the cache to the last message for system messages. Thanks!

sobychacko, Aug 08 '25 00:08

@Claudio-code Are you planning to address this fix in this PR? In that case, feel free to take this PR and rebase against the latest main and then add your fix. Either way, please let us know here. Thanks!

sobychacko, Aug 08 '25 00:08

That's because the people there didn't want to add support for user messages, given the cache limit of at most 4 cached items. The bug affects all messages and tools that can be cached.
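A minimal sketch of the fix being discussed, under stated assumptions: Msg, markLastOnly, and countBreakpoints are hypothetical names, not Spring AI or langchain4j types, and the limit of 4 is the figure quoted in this thread. The idea is that tagging every message creates one cache breakpoint per message and quickly exceeds the limit, whereas tagging only the last message uses a single breakpoint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; not the Spring AI message classes.
final class LastMessageCacheSketch {

    // A message plus a flag saying whether it carries a cache marker.
    record Msg(String role, String text, boolean cached) {}

    // Maximum number of cache breakpoints per request, as quoted in the thread.
    static final int MAX_CACHE_BREAKPOINTS = 4;

    // Returns a copy in which only the final message carries the cache marker.
    static List<Msg> markLastOnly(List<Msg> messages) {
        List<Msg> out = new ArrayList<>(messages.size());
        for (int i = 0; i < messages.size(); i++) {
            Msg m = messages.get(i);
            boolean isLast = (i == messages.size() - 1);
            out.add(new Msg(m.role(), m.text(), isLast));
        }
        return out;
    }

    // Counts how many messages would be sent with a cache marker.
    static long countBreakpoints(List<Msg> messages) {
        return messages.stream().filter(Msg::cached).count();
    }

    // True when the request stays within the breakpoint limit.
    static boolean withinLimit(List<Msg> messages) {
        return countBreakpoints(messages) <= MAX_CACHE_BREAKPOINTS;
    }
}
```

With six messages all flagged as cached, withinLimit returns false. After markLastOnly, exactly one breakpoint remains and the check passes. The same approach would apply to a list of tool definitions.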

Claudio-code, Aug 09 '25 22:08

I made a fix: https://github.com/spring-projects/spring-ai/pull/4100

However, this implementation doesn't support cached system messages. Adding that support requires a structural change, because the system message is currently just a string and would need to be a ContentBlock instance.
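To make the structural difference concrete, here is a small sketch of the two request shapes, assuming the Anthropic Messages API accepts "system" either as a plain string or as a list of text blocks. SystemPromptShapeSketch and its methods are hypothetical names; they do not reflect how AnthropicApi models the request.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical request builders for illustration; not the AnthropicApi types.
final class SystemPromptShapeSketch {

    // System prompt as a plain string: there is nowhere to attach cache control.
    static Map<String, Object> withStringSystem(String system) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("system", system);
        return body;
    }

    // System prompt as a list of content blocks: the block can carry cache control.
    static Map<String, Object> withCachedSystem(String system) {
        Map<String, Object> block = new LinkedHashMap<>();
        block.put("type", "text");
        block.put("text", system);
        block.put("cache_control", Map.of("type", "ephemeral"));

        Map<String, Object> body = new LinkedHashMap<>();
        body.put("system", List.of(block));
        return body;
    }
}
```

Only the second shape has a place for the cache marker, which is why a string-typed system message cannot be cached without changing its type.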

Claudio-code, Aug 10 '25 03:08

New PR issued: https://github.com/spring-projects/spring-ai/pull/4199.

sobychacko, Aug 21 '25 01:08