spring-ai
feat(anthropic): add support for prompt caching
NOTE: This is a rebased version of the original https://github.com/spring-projects/spring-ai/pull/1413 PR by @Claudio-code.
@Claudio-code is the original author of this PR.
Implements Anthropic's prompt caching feature to improve token efficiency.
- Adds cache control support in AnthropicApi and AnthropicChatModel
- Creates AnthropicCacheType enum with EPHEMERAL cache type
- Extends AbstractMessage and UserMessage to support cache parameters
- Updates Usage tracking to include cache-related token metrics
- Adds integration test to verify prompt caching functionality
This implementation follows Anthropic's prompt caching API (beta-2024-07-31), which allows for more efficient token usage by caching frequently used prompts.
Hello, it seems right to me, thank you for updating the branch.
@tzolov @Claudio-code Can we merge this?
This implementation has a bug: the cache control needs to be added only to the last message/tool in the list. You would need to make this change before this can be merged. See https://github.com/langchain4j/langchain4j/pull/3337
Without this fix, users will reach the cache limit faster than they should, and this will cause an error from the Anthropic API.
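To illustrate the point above, here is a minimal sketch of the "mark only the last item" rule. `Block`, `markLastOnly`, and `countCached` are hypothetical names made up for this example; they are not Spring AI or Anthropic types, and the actual fix may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

public class CacheBreakpointSketch {

    // Hypothetical stand-in for a message or tool entry; not a Spring AI type.
    record Block(String text, String cacheControl) {
        boolean cached() {
            return cacheControl != null;
        }
    }

    // Returns a copy in which only the final block carries a cache marker.
    static List<Block> markLastOnly(List<Block> blocks) {
        List<Block> result = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i++) {
            boolean last = (i == blocks.size() - 1);
            result.add(new Block(blocks.get(i).text(), last ? "ephemeral" : null));
        }
        return result;
    }

    // Counts how many blocks carry a cache marker.
    static long countCached(List<Block> blocks) {
        return blocks.stream().filter(Block::cached).count();
    }

    public static void main(String[] args) {
        // Marking every block, as the buggy version effectively does.
        List<Block> everyBlockMarked = new ArrayList<>();
        for (int i = 1; i <= 6; i++) {
            everyBlockMarked.add(new Block("message " + i, "ephemeral"));
        }
        List<Block> fixed = markLastOnly(everyBlockMarked);
        System.out.println("cached items before: " + countCached(everyBlockMarked));
        System.out.println("cached items after: " + countCached(fixed));
    }
}
```

With six messages, marking every one exceeds the limit of 4 cached items mentioned in this thread, while marking only the last one always yields a single cached item regardless of list length.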
@Claudio-code Thanks for the ping. I am reviewing the PR right now and trying to merge. Is the bug you mentioned applicable to the user messages? The langchain code you pointed out is adding the last message for system messages. Thanks!
@Claudio-code Are you planning to address this fix in this PR? In that case, feel free to take this PR and rebase against the latest main and then add your fix. Either way, please let us know here. Thanks!
That's because the people there didn't want to add support for user messages, given the cache limitation of up to 4 cached items. The bug affects all messages and tools that can have cache control.
I made a fix: https://github.com/spring-projects/spring-ai/pull/4100
However, this implementation doesn't support cached system messages. Adding that support requires changing the structure, because currently the system message is just a string and would need to be a ContentBlock instance.
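As a rough sketch of why the structure has to change: a plain string has nowhere to attach cache control, while a block-shaped system prompt does. The field names (`type`, `text`, `cache_control`) follow my understanding of Anthropic's Messages API request format; the class and method names are hypothetical and do not reflect Spring AI's actual ContentBlock type.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SystemBlockSketch {

    // Plain-string form: there is no field where cache control could go.
    static Object systemAsString(String prompt) {
        return prompt;
    }

    // Block form: a list with one text block that carries a cache_control entry.
    static List<Map<String, Object>> systemAsBlocks(String prompt) {
        Map<String, Object> block = new LinkedHashMap<>();
        block.put("type", "text");
        block.put("text", prompt);
        block.put("cache_control", Map.of("type", "ephemeral"));
        return List.of(block);
    }

    public static void main(String[] args) {
        String prompt = "You are a helpful assistant.";
        System.out.println(systemAsString(prompt));
        System.out.println(systemAsBlocks(prompt));
    }
}
```

The maps here only model the shape of the request payload; a real implementation would use typed classes and a JSON serializer.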
New PR issued: https://github.com/spring-projects/spring-ai/pull/4199.