Cache turn-by-turn conversation in Anthropic Claude
Description
This implements turn-by-turn conversation caching for Claude, in addition to the pre-existing system message caching. See: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
Perhaps we should enable this by default?
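The mechanism can be sketched roughly as follows — a minimal, hypothetical helper (names and shapes are illustrative, not the actual `Anthropic.ts` code) that marks the last user message in the conversation with Anthropic's `cache_control: { type: "ephemeral" }` breakpoint, so that on the next turn the prefix up to that point can be read back from the cache instead of being reprocessed:

```typescript
// Sketch of turn-by-turn conversation caching for the Anthropic Messages
// API. The helper name and interfaces here are illustrative only.

interface TextPart {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface ChatMessage {
  role: "user" | "assistant";
  content: TextPart[];
}

// Mark the final content part of the last user message with a cache
// breakpoint, so everything up to (and including) it becomes the cached
// prefix for the next request.
function addCacheBreakpoint(messages: ChatMessage[]): ChatMessage[] {
  const lastUserIdx = messages.map((m) => m.role).lastIndexOf("user");
  return messages.map((msg, i) =>
    i === lastUserIdx
      ? {
          ...msg,
          content: msg.content.map((part, j) =>
            j === msg.content.length - 1
              ? { ...part, cache_control: { type: "ephemeral" as const } }
              : part,
          ),
        }
      : msg,
  );
}
```

On each new turn the breakpoint moves forward, so the previously cached prefix is reused and only the new messages are processed at full cost.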
Checklist
- [x] The base branch of this PR is `dev`, rather than `main`
- [x] The relevant docs, if any, have been updated or created
Testing
- Add `"cacheConversation": true` to the Anthropic config, like:

  ```json
  {
    "title": "Claude 3.5 Sonnet",
    "provider": "anthropic",
    "model": "claude-3-5-sonnet-20240620",
    "apiKey": "<YOUR_API_KEY>",
    "cacheSystemMessage": true,
    "cacheConversation": true
  }
  ```
- Uncomment the following line in `Anthropic.ts`:

  ```ts
  if (value.type == "message_start") console.log(value);
  ```

- Select the Claude model and add at least 1000 tokens of context
- When running in the debugger, you should see entries like this in the debug console:

  ```
  { input_tokens: 4, cache_creation_input_tokens: 1212, cache_read_input_tokens: 0, output_tokens: 2 }
  ```

  then on the next request:

  ```
  { input_tokens: 4, cache_creation_input_tokens: 224, cache_read_input_tokens: 1212, output_tokens: 1 }
  ```
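The two usage samples above can be read mechanically: a follow-up request hit the cache when its `cache_read_input_tokens` equals the total prefix cached so far by the previous turn (`cache_creation_input_tokens + cache_read_input_tokens`). A tiny hypothetical checker for eyeballing this (the helper name is mine, not part of the PR):

```typescript
// Hypothetical helper for interpreting the usage stats logged above.
// A request fully reused the cache when it read back exactly the prefix
// the previous turn had created plus what that turn itself read.

interface Usage {
  input_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
  output_tokens: number;
}

function isCacheHit(prev: Usage, next: Usage): boolean {
  return (
    next.cache_read_input_tokens ===
    prev.cache_creation_input_tokens + prev.cache_read_input_tokens
  );
}

// The numbers from the debug console output above:
const first: Usage = {
  input_tokens: 4,
  cache_creation_input_tokens: 1212,
  cache_read_input_tokens: 0,
  output_tokens: 2,
};
const second: Usage = {
  input_tokens: 4,
  cache_creation_input_tokens: 224,
  cache_read_input_tokens: 1212,
  output_tokens: 1,
};
```

Here `isCacheHit(first, second)` holds: the second request read back the 1212 cached tokens and only had to cache the 224 new conversation tokens.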
Hopefully Anthropic will soon surface caching statistics in their API console as well.