How can I add a prompt caching directive?
I know how to add the beta header, but I can't add a cache_control directive to a Message...
like ...
{
"type": "text",
"text": "<the entire contents of Pride and Prejudice>",
"cache_control": {"type": "ephemeral"}
}
Hi @chew-z, we haven't added support for prompt caching yet. I might be able to get to it this weekend.
It looks like the System object is now an array of Messages, so either a new major version bump or a new "MessageRequest2" with "everything" duplicated... unless I'm misreading their docs.
I'll probably fork and just replace SystemPrompt since I don't care about backwards compatibility, but if a major version bump is the chosen path, I could submit a PR if no one beats me to it.
@kb-sp I think both are accepted. In Anthropic's API reference it seems you can pass a string or object array. I believe the object array has functionality around prompt caching, which we haven't yet implemented. If you want to take a stab at doing that and bump the major version I'm all for it.
https://github.com/madebywelch/anthropic-go/tree/v4.0.0 contains the cache header and config update.
Can you update the go.mod for v4.0.0 so the module identifies itself as /v4? @madebywelch
Also, as I'm prepping for this, implementing the cache control on "user" messages becomes awkward when you have conversation history.
For a RAG case, ideally I'd like to store the content as an element in the SystemPrompt (like Anthropic's docs on prompt caching) so that user/assistant pairs are consistent.
For a first prompt, I have:
- SystemPrompt = "Coaching for the LLM's role" (like an agent spec)
- User = []Block{
- ["...text file contents..."]
- [Initial prompt] < CACHED >
But for a conversation, I need:
- SystemPrompt = ditto
- User = []Block{
- "...text file contents..."
- [User[0]] (first "user" from conversation history)
- The hope is that [User[0]] is identical to [Initial prompt] and will trigger the cache.
- Assistant = []Block{
- [Assistant[0]] (first "assistant" from conversation history)
- User = []Block{
- [New prompt]
I think the above works (can't test caching with v4 yet).
But the SystemPrompt approach seems a little more straightforward and predictable:
- SystemPrompt = []Block{
- [Coaching]
- ["...text file contents..."] < CACHED >
- User = []Block{
- [Initial prompt]
Then:
- SystemPrompt = ditto
- User = []Block{
- [User[0]] (aka Initial prompt)
- Assistant = []Block{
- [Assistant[0]]
- User = []Block{
- [New prompt]
In other words, I don't have to hope that [Initial prompt] and [User[0]] from the history happen to be identical. Does that make sense?
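To make the SystemPrompt layout concrete, here is a rough Go sketch of how the first request and a follow-up could be built so that the system prefix stays byte-identical. All type and function names (`Block`, `Message`, `Request`, `BuildRequest`, `SameSystem`) are hypothetical, and the request omits fields such as the model and token limit.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Hypothetical types for illustration; not the library's actual API.
type CacheControl struct {
	Type string `json:"type"`
}

type Block struct {
	Type         string        `json:"type"`
	Text         string        `json:"text"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

type Message struct {
	Role    string  `json:"role"`
	Content []Block `json:"content"`
}

type Request struct {
	System   []Block   `json:"system"`
	Messages []Message `json:"messages"`
}

func text(s string) Block { return Block{Type: "text", Text: s} }

// BuildRequest puts the coaching text and the cached document in the
// system array, then appends the history and the new user prompt.
func BuildRequest(coaching, document string, history []Message, prompt string) Request {
	doc := text(document)
	doc.CacheControl = &CacheControl{Type: "ephemeral"}
	msgs := append([]Message{}, history...) // copy so the caller's slice is untouched
	msgs = append(msgs, Message{Role: "user", Content: []Block{text(prompt)}})
	return Request{System: []Block{text(coaching), doc}, Messages: msgs}
}

// SameSystem reports whether two requests serialize identical system arrays.
func SameSystem(a, b Request) bool {
	x, errX := json.Marshal(a.System)
	y, errY := json.Marshal(b.System)
	return errX == nil && errY == nil && bytes.Equal(x, y)
}

func main() {
	first := BuildRequest("Coaching", "...text file contents...", nil, "Initial prompt")
	history := []Message{
		first.Messages[0], // User[0], aka Initial prompt
		{Role: "assistant", Content: []Block{text("Assistant[0]")}},
	}
	next := BuildRequest("Coaching", "...text file contents...", history, "New prompt")
	fmt.Println(SameSystem(first, next), len(next.Messages)) // prints "true 3"
}
```

Because the cached document lives in the system array, the user and assistant turns can be replayed from history as-is, with no special handling for the first user message.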
@kb-sp Got it. Good idea. I will upgrade SystemPrompt to an array of blocks instead of a string.