anthropic-go

How can I add a prompt caching directive?

Open chew-z opened this issue 1 year ago • 7 comments

I know how to add the beta header, but I can't add a cache_control directive to a Message...

Prompt Caching

like ...

      {
        "type": "text", 
        "text": "<the entire contents of Pride and Prejudice>",
        "cache_control": {"type": "ephemeral"}
      }
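
For reference, here is a minimal Go sketch of how a block like this could be serialized with plain `encoding/json`. The `TextBlock` and `CacheControl` types and the `EncodeBlock` helper are hypothetical names written for illustration; they are not anthropic-go's types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CacheControl and TextBlock are hypothetical types for illustration;
// they are not part of anthropic-go.
type CacheControl struct {
	Type string `json:"type"`
}

type TextBlock struct {
	Type         string        `json:"type"`
	Text         string        `json:"text"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

// EncodeBlock returns the JSON form of a single content block.
// The cache_control key is omitted entirely when CacheControl is nil.
func EncodeBlock(b TextBlock) (string, error) {
	out, err := json.Marshal(b)
	return string(out), err
}

func main() {
	s, err := EncodeBlock(TextBlock{
		Type:         "text",
		Text:         "the entire contents of Pride and Prejudice",
		CacheControl: &CacheControl{Type: "ephemeral"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Using a pointer with `omitempty` keeps ordinary blocks unchanged on the wire, so only the blocks that opt in carry the directive.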

chew-z avatar Aug 24 '24 10:08 chew-z

Hi @chew-z, we haven't added support for prompt caching yet. I might be able to get to it this weekend.

madebywelch avatar Aug 24 '24 12:08 madebywelch

It looks like the System object is now an array of Messages, so either a new major version bump or a new "MessageRequest2" with "everything" duplicated... unless I'm misreading their docs.

I'll probably fork and just replace SystemPrompt since I don't care about backwards compatibility, but if a major version bump is the chosen path, I could submit a PR if no one beats me to it.

kb-sp avatar Sep 17 '24 05:09 kb-sp

@kb-sp I think both are accepted. In Anthropic's API reference it seems you can pass a string or object array. I believe the object array has functionality around prompt caching, which we haven't yet implemented. If you want to take a stab at doing that and bump the major version I'm all for it.

madebywelch avatar Sep 17 '24 10:09 madebywelch

https://github.com/madebywelch/anthropic-go/tree/v4.0.0 contains the cache header and config update.

madebywelch avatar Sep 29 '24 02:09 madebywelch

Can you update the go.mod for v4.0.0 so the module identifies itself as /v4? @madebywelch
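
For context, Go's semantic import versioning expects the module path of any v2+ release to end in the major version suffix. The go.mod line this implies would look roughly like the following; the module path is inferred from the repository URL above, so treat it as an assumption.

```
module github.com/madebywelch/anthropic-go/v4
```

Import paths in consuming code would then carry the same `/v4` element directly after the repository root.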

kb-sp avatar Sep 30 '24 03:09 kb-sp

Also, as I'm prepping for this, implementing the cache control on "user" messages becomes awkward when you have conversation history.

For a RAG case, ideally I'd like to store the content as an element in the SystemPrompt (like Anthropic's docs on prompt caching) so that user/assistant pairs are consistent.

For a first prompt, I have:

  • SystemPrompt = "Coaching for the LLM's role" (like an agent spec)
  • User = []Block{
    • ["...text file contents..."]
    • [Initial prompt] < CACHED >

But for a conversation, I need:

  • SystemPrompt = ditto
  • User = []Block{
    • "...text file contents..."
    • [User[0]] (first "user" from conversation history)
      • The hope is that [User[0]] is identical to [Initial prompt] and will trigger the cache.
  • Assistant = []Block{
    • [Assistant[0]] (first "assistant" from conversation history)
  • User = []Block{
    • [New prompt]

I think the above works (can't test caching with v4 yet).
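
A small local check of the assumption this layout depends on, sketched in Go. The `Block` and `Message` types and both helper functions are hypothetical, and comparing the serialized JSON is only a local proxy for what the server-side cache would see.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Block and Message are hypothetical types for illustration only.
type Block struct {
	Type         string            `json:"type"`
	Text         string            `json:"text"`
	CacheControl map[string]string `json:"cache_control,omitempty"`
}

type Message struct {
	Role    string  `json:"role"`
	Content []Block `json:"content"`
}

// FirstUserTurn places the document first and marks the prompt block as the
// cache breakpoint, mirroring the layout described above.
func FirstUserTurn(doc, prompt string) Message {
	return Message{Role: "user", Content: []Block{
		{Type: "text", Text: doc},
		{Type: "text", Text: prompt, CacheControl: map[string]string{"type": "ephemeral"}},
	}}
}

// SamePrefix reports whether two first turns serialize to identical JSON.
func SamePrefix(a, b Message) bool {
	ja, errA := json.Marshal(a)
	jb, errB := json.Marshal(b)
	return errA == nil && errB == nil && bytes.Equal(ja, jb)
}

func main() {
	doc := "...text file contents..."
	initial := FirstUserTurn(doc, "Summarize the file.")
	// Rebuilt from conversation history with the same text.
	replayed := FirstUserTurn(doc, "Summarize the file.")
	// Rebuilt with a stray trailing space in the prompt.
	edited := FirstUserTurn(doc, "Summarize the file. ")
	fmt.Println(SamePrefix(initial, replayed))
	fmt.Println(SamePrefix(initial, edited))
}
```

Any difference in the replayed first turn, even whitespace, changes the prefix, which is the fragility this layout carries.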

But the SystemPrompt approach seems a little more straightforward and predictable:

  • SystemPrompt = []Block{
    • [Coaching]
    • ["...text file contents..."] < CACHED >
  • User = []Block{
    • [Initial prompt]

Then:

  • SystemPrompt = ditto
  • User = []Block{
    • [User[0]] (aka Initial prompt)
  • Assistant = []Block{
    • [Assistant[0]]
  • User = []Block{
    • [New prompt]

In other words, I don't have to hope that [Initial prompt] and [User[0]] from the history happen to be identical. Does that make sense?

kb-sp avatar Sep 30 '24 07:09 kb-sp

@kb-sp Got it. Good idea. I will upgrade SystemPrompt to an array of blocks instead of a string.

madebywelch avatar Sep 30 '24 12:09 madebywelch