Support for prompt caching

Open ashwin153 opened this issue 10 months ago • 5 comments

  • I want to send cache_control={"type": "ephemeral"} when using Anthropic models and magentic.Chat.
  • I think a potential workaround is to create a CacheControlMessage that subclasses magentic.SystemMessage.
  • What do you think of this workaround?
  • Is this something you think should be better supported by magentic?

ashwin153 • Apr 14 '25 14:04

Maybe if model_config = pydantic.ConfigDict(extra='allow') was set on Message, I could do magentic.UserMessage("...", cache_control={"type": "ephemeral"}) and then add logic to my ChatModel.complete and ChatModel.acomplete to serialize this extra field in my Anthropic requests?

ashwin153 • Apr 14 '25 16:04

@jackmpcollins My bad, I accidentally closed this. Can you reopen it?

ashwin153 • Apr 14 '25 16:04

import magentic.chat_model.message

# All messages before a cache breakpoint in a chat are eligible for prompt caching by Anthropic.
CACHE_BREAKPOINT = magentic.chat_model.message._RawMessage(
    {
        "cache_control": {"type": "ephemeral"},
        "content": "Remember everything I've told you up to this point.",
        "role": "system",
    },
)

Here's my workaround.

ashwin153 • Apr 14 '25 17:04

Hi @ashwin153, your approach sounds right to me! I think magentic could have custom Message objects for different providers, to support the params they differ on. So in this case you could make a new message class:

from typing import Generic, Literal

from magentic import AssistantMessage
from magentic.chat_model.message import ContentT

class AnthropicAssistantMessage(AssistantMessage[ContentT], Generic[ContentT]):
    cache_control: Literal["ephemeral", ...] | None = None

and then register how to convert this magentic message into an Anthropic message param:

from typing import Any

from anthropic.types import MessageParam
from magentic.chat_model.anthropic_chat_model import message_to_anthropic_message

@message_to_anthropic_message.register(AnthropicAssistantMessage)
def _(message: AnthropicAssistantMessage[Any]) -> MessageParam:
    # Need to call the implementation of this function for AssistantMessage to reuse it
    # I'm not sure how to do that off the top of my head, so inventing "impl_for"
    # Another option is to copy all the code from the existing implementation again here
    message_param = message_to_anthropic_message.impl_for(AssistantMessage)(message)
    # Add the extra field
    if message.cache_control is not None:
        message_param["cache_control"] = message.cache_control
    return message_param

see: https://github.com/jackmpcollins/magentic/blob/50c7aeef41b4c500698043ae5d634c2b3e4d0a7c/src/magentic/chat_model/anthropic_chat_model.py#L163-L210
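
On the "impl_for" placeholder in the snippet above: if message_to_anthropic_message is a functools.singledispatch function (the .register call suggests it is, but that is an assumption here), the standard library already provides dispatch(cls), which returns the implementation registered for a class. Below is a minimal, self-contained sketch of that pattern. The message classes and to_message_param are simplified stand-ins invented for illustration, not magentic's real types, and the top-level placement of cache_control simply mirrors the snippet above rather than a checked Anthropic request shape.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Any, Optional


@dataclass
class AssistantMessage:
    """Simplified stand-in for magentic's AssistantMessage (hypothetical)."""

    content: str


@dataclass
class AnthropicAssistantMessage(AssistantMessage):
    """Hypothetical subclass carrying one provider-specific field."""

    cache_control: Optional[dict] = None


@singledispatch
def to_message_param(message: Any) -> dict:
    """Stand-in for message_to_anthropic_message (hypothetical)."""
    raise NotImplementedError(type(message))


@to_message_param.register(AssistantMessage)
def _(message: AssistantMessage) -> dict:
    return {"role": "assistant", "content": message.content}


@to_message_param.register(AnthropicAssistantMessage)
def _(message: AnthropicAssistantMessage) -> dict:
    # dispatch(cls) looks up the implementation registered for cls,
    # so the base-class conversion is reused instead of copied.
    base_impl = to_message_param.dispatch(AssistantMessage)
    message_param = base_impl(message)
    # Add the extra field only when it was set.
    if message.cache_control is not None:
        message_param["cache_control"] = message.cache_control
    return message_param
```

Calling to_message_param on an AnthropicAssistantMessage runs the subclass implementation, which delegates to the AssistantMessage one and then adds the extra key.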

jackmpcollins • Apr 14 '25 18:04

Actually, I see now that this goes in the system section of the Anthropic request, so it needs an update in magentic itself. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching

You could create an AnthropicSystemMessage. There is no need to register it, since by default it will be treated the same as SystemMessage. However, the _extract_system_message function in AnthropicChatModel would need to be updated to extract multiple messages and format them appropriately.

https://github.com/jackmpcollins/magentic/blob/50c7aeef41b4c500698043ae5d634c2b3e4d0a7c/src/magentic/chat_model/anthropic_chat_model.py#L445
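
A rough sketch of what "extract multiple messages and format them" could look like, assuming the Anthropic request's system field is given as a list of text blocks, each optionally carrying cache_control. The classes and the extract_system_blocks helper below are hypothetical stand-ins written for illustration. They are not magentic's actual _extract_system_message or message types.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SystemMessage:
    """Simplified stand-in for magentic's SystemMessage (hypothetical)."""

    content: str


@dataclass
class AnthropicSystemMessage(SystemMessage):
    """Hypothetical system message with an optional cache_control field."""

    cache_control: Optional[dict] = None


@dataclass
class UserMessage:
    """Simplified stand-in for a non-system message (hypothetical)."""

    content: str


def extract_system_blocks(messages: list) -> list:
    """Collect every system message, in order, as a text block."""
    blocks: list = []
    for message in messages:
        if not isinstance(message, SystemMessage):
            continue  # Non-system messages are left for the messages list.
        block: dict[str, Any] = {"type": "text", "text": message.content}
        # Plain SystemMessage has no cache_control attribute, hence getattr.
        cache_control = getattr(message, "cache_control", None)
        if cache_control is not None:
            block["cache_control"] = cache_control
        blocks.append(block)
    return blocks


chat = [
    SystemMessage("You are a helpful assistant."),
    AnthropicSystemMessage("<long reference document>", cache_control={"type": "ephemeral"}),
    UserMessage("Summarize the document."),
]
system_blocks = extract_system_blocks(chat)
```

Here only the second block carries cache_control, and the user message is skipped. How the resulting list is passed to the Anthropic client is left out of this sketch.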

jackmpcollins • Apr 14 '25 18:04