Mario Klingemann
Sorry for reopening this, but I realize that it is not really clear to me how to correctly count tokens for tool usage in the context of computer use: This...
After a lot of trial and error I figured it out - the "secret" is that for tools one has to pass in the assistant's "tool_use" and the user's(!) "tool_result"...
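In other words, the history passed to count_tokens has to contain both halves of the exchange. A rough sketch of the two turns (the id and the result text are just placeholders):

```
# Assistant turn: the model's request to use a tool.
assistant_turn = {
    "role": "assistant",
    "content": [
        {
            "type": "tool_use",
            "id": "toolu_placeholder",
            "name": "computer",
            "input": {"action": "screenshot"},
        },
    ],
}

# User turn: the tool's output goes back in as a "tool_result" block.
user_turn = {
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": "toolu_placeholder",
            "content": [{"type": "text", "text": "(screenshot output omitted)"}],
        },
    ],
}

# Both turns go into the messages list that count_tokens receives.
```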
One more observation - removing tools that are not used in that call from the "tools" list gives a different count:

```
response = client.beta.messages.count_tokens(
    betas=["token-counting-2024-11-01", "computer-use-2024-10-22"],
    model="claude-3-5-sonnet-20241022",
    tools=[
        {
            "type": ...
```
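For completeness, with the full computer-use tool set from the docs the call would look roughly like this (the display values are the documented defaults, not necessarily the ones I used). Dropping e.g. the bash tool from the list lowers the reported count, because the tool definitions themselves are counted as input tokens:

```
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.count_tokens(
    betas=["token-counting-2024-11-01", "computer-use-2024-10-22"],
    model="claude-3-5-sonnet-20241022",
    tools=[
        {
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
            "display_number": 1,
        },
        {"type": "text_editor_20241022", "name": "str_replace_editor"},
        {"type": "bash_20241022", "name": "bash"},
    ],
    messages=[
        {"role": "user", "content": "Take a screenshot."},
        # ...plus the tool_use / tool_result turns described above
    ],
)
print(response.input_tokens)
```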
I guess it helps to read the documentation (https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching)

Interesting:
- Cache write tokens are 25% more expensive than base input tokens
- Cache read tokens are 90% cheaper than base input tokens
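That makes the break-even point easy to work out: in units of the base input price per prefix token, writing the prefix once and reading it on every later call already beats resending it from the second call onwards (a quick sanity check, assuming just those two multipliers):

```
def relative_cost(calls: int) -> tuple[float, float]:
    """Cost of sending the same prefix `calls` times, per prefix token,
    in units of the base input token price."""
    uncached = 1.0 * calls               # resend the full prefix every time
    cached = 1.25 + 0.10 * (calls - 1)   # one cache write, then only reads
    return uncached, cached

# relative_cost(1) -> (1.0, 1.25)  caching costs slightly more
# relative_cost(2) -> (2.0, 1.35)  already cheaper from the 2nd call
# relative_cost(5) -> (5.0, 1.65)
```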
```
def _inject_prompt_caching(
    messages: list[BetaMessageParam],
):
    """
    Set cache breakpoints for the 3 most recent turns
    one cache breakpoint is left for tools/system prompt, to be shared across sessions
    """
    ...
```
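My reading of what that helper does (a sketch, not necessarily the exact upstream code): walk the conversation from newest to oldest and put an ephemeral cache_control marker on the last content block of the three most recent user turns, removing it from older turns so the four available breakpoints are not exceeded:

```
from anthropic.types.beta import BetaMessageParam


def _inject_prompt_caching(messages: list[BetaMessageParam]) -> None:
    """Mark the 3 most recent user turns as cache breakpoints; the 4th
    available breakpoint stays reserved for the tools/system prompt."""
    breakpoints_remaining = 3
    for message in reversed(messages):
        if message["role"] == "user" and isinstance(
            content := message["content"], list
        ):
            if breakpoints_remaining:
                # newest turns get a breakpoint on their last content block
                breakpoints_remaining -= 1
                content[-1]["cache_control"] = {"type": "ephemeral"}
            else:
                # older turns lose any stale breakpoint from previous loops
                content[-1].pop("cache_control", None)
```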
Are you sure that the entire conversation up to a breakpoint is cached? Maybe I misunderstand the prompt caching docs, but it is my impression that only those messages that...
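One way to check this empirically: with the prompt-caching beta, the usage block of a response reports cache writes and cache reads separately, so one can see how many tokens of the prefix were actually written to or served from the cache (a minimal probe, with a deliberately long system prompt so it clears the minimum cacheable size):

```
import anthropic

client = anthropic.Anthropic()

resp = client.beta.messages.create(
    betas=["prompt-caching-2024-07-31"],
    model="claude-3-5-sonnet-20241022",
    max_tokens=64,
    system=[
        {
            "type": "text",
            "text": "You are a meticulous assistant. " * 200,  # pad past ~1024 tokens
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Hi"}],
)
u = resp.usage
print("cache write:", u.cache_creation_input_tokens)
print("cache read: ", u.cache_read_input_tokens)
print("uncached:   ", u.input_tokens)
```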
The new token counting endpoint seems to come at the right time for making the caching strategy at least a little bit smarter: https://docs.anthropic.com/en/docs/build-with-claude/token-counting
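For example, one could skip spending a breakpoint on a prefix that is too short to be cacheable in the first place (1024 tokens is the documented minimum for Sonnet; the helper name here is made up):

```
MIN_CACHEABLE_TOKENS = 1024  # documented minimum cacheable prefix for Sonnet


def worth_a_breakpoint(client, model, tools, system, messages) -> bool:
    """Only spend a cache breakpoint if the prefix can actually be cached."""
    count = client.beta.messages.count_tokens(
        betas=["token-counting-2024-11-01"],
        model=model,
        tools=tools,
        system=system,
        messages=messages,
    )
    # counts the whole request, not just the prefix, so this is a coarse check
    return count.input_tokens >= MIN_CACHEABLE_TOKENS
```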
I believe that the way computer-use currently applies prompt caching does not actually include the tools (whilst some comments inside the source code suggest that it is assumed that this...
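If I read the caching docs correctly, the cache prefix is assembled in the order tools → system → messages, so a cache_control marker on the system block (or on the last tool definition) should pull the tool definitions into the cached prefix as well. A rough sketch of the system-block variant:

```
SYSTEM_PROMPT = "...the long, stable instructions shared across sessions..."

system = [
    {
        "type": "text",
        "text": SYSTEM_PROMPT,
        # Tools come before system in the prefix, so a breakpoint here
        # caches the tool definitions together with the system prompt.
        "cache_control": {"type": "ephemeral"},
    }
]
```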
In my customized system I have implemented that feature already, and yes, it is a possibility. In the end you can always ask Claude to improve itself - the only...
If you are looking for some code that can push PRs autonomously to a repo, you can check out this prototype (written with Claude): https://github.com/Quasimondo/OpenMender