SharpToken
Anthropic (Claude) support
Can we use SharpToken with Anthropic's models? I could not find out whether Claude uses "cl100k_base" or some other encoding.
Hello @omri-suissa-clearmash !
Could you share a bit more about what Claude or Anthropic is? What does it offer and how does it work?
Thanks
@dmitry-brazhenko Claude is the LLM from Anthropic (https://www.anthropic.com/). This is what I could find: https://github.com/anthropics/anthropic-sdk-python/blob/e84645b07ca5267066700a104b4d8d6a8da1383d/src/anthropic/_tokenizers.py
Thanks for sharing.
I will check that. They probably use either an existing encoding (such as cl100k_base) or a custom one.
@dmitry-brazhenko I also found this: https://github.com/19h/claude_tokenizer (Rust)
Hello @omri-suissa-clearmash !
I checked the algorithm. It seems there is a difference, but it could potentially be implemented in the SharpToken library. I will try to do that within a few days.