Generate/load Encoder from tokenizer.json file

Open michalblaha opened this issue 7 months ago • 1 comment

What would you like to be added:

It would be great to be able to generate or load an encoder from a tokenizer.json file, such as https://huggingface.co/CohereForAI/aya-101/resolve/main/tokenizer.json or https://huggingface.co/openai-community/gpt2/raw/main/tokenizer.json
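To illustrate the kind of loading I mean, here is a rough Python sketch, not a proposal for this library's API. It assumes the tokenizer.json has a `model` section of type `BPE` with `vocab` and `merges` entries, and it uses a tiny made-up inline sample instead of downloading a real file. The names `load_bpe_tables` and `encode_word` are hypothetical. It ignores the normalizer, pre-tokenizer, and special tokens that a real file also describes, and other model types (for example Unigram) would need different handling.

```python
import json

# Tiny made-up stand-in for a tokenizer.json with a BPE model (not real data).
SAMPLE_TOKENIZER_JSON = """
{
  "version": "1.0",
  "model": {
    "type": "BPE",
    "vocab": {"l": 0, "o": 1, "w": 2, "lo": 3, "low": 4},
    "merges": ["l o", "lo w"]
  }
}
"""


def load_bpe_tables(text):
    """Return (vocab, ranks) from tokenizer.json text; hypothetical helper."""
    model = json.loads(text)["model"]
    if model.get("type") != "BPE":
        raise ValueError("this sketch only handles BPE models")
    ranks = {}
    for index, entry in enumerate(model["merges"]):
        # Assumption: merges are stored either as "a b" strings or as ["a", "b"] pairs.
        pair = tuple(entry.split(" ")) if isinstance(entry, str) else tuple(entry)
        ranks[pair] = index
    return model["vocab"], ranks


def encode_word(word, vocab, ranks):
    """Greedy BPE on one word: repeatedly merge the lowest-ranked adjacent pair."""
    parts = list(word)
    while len(parts) > 1:
        candidates = [
            (ranks[(a, b)], i)
            for i, (a, b) in enumerate(zip(parts, parts[1:]))
            if (a, b) in ranks
        ]
        if not candidates:
            break  # no known merge applies any more
        _, i = min(candidates)
        parts[i:i + 2] = [parts[i] + parts[i + 1]]
    return [vocab[p] for p in parts]


vocab, ranks = load_bpe_tables(SAMPLE_TOKENIZER_JSON)
print(encode_word("low", vocab, ranks))
```

An encoder built this way would only need the vocabulary and the merge order from the file, which is why loading directly from tokenizer.json seems feasible at least for BPE models.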

Why is this needed:

It would make it easy to use the right tokenizer for a specific model, mostly open-source ones.

Anything else we need to know?

michalblaha · Jul 09 '24 12:07