Tiktoken
Generate/load Encoder from tokenizer.json file
What would you like to be added:
It would be great to be able to generate or load an Encoder directly from a Hugging Face tokenizer.json file, such as https://huggingface.co/CohereForAI/aya-101/resolve/main/tokenizer.json or https://huggingface.co/openai-community/gpt2/raw/main/tokenizer.json
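To make the request concrete, here is a minimal sketch in Python of what loading could involve. It is not this library's API. It assumes the file's "model" section has type "BPE", with a "vocab" map and a "merges" list, as is common in such files. The names load_bpe_tables and encode_word and the tiny inline sample are hypothetical. Other model types (for example Unigram) are rejected, and the byte-level pre-tokenization and regex splitting that real tokenizers apply are left out.

```python
import json

# Hand-written stand-in for a tokenizer.json file (assumption: BPE model only).
SAMPLE_TOKENIZER_JSON = """
{
  "model": {
    "type": "BPE",
    "vocab": {"a": 0, "b": 1, "c": 2, "ab": 3, "abc": 4},
    "merges": ["a b", "ab c"]
  }
}
"""


def load_bpe_tables(text):
    """Parse tokenizer.json text into (vocab, merge_ranks)."""
    model = json.loads(text)["model"]
    if model.get("type") != "BPE":
        raise ValueError(f"unsupported model type: {model.get('type')!r}")
    vocab = dict(model["vocab"])
    merge_ranks = {}
    for rank, merge in enumerate(model["merges"]):
        # Merges may be stored as "left right" strings or as [left, right] pairs.
        left, right = merge.split(" ", 1) if isinstance(merge, str) else merge
        merge_ranks[(left, right)] = rank
    return vocab, merge_ranks


def encode_word(word, vocab, merge_ranks):
    """Greedy BPE: repeatedly merge the adjacent pair with the lowest rank."""
    parts = list(word)
    while len(parts) > 1:
        candidates = [
            (merge_ranks.get((parts[i], parts[i + 1]), float("inf")), i)
            for i in range(len(parts) - 1)
        ]
        rank, i = min(candidates)
        if rank == float("inf"):
            break  # no known merge applies to any adjacent pair
        parts[i:i + 2] = [parts[i] + parts[i + 1]]
    return [vocab[p] for p in parts]


vocab, merge_ranks = load_bpe_tables(SAMPLE_TOKENIZER_JSON)
print(encode_word("abc", vocab, merge_ranks))
print(encode_word("cab", vocab, merge_ranks))
```

A real implementation would read the file from disk or a URL instead of an inline string, and would also need to honor the pre-tokenizer, normalizer, and special-token sections of the file.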
Why is this needed:
It would make it easy to use the exact tokenizer that a specific (mostly open-source) model was trained with.