gpt-tokenizer
JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT-2 / GPT-3 / GPT-4. Port of OpenAI's tiktoken with additional features.
Results
24
gpt-tokenizer issues
As listed in the [`README.md`](https://github.com/niieani/gpt-tokenizer#supported-models-and-their-encodings), there is no support for the new models:

* `gpt-3.5-turbo-0613`
* `gpt-3.5-turbo-16k-0613`
This PR integrates the new tokenizer for gpt-4o, `o200k_base`.

Encoding file: [https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken](https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken)

Closes #42
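For context on what an encoding file such as `o200k_base.tiktoken` contains: each line holds a base64-encoded token followed by its integer rank, separated by a space. The sketch below parses that text format into a rank-to-bytes map. The function name `parseTiktokenRanks` is hypothetical and is not part of gpt-tokenizer's API; how the PR itself loads the file is not shown here, so treat this only as an illustration of the file layout.

```javascript
// Hypothetical helper (an assumption for illustration, not gpt-tokenizer's API).
// Parses the text of a .tiktoken file: one "<base64 token bytes> <rank>" pair per line.
function parseTiktokenRanks(text) {
  const ranks = new Map()
  for (const line of text.split('\n')) {
    const trimmed = line.trim()
    if (trimmed === '') continue // skip blank lines, e.g. the trailing newline

    const [encoded, rankText] = trimmed.split(' ')
    const rank = Number.parseInt(rankText, 10)
    if (!encoded || Number.isNaN(rank)) {
      throw new Error(`Malformed line: ${trimmed}`)
    }

    // atob yields one character per byte, so each char code is a raw token byte.
    const bytes = Uint8Array.from(atob(encoded), (ch) => ch.charCodeAt(0))
    ranks.set(rank, bytes)
  }
  return ranks
}

// Tiny in-memory sample in the same layout (not the real o200k_base contents).
const sample = 'IQ== 0\nIg== 1\naGVsbG8= 2\n'
const ranks = parseTiktokenRanks(sample)
console.log(ranks.size, new TextDecoder().decode(ranks.get(2)))
```

Keying the map by rank keeps the sketch simple; a real BPE encoder would more likely need the reverse lookup, from token bytes to rank.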
https://github.com/openai/tiktoken/commit/c0ba74c238d18b4824c25f3c27fc8698055b9a76