gpt-tokenizer

JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT-2 / GPT-3 / GPT-4. Port of OpenAI's tiktoken with additional features.

Results 24 gpt-tokenizer issues

As listed in the [`README.md`](https://github.com/niieani/gpt-tokenizer#supported-models-and-their-encodings), there is no support for the new models:

* `gpt-3.5-turbo-0613`
* `gpt-3.5-turbo-16k-0613`

This PR integrates `o200k_base`, the new tokenizer for gpt-4o. Encoding file: [https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken](https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken)

Closes #42
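For context on what such an encoding file holds, here is a minimal sketch of reading one. It assumes the `.tiktoken` layout that tiktoken's own loader reads: one entry per line, a base64-encoded token followed by a space and its integer rank. The function names `parseTiktokenRanks` and `tokenBytes` are hypothetical and are not part of gpt-tokenizer's API, and the sample lines in the usage note are illustrative rather than copied from `o200k_base`.

```typescript
// Parse the text of a `.tiktoken` encoding file into a map from
// base64-encoded token to merge rank.
// Assumption: each non-empty line is "<base64 token bytes> <integer rank>".
function parseTiktokenRanks(text: string): Map<string, number> {
  const ranks = new Map<string, number>();
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines, e.g. a trailing newline
    const [token, rank] = trimmed.split(/\s+/);
    const parsed = parseInt(rank, 10);
    if (!token || isNaN(parsed)) {
      throw new Error("Malformed line: " + trimmed);
    }
    ranks.set(token, parsed);
  }
  return ranks;
}

// Decode a base64 token key back into its raw bytes.
function tokenBytes(base64Token: string): Uint8Array {
  return Uint8Array.from(atob(base64Token), (c) => c.charCodeAt(0));
}
```

Calling `parseTiktokenRanks("IQ== 0\nIg== 1\n")` yields a two-entry map, and `tokenBytes("IQ==")` gives the single byte `0x21` (the character `!`). Keeping the keys as base64 strings avoids comparing byte arrays by reference; a real port may well choose a different in-memory layout.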

https://github.com/openai/tiktoken/commit/c0ba74c238d18b4824c25f3c27fc8698055b9a76