
tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Results 87 tiktoken issues

On January 2nd, 2025 we started noticing errors in our logs saying that we were over the context limit when creating text-embedding-3-large embeddings with OpenAI. I believe there may have...

Hi, we are using tiktoken 0.5.2 to calculate the token count. It is working most of the time, but we are seeing the error: RuntimeError(StackOverflow) with one of the runs....

https://openai.com/index/gpt-4-1/

At this point, it is just wrong for OpenAI to release models without updating `tiktoken` at the same time.

Hi TikToken team! 👋 I wanted to share a community resource that might be helpful for TikToken users who also work with HuggingFace tokenizers. I've created AutoTikTokenizer, a lightweight library...

### Summary
This PR replaces the use of `hashlib.sha1` with `hashlib.sha256` in `read_file_cached()`.

### Motivation
While SHA-1 is used here only for generating a deterministic cache key (not cryptographic operations),...
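As a rough illustration of the cache-key idea in this PR, the sketch below hashes a blob path with `hashlib.sha256` to get a deterministic file name. The helper name `cache_key_for` and the example URL are hypothetical and are not taken from the tiktoken source; only `hashlib.sha256` itself is a real standard-library call.

```python
import hashlib


def cache_key_for(blobpath: str) -> str:
    """Hypothetical helper: derive a deterministic cache file name from a blob path."""
    # SHA-256 yields a 64-character hex digest (SHA-1 would yield 40).
    # The digest is only used as a stable file name, not for security.
    return hashlib.sha256(blobpath.encode("utf-8")).hexdigest()


# Placeholder URL, used only for illustration.
key = cache_key_for("https://example.com/encodings/demo.tiktoken")
```

Note that switching hash functions changes every cache key, so files cached under the old SHA-1 names would presumably be downloaded again once.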

Adds `o4-` to the `MODEL_PREFIX_TO_ENCODING` dictionary and `o4` to `MODEL_TO_ENCODING`. `4.1` has been added in [another PR](https://github.com/openai/tiktoken/pull/396).
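To show why both dictionaries matter, here is a minimal sketch of an exact-match lookup followed by a prefix lookup. The dictionary names come from the PR description, but the entries, the encoding names assigned to them, and the function `encoding_name_for` are assumptions made for illustration, not a copy of tiktoken's tables.

```python
from typing import Optional

# Illustrative entries only; the encoding names are assumptions.
MODEL_TO_ENCODING = {
    "o4": "o200k_base",
    "gpt-4": "cl100k_base",
}
MODEL_PREFIX_TO_ENCODING = {
    "o4-": "o200k_base",
    "gpt-4-": "cl100k_base",
}


def encoding_name_for(model: str) -> Optional[str]:
    """Hypothetical lookup: exact model name first, then known prefixes."""
    if model in MODEL_TO_ENCODING:
        return MODEL_TO_ENCODING[model]
    for prefix, name in MODEL_PREFIX_TO_ENCODING.items():
        if model.startswith(prefix):
            return name
    # Unknown model: the caller decides how to fall back.
    return None
```

With a prefix entry in place, a dated or suffixed variant such as `o4-mini` resolves without needing its own exact-match row.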

Replaced the hardcoded URL `https://openaipublic.blob.core.windows.net` with the `TIKTOKEN_BPE_HOST` environment variable, allowing for flexibility in sourcing BPE data. This change is particularly beneficial for environments where external access is restricted, such...
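A minimal sketch of the environment-variable approach described in this PR follows. It assumes the hardcoded host remains the default when `TIKTOKEN_BPE_HOST` is unset, which the excerpt does not confirm; the helper `bpe_url` is hypothetical.

```python
import os

# Host named in the PR description; assumed here to remain the default.
DEFAULT_BPE_HOST = "https://openaipublic.blob.core.windows.net"


def bpe_url(relative_path: str) -> str:
    """Hypothetical helper: build a BPE file URL from a configurable host."""
    host = os.environ.get("TIKTOKEN_BPE_HOST", DEFAULT_BPE_HOST)
    # Normalise slashes so the host and path join with exactly one "/".
    return host.rstrip("/") + "/" + relative_path.lstrip("/")
```

In a restricted network, one would set `TIKTOKEN_BPE_HOST` to an internal mirror that serves the same files under the same relative paths.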

https://openai.com/index/introducing-gpt-4-5/

Hi all, I am now attempting to deploy an app that utilises tiktoken's encoding. I wrap tiktoken's `encoding_for_model` method in a try-except with an Azure OpenAI model, and if that doesn't work I...