
[Efficiency] Inefficient to create encoders multiple times

Open omar-scio opened this issue 8 months ago • 5 comments

I noticed our tests were much slower when switching from https://github.com/tiktoken-go/tokenizer to this library. It seems to be because tiktoken.EncodingForModel is very slow (0.3s on my machine).

I see that the source code already caches the encoding itself, but at that point, why not cache the *Tiktoken too? That's what we did to fix this in our own codebase, and the latency went away.
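For illustration, here is a minimal sketch of the kind of caller-side cache described above. It is not the code from our codebase or from the library: `Cache`, `NewCache`, and `Get` are hypothetical names, and the loader is passed in as a plain function so the snippet compiles with only the standard library. It also assumes the cached value is safe to share between callers, which I have not verified for *Tiktoken.

```go
package main

import "sync"

// Cache memoizes the result of an expensive per-key constructor.
// Hypothetical helper for illustration; not part of tiktoken-go.
type Cache[T any] struct {
	mu    sync.Mutex
	items map[string]T
	load  func(key string) (T, error)
}

// NewCache wraps a loader function, for example one that builds
// an encoder for a model name.
func NewCache[T any](load func(key string) (T, error)) *Cache[T] {
	return &Cache[T]{items: make(map[string]T), load: load}
}

// Get returns the cached value for key, calling the loader only on
// the first successful request for that key. Errors are not cached.
// The lock is held during loading, so concurrent first calls are
// serialized rather than loading the same key twice.
func (c *Cache[T]) Get(key string) (T, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.items[key]; ok {
		return v, nil
	}
	v, err := c.load(key)
	if err != nil {
		var zero T
		return zero, err
	}
	c.items[key] = v
	return v, nil
}
```

With the library imported, the idea would be to build the cache once, e.g. `encoders := NewCache(tiktoken.EncodingForModel)`, and then call `encoders.Get(model)` wherever the code currently calls tiktoken.EncodingForModel directly.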

I can make a PR for this if the maintainers are short on time.

omar-scio, Jun 11 '24 21:06