tiktoken
Is there a way for tiktoken to interoperate better with offline AI software?
For instance, there are bug reports from users trying to run software in offline-only mode. Because those libraries use tiktoken, and tiktoken goes out to download vocab files, those users get an error. Some examples:
- https://github.com/openai/whisper/discussions/1399 (fix consists of downloading files to cache, pip installing something)
- https://github.com/Significant-Gravitas/AutoGPT/issues/1909
- https://github.com/imartinez/privateGPT/issues/1458
In that last issue, for example, the traceback was:
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken_ext/openai_public.py", line 11, in gpt2
    mergeable_ranks = data_gym_to_mergeable_bpe_ranks(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 82, in data_gym_to_mergeable_bpe_ranks
    vocab_bpe_contents = read_file_cached(vocab_bpe_file).decode()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Perhaps tiktoken could respect an environment variable like OFFLINE (similar to TERM=dumb for terminals) and, when a vocab file is missing, throw an error such as "vocab file.xyz not present, not downloading because OFFLINE=1 environment variable set"?
Thanks!
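To make the proposal above concrete, here is a minimal sketch of what an offline-aware cached read could look like. It is not tiktoken's code: `read_file_offline_aware`, the `fetch` parameter, and the `OFFLINE` variable are all hypothetical names used for illustration, and the cache layout (one file per URL, named by the SHA-1 of the URL) is an assumption.

```python
import hashlib
import os
from typing import Callable


def read_file_offline_aware(
    blobpath: str,
    cache_dir: str,
    fetch: Callable[[str], bytes],
) -> bytes:
    """Return the bytes for blobpath, refusing to download when OFFLINE=1.

    Hypothetical helper: `fetch` stands in for whatever does the real download.
    """
    # Assumed cache layout: one file per URL, named by the SHA-1 of the URL.
    cache_key = hashlib.sha1(blobpath.encode()).hexdigest()
    cache_path = os.path.join(cache_dir, cache_key)

    # A cached copy always wins, online or offline.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()

    # OFFLINE is the variable proposed in this thread, not one tiktoken reads today.
    if os.environ.get("OFFLINE") == "1":
        raise FileNotFoundError(
            f"{blobpath} not present in {cache_dir}, "
            "not downloading because OFFLINE=1 environment variable set"
        )

    # Online: download once, then store the result for later offline runs.
    contents = fetch(blobpath)
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "wb") as f:
        f.write(contents)
    return contents
```

The point of the sketch is the ordering: check the cache first, then fail fast with a clear message instead of attempting a network call that will time out.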
Same question
How can it be used offline?
https://stackoverflow.com/questions/76106366/how-to-use-tiktoken-in-offline-mode-computer
I found this
That solution works. Tested it myself.
Thank you for finding it.
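For anyone landing here later, the workaround amounts to pre-seeding tiktoken's cache on a machine with network access and copying it to the offline one. Below is a minimal sketch, assuming (as I understand `tiktoken/load.py`; please check your installed version) that the cache file name is the SHA-1 hex digest of the download URL and that the cache directory is taken from the `TIKTOKEN_CACHE_DIR` environment variable. `tiktoken_cache_key` and `seed_cache` are hypothetical helper names, not part of tiktoken.

```python
import hashlib
import os
import shutil


def tiktoken_cache_key(blobpath: str) -> str:
    """Assumed cache file name: SHA-1 hex digest of the download URL."""
    return hashlib.sha1(blobpath.encode()).hexdigest()


def seed_cache(downloaded_file: str, blobpath: str, cache_dir: str) -> str:
    """Copy an already-downloaded vocab file into cache_dir under its cache key."""
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, tiktoken_cache_key(blobpath))
    shutil.copyfile(downloaded_file, target)
    return target
```

Usage would look roughly like this: look up the URL for your encoding in `tiktoken_ext/openai_public.py`, download that file on a connected machine, call `seed_cache("vocab.bpe", url, "/path/to/cache")`, then set `TIKTOKEN_CACHE_DIR=/path/to/cache` on the offline machine before calling `tiktoken.get_encoding(...)`. Encodings that need more than one file (the gpt2 one in the traceback above reads a vocab file and an encoder file) need one cache entry per URL.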