Is there a way for tiktoken to interoperate better with offline AI software?

Open ParetoOptimalDev opened this issue 2 years ago • 5 comments

For instance, there are bug reports from users trying to run software in offline-only mode. Because those libraries use tiktoken, and tiktoken goes out to download vocab files, those users hit errors such as:

  • https://github.com/openai/whisper/discussions/1399 (fix consists of downloading files to cache, pip installing something)
  • https://github.com/Significant-Gravitas/AutoGPT/issues/1909
  • https://github.com/imartinez/privateGPT/issues/1458

In the last issue, for example, the traceback was:

  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken_ext/openai_public.py", line 11, in gpt2
    mergeable_ranks = data_gym_to_mergeable_bpe_ranks(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 82, in data_gym_to_mergeable_bpe_ranks
    vocab_bpe_contents = read_file_cached(vocab_bpe_file).decode()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perhaps tiktoken could respect an environment variable such as OFFLINE, similar to TERM=dumb for terminals, and raise an error like "vocab file.xyz not present, not downloading because OFFLINE=1 environment variable set"?
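To make the proposal concrete, here is a minimal sketch of the behavior I have in mind. Everything in it is hypothetical: `load_vocab_file`, `OfflineError`, the `download` callback, and the `OFFLINE` variable are illustrative names, not part of tiktoken's API.

```python
import os
from pathlib import Path


class OfflineError(RuntimeError):
    """Hypothetical error: a vocab file is missing and downloads are disabled."""


def load_vocab_file(name: str, cache_dir: Path, download) -> bytes:
    """Return the bytes of a vocab file, downloading only when allowed.

    `download` is a caller-supplied function (an assumption for this sketch)
    that takes the file name and returns its contents as bytes.
    """
    cached = cache_dir / name
    if cached.exists():
        # A cached copy always wins, offline or not.
        return cached.read_bytes()

    if os.environ.get("OFFLINE") == "1":
        # Fail fast with a clear message instead of attempting network access.
        raise OfflineError(
            f"vocab file {name} not present, "
            "not downloading because OFFLINE=1 environment variable set"
        )

    data = download(name)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)
    return data
```

With something like this, an offline user would see an explicit message naming the missing file rather than a network error buried in a traceback.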

Thanks!

ParetoOptimalDev avatar Dec 27 '23 19:12 ParetoOptimalDev

Same question.

How can tiktoken be used offline?

jimmy-evo avatar Jan 05 '24 06:01 jimmy-evo

https://stackoverflow.com/questions/76106366/how-to-use-tiktoken-in-offline-mode-computer

I found this

jimmy-evo avatar Jan 05 '24 06:01 jimmy-evo

That solution works. Tested it myself.
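For anyone who cannot open the link, the approach, as far as I understand it, is to download the vocab file on a machine with network access and place it in tiktoken's cache directory on the offline machine. The sketch below rests on two assumptions from my reading of tiktoken/load.py that you should check against your installed version: the cache directory is taken from the TIKTOKEN_CACHE_DIR environment variable, and each cached file is named after the SHA-1 hex digest of its source URL. The helper names `tiktoken_cache_path` and `seed_cache` are mine, not tiktoken's.

```python
import hashlib
import shutil
from pathlib import Path


def tiktoken_cache_path(url: str, cache_dir: Path) -> Path:
    """Path where a file fetched from `url` is assumed to be cached.

    Assumption: the cache file name is the SHA-1 hex digest of the URL string.
    """
    cache_key = hashlib.sha1(url.encode()).hexdigest()
    return cache_dir / cache_key


def seed_cache(url: str, downloaded_file: Path, cache_dir: Path) -> Path:
    """Copy a manually downloaded vocab file into the cache under its key."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = tiktoken_cache_path(url, cache_dir)
    shutil.copyfile(downloaded_file, target)
    return target
```

After seeding the cache for each URL your encoding needs, set TIKTOKEN_CACHE_DIR to that directory before tiktoken loads the encoding, for example with os.environ["TIKTOKEN_CACHE_DIR"] = str(cache_dir) at the top of your script.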

Thank you for finding it.

ForkInABlender avatar Mar 11 '24 06:03 ForkInABlender