
Is there a way for tiktoken to interoperate better with offline AI software?

Open ParetoOptimalDev opened this issue 1 year ago • 4 comments

For instance, there are bug reports from users trying to run software in offline-only mode. Because those libraries use tiktoken, which goes out to download vocab files, the users hit errors like the ones reported here:

  • https://github.com/openai/whisper/discussions/1399 (fix consists of downloading files to cache, pip installing something)
  • https://github.com/Significant-Gravitas/AutoGPT/issues/1909
  • https://github.com/imartinez/privateGPT/issues/1458

In the last issue, for example, the traceback was:

  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken_ext/openai_public.py", line 11, in gpt2
    mergeable_ranks = data_gym_to_mergeable_bpe_ranks(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 82, in data_gym_to_mergeable_bpe_ranks
    vocab_bpe_contents = read_file_cached(vocab_bpe_file).decode()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Perhaps tiktoken could respect an environment variable like OFFLINE, similar to TERM=dumb for terminals, and fail fast with an error such as "vocab file.xyz not present, not downloading because the OFFLINE=1 environment variable is set"?
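
To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is not tiktoken's code: the OFFLINE variable, the OfflineError class, and the read_vocab and cached_path helpers are all hypothetical names made up for illustration, and the cache layout (one file per URL, named by the SHA-1 of the URL) is an assumption.

```python
import hashlib
import os


class OfflineError(RuntimeError):
    """Hypothetical error: file is not cached and downloads are disabled."""


def cached_path(url: str, cache_dir: str) -> str:
    # Assumed cache layout: one file per URL, named by the URL's SHA-1 hex digest.
    return os.path.join(cache_dir, hashlib.sha1(url.encode()).hexdigest())


def read_vocab(url: str, cache_dir: str, download=None) -> bytes:
    """Return the file for `url`, preferring the local cache.

    `download` is a caller-supplied function (url -> bytes); it is only
    called when the file is not cached and OFFLINE=1 is not set.
    """
    path = cached_path(url, cache_dir)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()

    # Hypothetical OFFLINE switch: fail fast instead of touching the network.
    if os.environ.get("OFFLINE") == "1":
        raise OfflineError(
            f"{url} not present in {cache_dir}; "
            "not downloading because OFFLINE=1 is set"
        )

    if download is None:
        raise ValueError("no download function supplied")

    data = download(url)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return data
```

With something like this, an offline user would get a clear message naming the missing file instead of a network traceback from deep inside the loader.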

Thanks!

ParetoOptimalDev avatar Dec 27 '23 19:12 ParetoOptimalDev

Same question: how can tiktoken be used offline?

jinmingyi1998 avatar Jan 05 '24 06:01 jinmingyi1998

https://stackoverflow.com/questions/76106366/how-to-use-tiktoken-in-offline-mode-computer

I found this

jinmingyi1998 avatar Jan 05 '24 06:01 jinmingyi1998

https://stackoverflow.com/questions/76106366/how-to-use-tiktoken-in-offline-mode-computer

I found this

That solution works. Tested it myself.

Thank you for finding it.

ForkInABlender avatar Mar 11 '24 06:03 ForkInABlender
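
For anyone landing here without access to the link, the approach it describes can be sketched roughly as follows. This assumes tiktoken's loader looks in the directory named by the TIKTOKEN_CACHE_DIR environment variable for a file whose name is the SHA-1 hex digest of the download URL; check tiktoken/load.py in your installed version to confirm. The helper names cache_filename and install_into_cache are made up for this sketch, and the URL to use should be copied from your installed tiktoken_ext/openai_public.py rather than taken from here.

```python
import hashlib
import os
import shutil


def cache_filename(blob_url: str) -> str:
    # Assumption: the cache key is the SHA-1 hex digest of the exact URL string.
    return hashlib.sha1(blob_url.encode()).hexdigest()


def install_into_cache(blob_url: str, downloaded_file: str, cache_dir: str) -> str:
    """Copy a file fetched on a connected machine into the offline cache.

    Returns the path of the cached copy.
    """
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, cache_filename(blob_url))
    shutil.copyfile(downloaded_file, target)
    return target


# Usage on the offline machine (paths and URL are placeholders):
#
#   install_into_cache(url_from_openai_public_py, "vocab.bpe", "/opt/tiktoken_cache")
#   os.environ["TIKTOKEN_CACHE_DIR"] = "/opt/tiktoken_cache"  # set before loading
#   import tiktoken
#   enc = tiktoken.get_encoding("gpt2")
```

Encodings that are built from more than one file need every file installed this way, each under the name derived from its own URL. The copied file should be the unmodified download, since the loader may verify its contents.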