
[FR] Add `--offline`

Open NightMachinery opened this issue 1 year ago • 3 comments

There are only inconvenient workarounds for using this library without an internet connection (the network check adds considerable latency on unstable networks). This use case deserves official support. I propose bundling the latest versions of the tokenizers with the pip package and using them without checking for updates when the user supplies --offline. The tokenizers can still be updated any time the user does not use this flag.

To summarize, I propose two changes:

  • Ship the needed tokenizers with the pip package, so that it is ready to use immediately after installation.
  • Auto-update these tokenizers whenever the user does NOT supply --offline.

PS: I have skimmed the workaround, and while it works for offline usage, I am not sure it solves the latency issue on an online machine. The workaround also requires many manual steps; it's not just a script we can run once and be done with.
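To illustrate, the proposed behavior could look roughly like this. This is only a hypothetical sketch (the function, the bundled-data directory, and the TIKTOKEN_OFFLINE variable are all made-up names, not tiktoken's actual API):

```python
import os


def load_tokenizer_bytes(name, bundled_dir, download):
    """Hypothetical sketch of the proposed --offline behavior."""
    bundled_path = os.path.join(bundled_dir, f"{name}.tiktoken")
    if os.environ.get("TIKTOKEN_OFFLINE") == "1":
        # Offline: use the tokenizer data shipped with the package,
        # never touching the network.
        with open(bundled_path, "rb") as f:
            return f.read()
    # Online: refresh the tokenizer data from the network as today.
    return download(name)
```

The point is only that the offline path reads local data unconditionally, while the online path keeps the current update behavior.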

Related:

  • https://github.com/openai/tiktoken/issues/279
  • https://github.com/openai/tiktoken/issues/232

NightMachinery avatar Jul 06 '24 23:07 NightMachinery

After taking a look at the code, it seems just setting TIKTOKEN_CACHE_DIR on an online machine should be enough to avoid redownloading these files. But the latency is still high. I use this script:

#!/usr/bin/env python3
import os
import sys

# Point tiktoken at a local cache directory so the BPE files are not
# re-downloaded on every run. The variable is read when an encoding is
# loaded, so it must be set before the first encoding_for_model() call.
HOME = os.environ["HOME"]
os.environ["TIKTOKEN_CACHE_DIR"] = f"{HOME}/tmp/tiktoken_cache"

import tiktoken


def num_tokens_from_message(message, model="gpt-4"):
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(message))


message = sys.stdin.read()
print(num_tokens_from_message(message, "gpt-4"))

And I have ~0.26s latency for counting a single token!

❯ echo | time openai_token_count.py
1
openai_token_count.py  0.28s user 0.04s system 94% cpu 0.339 total; max RSS 93200

This kind of latency is terrible.
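For what it's worth, most of that time is per-process cost: interpreter startup plus constructing the encoding. Since tiktoken caches encodings within a process, a long-running process pays the cost once and subsequent calls are cheap. A generic sketch of that pattern, using a stand-in loader rather than tiktoken itself:

```python
import functools
import time


@functools.lru_cache(maxsize=None)
def get_encoding(name):
    # Stand-in for an expensive load such as tiktoken.get_encoding(name).
    time.sleep(0.25)
    return {"name": name}


t0 = time.perf_counter()
get_encoding("cl100k_base")  # first call pays the load cost
first_call = time.perf_counter() - t0

t0 = time.perf_counter()
get_encoding("cl100k_base")  # served from the in-process cache
second_call = time.perf_counter() - t0
```

This doesn't help a one-shot CLI script, which is exactly why the startup latency matters there.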

NightMachinery avatar Jul 06 '24 23:07 NightMachinery

@NightMachinery The issue of the .tiktoken file getting deleted on Linux has been fixed. I can now use tiktoken in offline mode by setting the TIKTOKEN_CACHE_DIR environment variable.

import os

# Must be set before the first encoding is loaded.
os.environ["TIKTOKEN_CACHE_DIR"] = "path/to/tiktoken_dir"

# rest of tiktoken code
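If you want to seed that cache directory by hand on an offline machine, note that (at the time of writing) tiktoken names cache files by the SHA-1 hex digest of the source URL. This is an internal implementation detail and may change between versions, so treat this as a sketch:

```python
import hashlib

# URL of the cl100k_base BPE file that tiktoken would normally download.
blob_url = (
    "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"
)

# tiktoken looks for the cached copy under this filename inside
# TIKTOKEN_CACHE_DIR (implementation detail; may change between versions).
cache_key = hashlib.sha1(blob_url.encode()).hexdigest()
print(cache_key)
```

Downloading the .tiktoken file on a connected machine and saving it under that filename in the cache directory makes the offline machine find it without any network access.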

nkilm avatar Aug 22 '24 16:08 nkilm

@nkilm Could you report the latency you see? Even with the cached file no longer being deleted, the latency is still terrible, isn't it?

NightMachinery avatar Aug 23 '24 07:08 NightMachinery