➜ paul_graham_essay git:(main) ✗ python3 TestEssay.py
Traceback (most recent call last):
  File "/Users/fengwei/temp/gpt_index/examples/paul_graham_essay/TestEssay.py", line 42, in <module>
    index = GPTTreeIndex(documents)
  File "/Users/fengwei/Library/Python/3.9/lib/python/site-packages/llama_index/indices/tree/base.py", line 74, in __init__
    super().__init__(
  File "/Users/fengwei/Library/Python/3.9/lib/python/site-packages/llama_index/indices/base.py", line 83, in __init__
    self._embed_model = embed_model or OpenAIEmbedding()
  File "/Users/fengwei/Library/Python/3.9/lib/python/site-packages/llama_index/embeddings/openai.py", line 208, in __init__
    super().__init__()
  File "/Users/fengwei/Library/Python/3.9/lib/python/site-packages/llama_index/embeddings/base.py", line 55, in __init__
    self._tokenizer: Callable = globals_helper.tokenizer
  File "/Users/fengwei/Library/Python/3.9/lib/python/site-packages/llama_index/utils.py", line 38, in tokenizer
    enc = tiktoken.get_encoding("gpt2")
  File "/Users/fengwei/Library/Python/3.9/lib/python/site-packages/tiktoken/registry.py", line 63, in get_encoding
    enc = Encoding(**constructor())
  File "/Users/fengwei/Library/Python/3.9/lib/python/site-packages/tiktoken_ext/openai_public.py", line 11, in gpt2
    mergeable_ranks = data_gym_to_mergeable_bpe_ranks(
  File "/Users/fengwei/Library/Python/3.9/lib/python/site-packages/tiktoken/load.py", line 90, in data_gym_to_mergeable_bpe_ranks
    assert bpe_ranks == encoder_json_loaded
AssertionError
Looks like an issue in the tiktoken installation. Could you check which version is installed in your environment?
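One quick way to check, as a minimal sketch: the standard-library `importlib.metadata` module can report the installed version of a package. The helper name `installed_version` is made up for this example, and `llama-index` is assumed to be the distribution name the package was installed under; adjust it if yours differs.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not present in this interpreter's environment.
        return None


# Hypothetical helper; prints None for a package that is not installed.
print("tiktoken:", installed_version("tiktoken"))
print("llama-index:", installed_version("llama-index"))
```

Run it with the same interpreter that produced the traceback (here `python3`), since a different interpreter may see a different site-packages directory.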
I'm hitting almost the same error, and I'm using tiktoken 0.3.1.
Duplicate of https://github.com/jerryjliu/llama_index/issues/738