cannot pickle 'builtins.CoreBPE' object
The example Chatbot_SEC.ipynb fails with the following error (I just replaced gpt_index. with llama_index. for the imports):
```
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
/tmp/ipykernel_3445500/2033794993.py in <cell line: 22>()
     20 )
     21
---> 22 toolkit = LlamaToolkit(
     23     index_configs=index_configs,
     24     graph_configs=[graph_config]

/opt/conda/lib/python3.9/site-packages/pydantic/main.cpython-39-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()

ValidationError: 5 validation errors for LlamaToolkit
index_configs -> 0
  cannot pickle 'builtins.CoreBPE' object (type=type_error)
index_configs -> 1
  cannot pickle 'builtins.CoreBPE' object (type=type_error)
index_configs -> 2
  cannot pickle 'builtins.CoreBPE' object (type=type_error)
index_configs -> 3
  cannot pickle 'builtins.CoreBPE' object (type=type_error)
graph_configs -> 0
  cannot pickle 'builtins.CoreBPE' object (type=type_error)
```
I tried it on two different environments, and the result was the same.
ChatGPT says that this is some package-level variable from HuggingFace...
Opening the pickle:
```python
{'fail': IndexToolConfig(index=<llama_index.indices.vector_store.vector_indices.GPTSimpleVectorIndex object at 0x000001EAD60244F0>,
                         name='Vector Index for Micha Josef Berdyczewski',
                         description='useful for when you want to answer queries about the Micha Josef Berdyczewski',
                         index_query_kwargs={'similarity_top_k': 3},
                         tool_kwargs={'return_direct': True}),
 'err': TypeError("cannot pickle 'builtins.CoreBPE' object"),
 'depth': 10,
 'failing_children': [
     {'fail': <llama_index.indices.vector_store.vector_indices.GPTSimpleVectorIndex at 0x1ead60244f0>,
      'err': TypeError("cannot pickle 'builtins.CoreBPE' object"),
      'depth': 9,
      'failing_children': [
          {'fail': <llama_index.embeddings.openai.OpenAIEmbedding at 0x1ead9935070>,
           'err': TypeError("cannot pickle 'builtins.CoreBPE' object"),
           'depth': 8,
           'failing_children': [
               {'fail': <bound method Encoding.encode of <Encoding 'gpt2'>>,
                'err': TypeError("cannot pickle 'builtins.CoreBPE' object"),
                'depth': 7,
                'failing_children': []}]},
          {'fail': <llama_index.indices.prompt_helper.PromptHelper at 0x1ead930a0a0>,
           'err': TypeError("cannot pickle 'builtins.CoreBPE' object"),
           'depth': 8,
           'failing_children': [
               {'fail': <bound method Encoding.encode of <Encoding 'gpt2'>>,
                'err': TypeError("cannot pickle 'builtins.CoreBPE' object"),
                'depth': 7,
                'failing_children': []}]},
          {'fail': <llama_index.langchain_helpers.text_splitter.TokenTextSplitter at 0x1ead9935220>,
           'err': TypeError("cannot pickle 'builtins.CoreBPE' object"),
           'depth': 8,
           'failing_children': [
               {'fail': <bound method Encoding.encode of <Encoding 'gpt2'>>,
                'err': TypeError("cannot pickle 'builtins.CoreBPE' object"),
                'depth': 7,
                'failing_children': []}]}]}]}
```
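A debug dict like the one above can be produced by a small recursive probe that tries to pickle each attribute of the failing object. Below is a hedged, library-free sketch of the idea; `FakeIndex` and `find_unpicklable` are hypothetical names, and a thread lock stands in for tiktoken's unpicklable `CoreBPE` encoder.

```python
import pickle
import threading

def find_unpicklable(obj, depth=3):
    """Recursively pickle an object's attributes to locate the child
    that makes pickling fail (the same idea as the debug dict above).
    Returns None when obj pickles cleanly."""
    try:
        pickle.dumps(obj)
        return None
    except Exception as err:
        failing_children = []
        if depth > 0 and hasattr(obj, "__dict__"):
            for name, child in vars(obj).items():
                report = find_unpicklable(child, depth - 1)
                if report is not None:
                    failing_children.append((name, report))
        return {"fail": type(obj).__name__,
                "err": str(err),
                "failing_children": failing_children}

# Stand-in for an index object holding an unpicklable handle (a thread
# lock here; in llama_index it is tiktoken's CoreBPE encoder).
class FakeIndex:
    def __init__(self):
        self.name = "this part pickles fine"
        self._lock = threading.Lock()  # cannot be pickled

report = find_unpicklable(FakeIndex())
print(report["fail"])                    # FakeIndex
print(report["failing_children"][0][0])  # _lock
```

Walking the `failing_children` chain down to the leaf is what pinpoints the bound `Encoding.encode` method as the actual culprit in the output above.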
@vadimber, does the example work using the gpt-index imports?
As I noted at the start of the issue, I replaced gpt_index. with llama_index. for the imports.
I just tried installing gpt-index 0.4.38 and replacing all llama_index imports with gpt_index; the result is exactly the same.
This should be working with the latest version of llama-index (0.6.20). Going to close for now, feel free to re-open if needed
@logan-markewich I've faced the same issue here. I tried to build some indexes from documents in parallel using multiprocessing.Pool, and I can't load the indexes because of this TypeError.

I don't know if it's relevant enough to reopen the issue, since it only happens when I try to multiprocess it.

I'm running version 0.6.30.
@LucasMallmann yeah, multiprocessing needs to pickle the return values of the worker function, and it looks like the vector store index object can't be pickled.

I don't think this will be easily fixable, but I'm also not too familiar with the error happening here.
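One common workaround is to avoid returning the index object from the worker at all: each worker persists its index to disk and returns only the (picklable) path, and the parent reloads from disk. In llama_index the persist step would be something along the lines of `index.storage_context.persist(...)` plus `load_index_from_storage(...)` (verify against your version). Below is a library-free sketch of the pattern; `FakeIndex` and `build_index_worker` are hypothetical, and `pickle.dumps` stands in for what `multiprocessing.Pool` does to worker results.

```python
import json
import pickle
import tempfile
import threading
from pathlib import Path

class FakeIndex:
    """Stand-in for a vector index: it holds an unpicklable handle
    (a thread lock here, where llama_index holds tiktoken's CoreBPE)
    plus plain data that can be persisted separately."""
    def __init__(self, data):
        self.data = data
        self._tokenizer = threading.Lock()  # cannot be pickled

def build_index_worker(doc_name, workdir):
    """What each Pool worker would run: build the index, persist it to
    disk, and return only the path. multiprocessing pickles worker
    return values, so a plain string crosses the process boundary."""
    index = FakeIndex({"doc": doc_name})
    out = Path(workdir) / f"{doc_name}.json"
    out.write_text(json.dumps(index.data))
    return str(out)

# Returning the index object itself is what fails under Pool:
pickle_error = None
try:
    pickle.dumps(FakeIndex({"doc": "a"}))
except TypeError as err:
    pickle_error = str(err)

# Returning the persisted path works; the parent reloads from disk:
with tempfile.TemporaryDirectory() as workdir:
    path = build_index_worker("10-K", workdir)
    pickle.dumps(path)  # fine: strings are picklable
    reloaded = json.loads(Path(path).read_text())

print(pickle_error)
print(reloaded)
```

The same shape applies whether the worker writes JSON, a llama_index persist dir, or rows in a vector database: only plain, serializable handles should cross the process boundary.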
I am getting this issue while trying to pickle a data agent. I'm not sure if there is a way to preserve an agent between API calls other than pickling and restoring it; if so, I'd love to know about it.
```python
agent = OpenAIAgent.from_tools(
    [
        *medical_spec.to_tool_list(),
        conversations_10k_tool,
    ],
    llm=llm,
    verbose=True,
)
# do some chatting with the agent
...
# then try to pickle the agent
```
Then I get TypeError: cannot pickle 'builtins.CoreBPE' object.
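Rather than pickling the agent itself (which drags in the unpicklable tokenizer), one pattern for preserving it between API calls is to persist only the conversation state and rebuild the agent from its tools on each request. The sketch below shows the shape of that pattern with plain data; `save_history`, `load_history`, and the `chat_history=` argument in the comment are assumptions, not confirmed llama_index API.

```python
import json

# Hypothetical helpers: serialize the conversation as plain
# role/content records instead of the agent object, which holds
# the unpicklable CoreBPE tokenizer.
def save_history(history):
    return json.dumps(history)

def load_history(blob):
    return json.loads(blob)

# End of one API call: persist only the chat state ...
history = [
    {"role": "user", "content": "Summarize the 10-K filing."},
    {"role": "assistant", "content": "Here is a summary..."},
]
blob = save_history(history)

# ... start of the next call: rebuild the agent from its tools and
# seed it with the restored history. With llama_index this would look
# roughly like (assumed API, verify against your version):
#   agent = OpenAIAgent.from_tools(tools, llm=llm,
#                                  chat_history=to_messages(load_history(blob)))
restored = load_history(blob)
print(restored == history)  # True
```

Rebuilding from tools is cheap compared to pickling, since the tools and LLM client are reconstructed from configuration rather than serialized state.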
Related: https://github.com/jerryjliu/llama_index/issues/7169