llama_index

docstore-->docs-->__type__: "simple_dict" error

Open BadAstronaut opened this issue 2 years ago • 6 comments

Hi, I'm following the TestEssay.ipynb example but changing the txt file for a PDF. I'm able to do the data ingest to generate the .json file, but when I try to query the data I'm getting this error.

  File "C:\Users\rzlaz\OneDrive\Documents\Development\LLM\gpt-oguc-V1\query_data.py", line 16, in <module>
    new_index = GPTTreeIndex.load_from_disk('index.json')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rzlaz\AppData\Local\Programs\Python\Python311\Lib\site-packages\gpt_index\indices\base.py", line 443, in load_from_disk
    return cls.load_from_string(file_contents, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rzlaz\AppData\Local\Programs\Python\Python311\Lib\site-packages\gpt_index\indices\base.py", line 415, in load_from_string
    docstore = DocumentStore.load_from_dict(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rzlaz\AppData\Local\Programs\Python\Python311\Lib\site-packages\gpt_index\docstore.py", line 61, in load_from_dict
    raise ValueError(
ValueError: doc_type simple_dict not found in type_to_struct. Make sure that it was registered in the index registry.

I get that GPTSimpleVectorIndex is adding an unknown type "simple_dict", but I have no idea how to fix it.
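
One way to check which index type wrote the file is to read the `__type__` entries straight from the saved JSON. This is a minimal sketch, assuming the layout named in the issue title (`docstore` → `docs` → `__type__`); the helper name `saved_doc_types` and the example payload are hypothetical and not part of the library.

```python
import json


def saved_doc_types(index_json: str) -> set:
    """Collect every `__type__` value found under docstore -> docs.

    Assumes the layout suggested by the issue title:
    {"docstore": {"docs": {<doc_id>: {"__type__": ...}}}}
    """
    data = json.loads(index_json)
    docs = data.get("docstore", {}).get("docs", {})
    return {
        doc["__type__"]
        for doc in docs.values()
        if isinstance(doc, dict) and "__type__" in doc
    }


# Made-up example payload; for a real file, pass the contents of index.json.
example = json.dumps(
    {"docstore": {"docs": {"abc": {"__type__": "simple_dict"}}}}
)
print(saved_doc_types(example))
```

If the set contains `simple_dict` while the loading script calls `GPTTreeIndex.load_from_disk`, the file may have been saved by a different index class than the one used to load it.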

BadAstronaut, Feb 10 '23 17:02

Hi, could you paste the code? It seems like you're loading a file from disk; how are you doing that?

jerryjliu, Feb 11 '23 13:02

@BadAstronaut bump on this

jerryjliu, Mar 06 '23 21:03

Good morning. I'm also looking at the same or a similar error. The type_to_struct dictionary has the value {<IndexStructType.SIMPLE_DICT: 'simple_dict'>: <class 'llama_index.data_structs.data_structs.SimpleIndexDict'>}, while the doc_type is 'keyword_table'.
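
The mismatch described above can be reproduced with a small stand-in for the registry lookup. This is a sketch only: `TYPE_TO_STRUCT` and `load_doc` are hypothetical names that mimic the check implied by the error message, not the library's actual code.

```python
# Hypothetical stand-in for the per-index registry: the loading class
# can only rebuild the struct types that are registered here.
TYPE_TO_STRUCT = {"simple_dict": dict}


def load_doc(doc_type: str, payload: dict):
    """Rebuild a stored struct, failing if its type is not registered."""
    if doc_type not in TYPE_TO_STRUCT:
        raise ValueError(
            f"doc_type {doc_type} not found in type_to_struct. "
            "Make sure that it was registered in the index registry."
        )
    return TYPE_TO_STRUCT[doc_type](payload)


# A registered type loads without error.
print(load_doc("simple_dict", {"k": "v"}))

# A type written by a different index class is rejected.
try:
    load_doc("keyword_table", {"k": "v"})
except ValueError as err:
    print(err)
```

Under this reading, the error means the saved file contains a struct type that the loading index class never registered, which is consistent with a file written by one index type (or library version) being loaded by another.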

This all resulted from loading a (very large) file from disk, saving the index, and then loading the index with the GPTSimpleVectorIndex.load_from_disk([index name]) method.

Any insights that you have would be appreciated.

jhesselgesser, Mar 08 '23 17:03

Hi @jhesselgesser, are you using the simple vector index or a keyword table index? Also, are you using our composability framework?

As a simple test, you could try rebuilding your index over a small slice of your data on the latest version of llama-index, then saving and loading it. Let me know if that doesn't work.

jerryjliu, Mar 08 '23 17:03

TY @jerryjliu that does work!

jhesselgesser, Mar 08 '23 19:03

👍 Yeah, this may be due to some slight breaking changes between versions. Thanks for surfacing.

jerryjliu, Mar 08 '23 19:03

Closing since issue is resolved.

Disiok, Mar 16 '23 17:03