
SearchRunner raises an error

Open Jinxinxiang5525 opened this issue 1 year ago • 9 comments

Why am I getting this error? The colon is followed by a list containing a long run of numbers. Could it be that the embedding did not come back?

Traceback (most recent call last):
  File "/home/jin/jxx/graphrag_api/api.py", line 23, in <module>
    search()
  File "/home/jin/jxx/graphrag_api/api.py", line 16, in search
    search_runner = SearchRunner(root_dir="rag")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jin/jxx/graphrag_api/graphrag_api/search.py", line 74, in __init__
    self.__drift_agent = self.__get__drift_agent()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jin/jxx/graphrag_api/graphrag_api/search.py", line 210, in __get__drift_agent
    config = self._patch_vector_store(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jin/jxx/graphrag_api/graphrag_api/search.py", line 526, in _patch_vector_store
    _reports = read_indexer_reports(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/jin/anaconda3/envs/rag-app/lib/python3.11/site-packages/graphrag/query/indexer_adapters.py", line 126, in read_indexer_reports
    return read_community_reports(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jin/anaconda3/envs/rag-app/lib/python3.11/site-packages/graphrag/query/input/loaders/dfs.py", line 240, in read_community_reports
    full_content_embedding=to_optional_list(
                           ^^^^^^^^^^^^^^^^^
  File "/home/jin/anaconda3/envs/rag-app/lib/python3.11/site-packages/graphrag/query/input/loaders/utils.py", line 91, in to_optional_list
    raise TypeError(msg)
TypeError: list item has item that is not <class 'float'>: [0.7341635227203369, -0.9428284764289856, 3.2654314041137695, 1.835994005203247, 3.807887077331543, 1.2793670892715454, 0.8283791542053223, 1.0401357412338257, 6.14626407623291, ..., 3.9657492637634277] (<class 'list'>)

Jinxinxiang5525, Dec 18 '24 01:12
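
The trailing `(<class 'list'>)` in the message says that the offending item is itself a list. One plausible reading is that the stored embedding has an extra level of nesting (something like `[[0.73, -0.94, ...]]`) where a flat list of floats is expected. The sketch below illustrates that situation in plain Python. The helper names `find_non_float_items` and `unwrap_embedding` are hypothetical and are not part of graphrag or graphrag_api, and the sample values are only the first three numbers from the error above.

```python
from typing import Any, List, Optional


def find_non_float_items(values: List[Any]) -> List[int]:
    """Return the indexes of items that are not plain Python floats."""
    return [i for i, v in enumerate(values) if not isinstance(v, float)]


def unwrap_embedding(values: Optional[List[Any]]) -> Optional[List[float]]:
    """Flatten an embedding wrapped in extra list levels, e.g. [[0.1, 0.2]]."""
    if values is None:
        return None
    flat: List[float] = []
    for v in values:
        if isinstance(v, (list, tuple)):
            # Nested level found: recurse and splice its items into the result.
            flat.extend(unwrap_embedding(list(v)))
        else:
            flat.append(float(v))
    return flat


# Assumed shape of the problematic value: one vector wrapped in an outer list.
nested = [[0.7341635227203369, -0.9428284764289856, 3.2654314041137695]]

print(find_non_float_items(nested))  # [0]  (item 0 is a list, not a float)
flat = unwrap_embedding(nested)
print(find_non_float_items(flat))    # []   (every item is now a float)
```

Running a check like this on the `full_content_embedding` column of the community reports would show whether the nesting comes from the modified embedding code or from the library itself.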

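Version 0.3.6 works fine. After the update to 0.5.x I found a bug in the official graphrag. If I patched it on my side, it would be hard to stay in sync later, so I have not changed it for now. Once the official fix lands, I will adapt my code to it.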

nightzjp, Dec 18 '24 01:12

OK, I'll switch to 0.3.6 and try that.

Jinxinxiang5525, Dec 18 '24 01:12

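> OK, I'll switch to 0.3.6 and try that.

Sure. This project is not very different from the official one. Early on there was no official API, so I put a simple one together. There is an official API now, but I don't find it convenient enough to use. Over the weekend I will check whether the latest official release has fixed the bug, and adapt my code then.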

nightzjp, Dec 18 '24 01:12

I tried 0.3.6 and it does not work either. I am running with ollama, so I modified embedding.py and openai_embeddings_llm.py in the graphrag library. At this point, python -m graphrag query --root ./ --method local --query runs successfully from the command line, and your indexer = GraphRagIndexer(root="rag") followed by indexer.run() also runs successfully. The search part seems to find all of its files, and then lancedb prints this:

[2024-12-18T02:42:18Z WARN lance::dataset] No existing dataset at /home/jin/jxx/graphrag_api/rag/output/20241216-235753/artifacts/lancedb/default-entity-description.lance, it will be created

INFO: Vector Store Args: {
    "type": "lancedb",
    "db_uri": "/home/jin/jxx/graphrag_api/rag/output/20241216-235753/artifacts/lancedb",
    "container_name": "==== REDACTED ====",
    "overwrite": true
}

After that it fails with the same error as above, TypeError: list item has item that is not <class 'float'>. Is this caused by my embedding changes (even though the command line works), or is this the bug you mentioned?

Jinxinxiang5525, Dec 18 '24 02:12

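TypeError: list item has item that is not <class 'float'>: this error comes from the official graphrag, and I see it has already been fixed there. I am now updating my corresponding code to match. I will push an update once testing passes.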

nightzjp, Dec 18 '24 02:12

Oh, OK. Thank you, I appreciate the effort!

Jinxinxiang5525, Dec 18 '24 02:12

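> Oh, OK. Thank you, I appreciate the effort!

Versions 0.9 and 1.0 both have problems (I cannot even get the official examples to run as-is). You can use an earlier version for now.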

nightzjp, Dec 18 '24 08:12

Why do I still get an error with version 0.3.5?

  File "pyarrow/array.pxi", line 1115, in pyarrow.lib.Array.from_pandas
  File "pyarrow/array.pxi", line 339, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 85, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: only handle 1-dimensional arrays

Jinxinxiang5525, Dec 26 '24 13:12

Solved it: add query = query.flatten() in dataset.py.

Jinxinxiang5525, Dec 26 '24 13:12
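
The ArrowInvalid message indicates that pyarrow was handed a NumPy array with more than one dimension, which fits a query embedding that carries a leading batch axis (shape (1, N) rather than (N,)). The sketch below shows only what the reported `query.flatten()` fix does to such an array. The sample values are made up, and the thread does not say which dataset.py was edited or at which line, so that part is left open.

```python
import numpy as np

# Hypothetical query embedding returned with a leading batch axis: shape (1, 4).
query = np.array([[0.1, 0.2, 0.3, 0.4]], dtype=np.float32)
print(query.ndim, query.shape)  # 2 (1, 4)

# flatten() returns a 1-D copy of the data; the original array is left unchanged.
flat_query = query.flatten()
print(flat_query.ndim, flat_query.shape)  # 1 (4,)
```

Because `flatten()` always copies, it is safe even if the caller reuses the original array. It is a no-op in shape terms when the query is already one-dimensional.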