Error after replacing the original OceanBase knowledge base with custom files

Open • xuji755 opened this issue 2 years ago • 1 comment

I deleted the OceanBase knowledge base from datasets and added my own knowledge base files. Vectorization then failed with the following error:

2023-05-18 17:13:25 | INFO | sentence_transformers.SentenceTransformer | Load pretrained SentenceTransformer: /home/dbgpt/DB-GPT/models/all-MiniLM-L6-v2
2023-05-18 17:13:26 | INFO | sentence_transformers.SentenceTransformer | Use pytorch device: cuda
2023-05-18 17:13:26 | INFO | stdout | 向量数据库持久化地址 [vector database persistence path]: /home/dbgpt/DB-GPT/pilot/vector_store/.vectordb
2023-05-18 17:13:26 | INFO | unstructured | Reading document from string ...
2023-05-18 17:13:26 | INFO | unstructured | Reading document ...
2023-05-18 17:13:26 | INFO | stdout | 文档2向量初始化中, 请稍等... [initializing vectors for document 2, please wait...] {'source': '/oracle/oracle.md'}
2023-05-18 17:13:26 | INFO | chromadb.telemetry.posthog | Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
2023-05-18 17:13:26 | INFO | chromadb | Running Chroma using direct local API.
2023-05-18 17:13:26 | WARNING | chromadb | Using embedded DuckDB with persistence: data will be stored in: /home/dbgpt/DB-GPT/pilot/vector_store/.vectordb
2023-05-18 17:13:26 | INFO | clickhouse_connect.driver.ctypes | Successfully imported ClickHouse Connect C data optimizations
2023-05-18 17:13:26 | INFO | clickhouse_connect.driver.ctypes | Successfully import ClickHouse Connect C/Numpy optimizations
2023-05-18 17:13:26 | INFO | clickhouse_connect.json_impl | Using orjson library for writing JSON byte strings
2023-05-18 17:13:26 | INFO | chromadb.db.duckdb | No existing DB found in /home/dbgpt/DB-GPT/pilot/vector_store/.vectordb, skipping load
2023-05-18 17:13:26 | INFO | chromadb.db.duckdb | No existing DB found in /home/dbgpt/DB-GPT/pilot/vector_store/.vectordb, skipping load
Batches: 0%| | 0/1 [00:00<?, ?it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 1.90it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 1.89it/s]
2023-05-18 17:13:27 | ERROR | stderr |
2023-05-18 17:13:27 | INFO | chromadb.db.duckdb | Persisting DB to disk, putting it in the save folder: /home/dbgpt/DB-GPT/pilot/vector_store/.vectordb
Batches: 0%| | 0/1 [00:00<?, ?it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 365.77it/s]
2023-05-18 17:13:27 | ERROR | stderr |
2023-05-18 17:13:27 | ERROR | stderr | Traceback (most recent call last):
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/gradio/routes.py", line 394, in run_predict
2023-05-18 17:13:27 | ERROR | stderr |     output = await app.get_blocks().process_api(
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1075, in process_api
2023-05-18 17:13:27 | ERROR | stderr |     result = await self.call_function(
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/gradio/blocks.py", line 898, in call_function
2023-05-18 17:13:27 | ERROR | stderr |     prediction = await anyio.to_thread.run_sync(
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
2023-05-18 17:13:27 | ERROR | stderr |     return await get_asynclib().run_sync_in_worker_thread(
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
2023-05-18 17:13:27 | ERROR | stderr |     return await future
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
2023-05-18 17:13:27 | ERROR | stderr |     result = context.run(func, *args)
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/gradio/utils.py", line 549, in async_iteration
2023-05-18 17:13:27 | ERROR | stderr |     return next(iterator)
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/DB-GPT/pilot/server/webserver.py", line 225, in http_bot
2023-05-18 17:13:27 | ERROR | stderr |     state.messages[-2][1] = knqa.get_similar_answer(query)
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/DB-GPT/pilot/server/vectordb_qa.py", line 25, in get_similar_answer
2023-05-18 17:13:27 | ERROR | stderr |     docs = retriever.get_relevant_documents(query=query)
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/langchain/vectorstores/base.py", line 279, in get_relevant_documents
2023-05-18 17:13:27 | ERROR | stderr |     docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 138, in similarity_search
2023-05-18 17:13:27 | ERROR | stderr |     docs_and_scores = self.similarity_search_with_score(query, k, filter=filter)
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 184, in similarity_search_with_score
2023-05-18 17:13:27 | ERROR | stderr |     results = self._collection.query(
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/chromadb/api/models/Collection.py", line 219, in query
2023-05-18 17:13:27 | ERROR | stderr |     return self._client._query(
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/chromadb/api/local.py", line 408, in _query
2023-05-18 17:13:27 | ERROR | stderr |     uuids, distances = self._db.get_nearest_neighbors(
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/chromadb/db/clickhouse.py", line 583, in get_nearest_neighbors
2023-05-18 17:13:27 | ERROR | stderr |     uuids, distances = index.get_nearest_neighbors(embeddings, n_results, ids)
2023-05-18 17:13:27 | ERROR | stderr |   File "/home/dbgpt/.local/lib/python3.10/site-packages/chromadb/db/index/hnswlib.py", line 238, in get_nearest_neighbors
2023-05-18 17:13:27 | ERROR | stderr |     raise NotEnoughElementsException(
2023-05-18 17:13:27 | ERROR | stderr | chromadb.errors.NotEnoughElementsException: Number of requested results 5 cannot be greater than number of elements in index 1
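The final exception reads as a size mismatch: the query asked for 5 nearest neighbours while the index held only 1 element. One way to picture a guard against this, as a minimal sketch only: the helper `clamp_k` below is hypothetical and is not part of DB-GPT, LangChain, or Chroma, and the two numbers are simply the ones quoted in the exception message.

```python
def clamp_k(requested_k: int, index_size: int) -> int:
    """Return how many neighbours can safely be requested from an index.

    Hypothetical helper for illustration; not taken from any library.
    """
    if requested_k < 1:
        raise ValueError("requested_k must be at least 1")
    if index_size < 1:
        raise ValueError("index is empty; nothing to search")
    # Never ask for more results than there are elements in the index.
    return min(requested_k, index_size)


# Values from the exception message: 5 results requested, 1 element indexed.
k = clamp_k(requested_k=5, index_size=1)
print(k)
```

In the traceback the result count reaches `similarity_search` through the retriever's `search_kwargs`, so a clamped value would presumably be supplied there. How the index size is obtained depends on the installed versions and is left out of the sketch.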

xuji755 • May 18 '23 09:05

Are you running the latest code? Could you try pulling the latest code and see whether it still happens?

Aries-ckt • May 22 '23 03:05