[BUG]: Adding a new knowledge base frequently fails

Open ZERO-A-ONE opened this issue 1 year ago • 4 comments

This error often appears after I add a new knowledge base and upload a file:

2023-05-20 16:21:15 | ERROR | stderr | Traceback (most recent call last):
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/gradio/routes.py", line 394, in run_predict
2023-05-20 16:21:15 | ERROR | stderr |     output = await app.get_blocks().process_api(
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/gradio/blocks.py", line 1075, in process_api
2023-05-20 16:21:15 | ERROR | stderr |     result = await self.call_function(
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/gradio/blocks.py", line 884, in call_function
2023-05-20 16:21:15 | ERROR | stderr |     prediction = await anyio.to_thread.run_sync(
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
2023-05-20 16:21:15 | ERROR | stderr |     return await get_asynclib().run_sync_in_worker_thread(
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
2023-05-20 16:21:15 | ERROR | stderr |     return await future
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
2023-05-20 16:21:15 | ERROR | stderr |     result = context.run(func, *args)
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/DB-GPT/pilot/server/webserver.py", line 621, in knowledge_embedding_store
2023-05-20 16:21:15 | ERROR | stderr |     knowledge_embedding_client.knowledge_embedding()
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/DB-GPT/pilot/source_embedding/knowledge_embedding.py", line 28, in knowledge_embedding
2023-05-20 16:21:15 | ERROR | stderr |     self.knowledge_embedding_client.source_embedding()
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/DB-GPT/pilot/source_embedding/source_embedding.py", line 70, in source_embedding
2023-05-20 16:21:15 | ERROR | stderr |     text = self.read()
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/DB-GPT/pilot/source_embedding/pdf_embedding.py", line 27, in read
2023-05-20 16:21:15 | ERROR | stderr |     return loader.load_and_split(textsplitter)
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/langchain/document_loaders/base.py", line 25, in load_and_split
2023-05-20 16:21:15 | ERROR | stderr |     docs = self.load()
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/langchain/document_loaders/pdf.py", line 99, in load
2023-05-20 16:21:15 | ERROR | stderr |     pdf_reader = pypdf.PdfReader(pdf_file_obj)
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/pypdf/_reader.py", line 322, in __init__
2023-05-20 16:21:15 | ERROR | stderr |     self.read(stream)
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/pypdf/_reader.py", line 1505, in read
2023-05-20 16:21:15 | ERROR | stderr |     self._basic_validation(stream)
2023-05-20 16:21:15 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/pypdf/_reader.py", line 1550, in _basic_validation
2023-05-20 16:21:15 | ERROR | stderr |     raise EmptyFileError("Cannot read an empty file")
2023-05-20 16:21:15 | ERROR | stderr | pypdf.errors.EmptyFileError: Cannot read an empty file

ZERO-A-ONE · May 20 '23 08:05

Wait until the local file has finished uploading, then click Load. You must have clicked too fast~

csunny · May 20 '23 08:05
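
The `EmptyFileError` in the traceback is what pypdf raises for a zero-byte stream, which fits the explanation that loading was triggered before the upload had finished. A minimal sketch of a guard, assuming the uploaded file lands at a known path on disk; `wait_until_uploaded` and its parameters are hypothetical and not part of DB-GPT:

```python
import os
import time


def wait_until_uploaded(path, settle_seconds=1.0, timeout=30.0, poll=0.2):
    """Return True once `path` is non-empty and its size has stopped changing.

    Hypothetical helper (not a DB-GPT API). Returns False if the file is
    still missing, empty, or growing when `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    last_size = None
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        # Treat a missing file the same as an empty one.
        size = os.path.getsize(path) if os.path.exists(path) else 0
        now = time.monotonic()
        if size != last_size:
            # Size changed (or first look): restart the stability window.
            last_size = size
            stable_since = now
        elif size > 0 and now - stable_since >= settle_seconds:
            return True
        time.sleep(poll)
    return False
```

Calling something like this before the embedding step, and showing a "still uploading" message when it returns `False`, would be one way to avoid handing pypdf an empty file.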

I checked and the model is already there, but I still get this message: 2023-05-20 16:41:51 | WARNING | sentence_transformers.SentenceTransformer | No sentence-transformers model found with name /root/DB-GPT/models/text2vec-large-chinese. Creating a new one with MEAN pooling.

ZERO-A-ONE · May 20 '23 08:05
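
Note that the line above is logged at WARNING level, not ERROR. In the sentence-transformers 2.x releases, this message appears to be emitted when the model directory has no `modules.json`. In that case the library wraps the plain Hugging Face checkpoint with a mean-pooling layer and carries on. Treat that as an assumption to verify against the installed version. A small sketch of the check, with a hypothetical helper name:

```python
import os


def looks_like_sentence_transformers_dir(model_dir):
    """Return True if `model_dir` contains a `modules.json` marker file.

    Assumption: sentence-transformers uses this file to recognise a saved
    SentenceTransformer model; without it, the library falls back to a
    plain transformer plus mean pooling and logs the warning quoted above.
    """
    return os.path.isfile(os.path.join(model_dir, "modules.json"))
```

If this returns `False` for `/root/DB-GPT/models/text2vec-large-chinese`, the warning is expected and does not by itself mean the model is missing.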

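> Wait until the local file has finished uploading, then click Load. You must have clicked too fast~

After clearing, sending a message raises an error: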

2023-05-20 17:36:56 | ERROR | stderr | Traceback (most recent call last):
2023-05-20 17:36:56 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/gradio/routes.py", line 394, in run_predict
2023-05-20 17:36:56 | ERROR | stderr |     output = await app.get_blocks().process_api(
2023-05-20 17:36:56 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/gradio/blocks.py", line 1075, in process_api
2023-05-20 17:36:56 | ERROR | stderr |     result = await self.call_function(
2023-05-20 17:36:56 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/gradio/blocks.py", line 884, in call_function
2023-05-20 17:36:56 | ERROR | stderr |     prediction = await anyio.to_thread.run_sync(
2023-05-20 17:36:56 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
2023-05-20 17:36:56 | ERROR | stderr |     return await get_asynclib().run_sync_in_worker_thread(
2023-05-20 17:36:56 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
2023-05-20 17:36:56 | ERROR | stderr |     return await future
2023-05-20 17:36:56 | ERROR | stderr |   File "/root/miniconda3/envs/dbgpt_env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
2023-05-20 17:36:56 | ERROR | stderr |     result = context.run(func, *args)
2023-05-20 17:36:56 | ERROR | stderr |   File "/root/DB-GPT/pilot/server/webserver.py", line 171, in add_text
2023-05-20 17:36:56 | ERROR | stderr |     state.append_message(state.roles[0], text)
2023-05-20 17:36:56 | ERROR | stderr | AttributeError: 'NoneType' object has no attribute 'append_message'

ZERO-A-ONE · May 20 '23 09:05
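
The `AttributeError` shows that `state` is `None` by the time `add_text` runs, presumably because the clear action resets it. One defensive pattern is to rebuild the state when it is missing. The sketch below only illustrates that idea: `Conversation` is a hypothetical stand-in for the real state object, and the actual `add_text` in `pilot/server/webserver.py` takes different arguments and may be fixed differently.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical stand-in for the chat state seen in the traceback."""

    roles: tuple = ("USER", "ASSISTANT")
    messages: list = field(default_factory=list)

    def append_message(self, role, message):
        self.messages.append([role, message])


def add_text(state, text):
    """Append the user's text, creating a fresh state if it was cleared."""
    if state is None:
        # A cleared session arrives as None; start a new conversation
        # instead of failing on state.append_message.
        state = Conversation()
    state.append_message(state.roles[0], text)
    return state
```

With this guard, `add_text(None, "hello")` returns a new conversation holding one user message rather than raising.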

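I first copied my files into the datasets/oceanbase directory, replacing the original files, and then started the webserver. I wanted to test whether things still work after swapping out the default knowledge base.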

xuji755 · May 21 '23 01:05