Langchain-Chatchat
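Error when uploading a file to the vector store: TypeError: string indices must be integers, not 'str'
Problem Description
An error occurs when uploading a CSV file to the knowledge base.
Steps to Reproduce
- Run 'python startup -a'
- Click 'Knowledge Base Management' (知识库管理)
- Scroll to 'Upload File' (上传文件)
- The problem occurs / an error is reported
Expected Result
The file is successfully loaded into the FAISS vector store.
Actual Result
Error message: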
2024-05-07 09:45:16,936 - utils.py[line:95] - ERROR: ConnectError: error when post /knowledge_base/search_docs: [Errno 111] Connection refused
2024-05-07 09:45:16,936 - utils.py[line:95] - ERROR: ConnectError: error when post /knowledge_base/search_docs: [Errno 111] Connection refused
2024-05-07 09:45:16,937 - utils.py[line:95] - ERROR: ConnectError: error when post /knowledge_base/search_docs: [Errno 111] Connection refused
2024-05-07 09:45:16.937 Uncaught app exception
Traceback (most recent call last):
File "/root/miniconda3/envs/lc/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
exec(code, module.__dict__)
File "/mnt/Langchain-Chatchat/webui.py", line 64, in
Environment Information
- langchain-ChatGLM version/commit: v2.10
- Deployed with Docker (yes/no): no
- Model used (ChatGLM2-6B / Qwen-7B, etc.): ERNIE Bot 4.0 (文心一言4.0)
- Embedding model used (moka-ai/m3e-base, etc.): bge-large-zh
- Vector store type (faiss / milvus / pg_vector, etc.): faiss
- Operating system and version: Alibaba Cloud server
- Python version: 3.10
- Other relevant environment information:
Additional Information
Add any other information related to the issue.
The error you're encountering, TypeError: string indices must be integers, not 'str', suggests that the application is indexing a string with a string key, which is only valid for dictionaries (and other mappings). This typically happens when the code expects a dictionary, or a list of dictionaries (for example, one dictionary per CSV row keyed by column header), but receives a plain string instead.
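As a minimal sketch of how this error can arise, assume the calling code expects a list of dicts but is handed a single dict, such as an error payload from a failed request. The traceback in this report is truncated, so this is not confirmed as the actual cause here. The function names and the payload shape below are hypothetical:

```python
def page_contents(docs):
    """Collect 'page_content' from each item, as a caller expecting a list of dicts might."""
    return [doc["page_content"] for doc in docs]


def safe_page_contents(docs):
    """Same idea, but tolerate unexpected shapes instead of raising."""
    if not isinstance(docs, list):
        return []
    return [d["page_content"] for d in docs
            if isinstance(d, dict) and "page_content" in d]


# Expected shape: a list of dicts.
ok = page_contents([{"page_content": "Acme Ltd"}])

# Hypothetical failure shape: a single dict. Iterating a dict yields its keys,
# which are strings, so doc["page_content"] indexes a string with a string.
bad_payload = {"code": 500, "msg": "Connection refused"}
try:
    page_contents(bad_payload)
except TypeError as exc:
    error_message = str(exc)

# The guarded version returns an empty list for the same input.
guarded = safe_page_contents(bad_payload)
```

Checking the shape of a response before indexing into it turns a confusing TypeError into an empty result that the caller can report clearly.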
To resolve this, ensure your CSV file is correctly parsed into a list of dictionaries. You can use the pandas library to easily convert a CSV file into this format:
import pandas as pd
# Assuming 'csv_file' is the path to your CSV file
df = pd.read_csv(csv_file)
records = df.to_dict(orient='records')
# 'records' is now a list of dictionaries, each representing a row from the CSV.
After parsing the CSV file into the correct format, you can iterate over records and access each column value by its header name, which should align with the expected data structure in your application and prevent the TypeError.
If the issue persists after ensuring the CSV is correctly parsed and the data structure is as expected, it might be necessary to review the specific steps or code handling the upload process for further discrepancies.
Sources
- [BUG] 知识库上传csv没有进入向量库 (CSV uploaded to the knowledge base is not added to the vector store)
- TypeError: string indices must be integers
- webui_pages/knowledge_base/knowledge_base.py
- document_loaders/FilteredCSVloader.py
- server/chat/file_chat.py
- server/api.py
- requirements.txt
Does it work with the sample database?
I haven't tried. It looks like the API connection simply drops.
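The `[Errno 111] Connection refused` lines in the log mean the TCP connection to the API server was rejected, which usually indicates that nothing is listening at that address. A quick way to confirm this, sketched with only the standard library (the helper name `is_port_open` is hypothetical, and the host and port must be whatever your API server is configured to use):

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open("127.0.0.1", 7861)` checks a local server, where both values are placeholders rather than confirmed settings for this deployment. If it returns False, the web UI cannot reach the API process regardless of the file being uploaded.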
Try running it locally from the command line.
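I tried that. It doesn't work; it keeps reporting this error.
What format is your content in?
CSV.
Does the CSV have two columns, question and answer? It is usually two columns.
No, it is a single column with 50,000 rows, all company names.
Then this problem is to be expected. A single column can't be embedded; the CSV is supposed to contain Q&A pairs.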
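A small pre-upload check along these lines can show whether a CSV has the two-column Q&A layout described above. This is only a sketch using the standard library; whether the project strictly requires two columns is the commenter's claim and is not verified here, and `csv_column_count` and the sample data are hypothetical:

```python
import csv
import io


def csv_column_count(text: str) -> int:
    """Return the number of columns in the first non-empty row of CSV text."""
    for row in csv.reader(io.StringIO(text)):
        if row:
            return len(row)
    return 0


# Hypothetical samples: a single-column file like the one in this report,
# and a two-column question/answer file.
single_column = "company\nAcme Ltd\nGlobex\n"
qa_pairs = "question,answer\nWhat is FAISS?,A vector search library\n"

single_count = csv_column_count(single_column)
qa_count = csv_column_count(qa_pairs)
```

To check a file on disk, read it with `open(path, newline="", encoding="utf-8")` and pass the text to the same function.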
But I was able to upload it before, and this time a few dozen rows went in as well. Now nothing uploads at all.
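An open-source project that can't even get through a basic run, sigh. These are errors you would spot immediately; I really don't get it. I ran into this problem too. A 128k model that endlessly asks and answers its own questions had to be dropped, and LangChain throws this error when loading a personal knowledge base. It feels completely unusable.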
2024-05-27 16:32:22,181 - utils.py[line:95] - ERROR: ReadTimeout: error when post /knowledge_base/create_knowledge_base: timed out
2024-05-27 16:36:43,277 - utils.py[line:95] - ERROR: ReadTimeout: error when post /knowledge_base/search_docs: timed out
2024-05-27 16:36:43.277 Uncaught app exception
Traceback (most recent call last):
File "/root/langchain_pip/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
exec(code, module.__dict__)
File "/sdb/chatgpt/Langchain-Chatchat/webui.py", line 64, in
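Solved. The knowledge base score threshold has to stay at 1.0. That is the default, and it must not be changed!
SCORE_THRESHOLD = 1.0 must not be modified, otherwise this error occurs!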
Which file is that setting in?
In kb_config.py under configs.
2024-07-18: Pulling the main branch still shows this problem; after pulling the dev branch, everything works.