Different LLMs produce different errors
With deepseek-r1-32b, the error is "dimension mismatch". With deepseek-r1-7b, the error is "SyntaxError: invalid character ',' (U+FF0C)".
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query
config = Configuration()
# Customize your config here;
# for more options, see the Configuration Details section below.
config.set_provider_config("llm", "OpenAI", {"model": "deepseek-r1-32b", "base_url": "http://192.168.23.10/v1"})
config.set_provider_config("embedding", "OpenAIEmbedding", {"model": "bge-m3", "base_url": "http://192.168.23.10/v1","dimension": 1024})
init_config(config = config)
# Load your local data
from deepsearcher.offline_loading import load_from_local_files
load_from_local_files("/tmp/cl.txt")
# Query
result = query("出差住亲戚家可以报销多少") # Your question here
root@df017718c0f5:/deep-searcher# python mytest.py
Loading files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 17.54it/s]
Embedding chunks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.93s/it]
<think> Select agent [ChainOfRAG] to answer the query [出差住亲戚家可以报销多少] </think>
>> Iteration: 1
<think> Perform search [<think>
好吧,用户的问题是关于出差住在亲戚家可以报销多少。首先,我得弄清楚中国的出差住宿报销政策是什么样的。一般来说,报销是根据公司的政策和当地的住宿标准来定的,但住亲戚家可能不算在内。所以,我需要知道用户的具体情况,比如他们所在的城市,这样才能查找当地的住宿标准。
然后,用户可能想知道住亲戚家是否有报销的可能性,或者是否有其他报销方式。比如,如果住在亲戚家,是否可以申请其他形式的补贴,或者是否有例外情况可以报销。所以,问清楚是否住在亲戚家附近,或者是否有其他需求,可以帮助更好地回答。
最后,用户可能希望得到明确的数字,比如具体的报销金额。所以,了解这些细节有助于给出准确的答案。
</think>
出差住宿的具体城市是哪里?] on the vector DB collections: ['deepsearcher'] </think>
<search> Search [<think>
好吧,用户的问题是关于出差住在亲戚家可以报销多少。首先,我得弄清楚中国的出差住宿报销政策是什么样的。一般来说,报销是根据公司的政策和当地的住宿标准来定的,但住亲戚家可能不算在内。所以,我需要知道用户的具体情况,比如他们所在的城市,这样才能查找当地的住宿标准。
然后,用户可能想知道住亲戚家是否有报销的可能性,或者是否有其他报销方式。比如,如果住在亲戚家,是否可以申请其他形式的补贴,或者是否有例外情况可以报销。所以,问清楚是否住在亲戚家附近,或者是否有其他需求,可以帮助更好地回答。
最后,用户可能希望得到明确的数字,比如具体的报销金额。所以,了解这些细节有助于给出准确的答案。
</think>
出差住宿的具体城市是哪里?] in [deepsearcher]... </search>
2025-03-06 07:03:51,698 [ERROR][handler]: RPC error: [search], <MilvusException: (code=2000, message=vector dimension mismatch, expected vector size(byte) 6144, actual 1024.: segcore error)>, <Time:{'RPC start': '2025-03-06 07:03:51.496238', 'RPC error': '2025-03-06 07:03:51.698903'}> (decorators.py:140)
2025-03-06 07:03:51,699 [ERROR][search]: Failed to search collection: deepsearcher (milvus_client.py:414)
2025-03-06 07:03:51,699 - CRITICAL - fail to search data, error info: <MilvusException: (code=2000, message=vector dimension mismatch, expected vector size(byte) 6144, actual 1024.: segcore error)>
Traceback (most recent call last):
File "/deep-searcher/deepsearcher/vector_db/milvus.py", line 113, in search_data
search_results = self.client.search(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/milvus_client/milvus_client.py", line 415, in search
raise ex from ex
File "/usr/local/lib/python3.11/site-packages/pymilvus/milvus_client/milvus_client.py", line 400, in search
res = conn.search(
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 141, in handler
raise e from e
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 137, in handler
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 176, in handler
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 116, in handler
raise e from e
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 86, in handler
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 836, in search
return self._execute_search(request, timeout, round_decimal=round_decimal, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 777, in _execute_search
raise e from e
File "/usr/local/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 766, in _execute_search
check_status(response.status)
File "/usr/local/lib/python3.11/site-packages/pymilvus/client/utils.py", line 64, in check_status
raise MilvusException(status.code, status.reason, status.error_code)
pymilvus.exceptions.MilvusException: <MilvusException: (code=2000, message=vector dimension mismatch, expected vector size(byte) 6144, actual 1024.: segcore error)>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/deep-searcher/mytest.py", line 22, in <module>
result = query("出差住亲戚家可以报销多少") # Your question here
^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/online_query.py", line 10, in query
return default_searcher.query(original_query, max_iter=max_iter)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/rag_router.py", line 69, in query
answer, retrieved_results, n_token_retrieval = agent.query(query, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 196, in query
all_retrieved_results, n_token_retrieval, additional_info = self.retrieve(query, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 178, in retrieve
intermediate_answer, retrieved_results, n_token1 = self._retrieve_and_answer(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 124, in _retrieve_and_answer
retrieved_results = self.vector_db.search_data(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/vector_db/milvus.py", line 133, in search_data
log.critical(f"fail to search data, error info: {e}")
File "/deep-searcher/deepsearcher/tools/log.py", line 89, in critical
raise RuntimeError(message)
RuntimeError: fail to search data, error info: <MilvusException: (code=2000, message=vector dimension mismatch, expected vector size(byte) 6144, actual 1024.: segcore error)>
With deepseek-r1-7b, the error is:
root@df017718c0f5:/deep-searcher# python mytest.py
Loading files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 21.01it/s]
Embedding chunks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:02<00:00, 2.05s/it]
<think> Select agent [ChainOfRAG] to answer the query [出差住亲戚家可以报销多少] </think>
>> Iteration: 1
Traceback (most recent call last):
File "/deep-searcher/deepsearcher/llm/base.py", line 44, in literal_eval
result = ast.literal_eval(response_content.strip())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/ast.py", line 64, in literal_eval
node_or_string = parse(node_or_string.lstrip(" \t"), mode='eval')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/ast.py", line 50, in parse
return compile(source, filename, mode, flags,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<unknown>", line 2
好,我现在要帮助用户解决关于出差住亲戚家可以报销多少的问题。首先,我需要明确用户的主要需求是什么。用户想知道在住亲戚家的情况下,出差可以报销的金额是多少。这可能涉及到公司或单位的报销政策,或者是个人财务方面的安排。
^
SyntaxError: invalid character ',' (U+FF0C)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/deep-searcher/mytest.py", line 22, in <module>
result = query("出差住亲戚家可以报销多少") # Your question here
^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/online_query.py", line 10, in query
return default_searcher.query(original_query, max_iter=max_iter)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/rag_router.py", line 69, in query
answer, retrieved_results, n_token_retrieval = agent.query(query, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 196, in query
all_retrieved_results, n_token_retrieval, additional_info = self.retrieve(query, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 178, in retrieve
intermediate_answer, retrieved_results, n_token1 = self._retrieve_and_answer(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 115, in _retrieve_and_answer
selected_collections, n_token_route = self.collection_router.invoke(query=query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/collection_router.py", line 42, in invoke
selected_collections = self.llm.literal_eval(chat_response.content)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/llm/base.py", line 49, in literal_eval
raise ValueError(
ValueError: Invalid JSON/List format for response content:
<think>
好,我现在要帮助用户解决关于出差住亲戚家可以报销多少的问题。首先,我需要明确用户的主要需求是什么。用户想知道在住亲戚家的情况下,出差可以报销的金额是多少。这可能涉及到公司或单位的报销政策,或者是个人财务方面的安排。
接下来,我应该考虑用户可能遇到的具体情况。例如,用户可能想知道报销的具体标准,是按天计算,还是根据一定的费用比例来计算。还可能涉及到是否有额外的开销需要报销,比如交通费、住宿费等。
为了更准确地回答这个问题,我需要了解哪些具体的信息。这可能包括用户的工作性质,因为不同类型的公司报销标准可能不同。此外,用户住亲戚家的时间有多长,以及他们在这期间的具体消费情况,比如是否有餐饮费、住宿费的详细记录。
因此,我应该提出一个简单明了的询问,直接询问用户出差住亲戚家的总天数。这样可以帮助我更好地计算报销金额,并根据实际情况提供更精确的答案。如果用户能提供天数,我就可以参考相关的报销标准,如每天多少元,或者根据实际发生的费用来计算。
总结一下,我需要设计一个简单的问题,询问用户出差住亲戚家的总天数,以便进一步提供准确的报销信息。
The first error is the same as https://github.com/zilliztech/deep-searcher/issues/104. The second error occurs because the model is too small to answer the question well. It is similar to asking for a number and getting a letter back.
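For the second error, the traceback shows that the text passed to ast.literal_eval starts with a <think> block, and the full-width comma inside that block is what triggers the SyntaxError. A minimal sketch of one possible workaround is to strip the reasoning block before parsing. The helper name parse_list_response is hypothetical and is not part of deepsearcher; it also assumes the model closes the block with </think> and then emits a valid Python list.

```python
import ast
import re

# Hypothetical helper, not part of deepsearcher: remove reasoning blocks
# before parsing the remainder as a Python literal.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def parse_list_response(content: str):
    """Strip <think>...</think> blocks, then parse what is left."""
    cleaned = _THINK_BLOCK.sub("", content).strip()
    return ast.literal_eval(cleaned)


# Made-up response: the full-width comma sits inside the think block only.
raw = "<think>\n先想一想,应该选哪个集合。\n</think>\n['deepsearcher']"
print(parse_list_response(raw))
```

If the model never closes the block or never emits a list, parsing still fails, so a larger model may be needed regardless.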
Even when loading with
load_from_local_files(
paths_or_directory="/tmp/cl.txt",
collection_name="chailv",
collection_description="chailv",
force_new_collection=True, # Set to True to drop the original collection and create a new one on every run
)
it still throws errors:
root@df017718c0f5:/deep-searcher# python mytest.py
create collection [chailv] successfully
Loading files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 18.40it/s]
Embedding chunks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:02<00:00, 2.03s/it]
<think> Select agent [ChainOfRAG] to answer the query [出差住亲戚家可以报销多少] </think>
>> Iteration: 1
<think> Perform search [<think>
好的,我需要帮助用户回答关于出差住在亲戚家可以报销多少的问题。根据之前的回答,已经知道出差住宿费通常按照实际发生的费用报销,但不超过规定的标准。为了更详细地回答这个问题,我需要了解用户的具体情况。
首先,用户出差的地区可能影响报销的标准,因为不同地区的住宿费用标准可能不同。例如,一线城市和二线城市的报销标准可能不一样,所以询问出差的具体地点是有必要的。
其次,用户可能有不同的职务或职位,这也会影响报销的标准。通常,不同级别的员工有不同的报销额度,了解用户的职位可以帮助确定适用的具体标准。
另外,了解用户的行程天数也很重要,因为报销金额通常与天数有关,知道了天数,可以估算总的报销金额。
最后,询问是否有其他费用需要报销,比如交通费、餐费等,可以帮助用户提供更全面的信息,确保所有可报销的费用都被考虑在内。
综上所述,我需要进一步询问以下信息:
1. 出差的具体地点在哪里?
2. 您的职位是什么?
3. 出差的天数是多少天?
4. 是否有其他费用需要报销?
这样,我就能根据这些信息提供更准确的报销金额建议。
</think>
出差的具体地点在哪里?] on the vector DB collections: ['chailv', 'deepsearcher'] </think>
<search> Search [<think>
好的,我需要帮助用户回答关于出差住在亲戚家可以报销多少的问题。根据之前的回答,已经知道出差住宿费通常按照实际发生的费用报销,但不超过规定的标准。为了更详细地回答这个问题,我需要了解用户的具体情况。
首先,用户出差的地区可能影响报销的标准,因为不同地区的住宿费用标准可能不同。例如,一线城市和二线城市的报销标准可能不一样,所以询问出差的具体地点是有必要的。
其次,用户可能有不同的职务或职位,这也会影响报销的标准。通常,不同级别的员工有不同的报销额度,了解用户的职位可以帮助确定适用的具体标准。
另外,了解用户的行程天数也很重要,因为报销金额通常与天数有关,知道了天数,可以估算总的报销金额。
最后,询问是否有其他费用需要报销,比如交通费、餐费等,可以帮助用户提供更全面的信息,确保所有可报销的费用都被考虑在内。
综上所述,我需要进一步询问以下信息:
1. 出差的具体地点在哪里?
2. 您的职位是什么?
3. 出差的天数是多少天?
4. 是否有其他费用需要报销?
这样,我就能根据这些信息提供更准确的报销金额建议。
</think>
出差的具体地点在哪里?] in [chailv]... </search>
2025-03-06 07:55:52,264 [ERROR][handler]: RPC error: [search], <MilvusException: (code=2000, message=vector dimension mismatch, expected vector size(byte) 4096, actual 1024.: segcore error)>, <Time:{'RPC start': '2025-03-06 07:55:52.044667', 'RPC error': '2025-03-06 07:55:52.263984'}> (decorators.py:140)
2025-03-06 07:55:52,264 [ERROR][search]: Failed to search collection: chailv (milvus_client.py:414)
2025-03-06 07:55:52,264 - CRITICAL - fail to search data, error info: <MilvusException: (code=2000, message=vector dimension mismatch, expected vector size(byte) 4096, actual 1024.: segcore error)>
Traceback (most recent call last):
File "/deep-searcher/deepsearcher/vector_db/milvus.py", line 113, in search_data
search_results = self.client.search(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/milvus_client/milvus_client.py", line 415, in search
raise ex from ex
File "/usr/local/lib/python3.11/site-packages/pymilvus/milvus_client/milvus_client.py", line 400, in search
res = conn.search(
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 141, in handler
raise e from e
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 137, in handler
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 176, in handler
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 116, in handler
raise e from e
File "/usr/local/lib/python3.11/site-packages/pymilvus/decorators.py", line 86, in handler
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 836, in search
return self._execute_search(request, timeout, round_decimal=round_decimal, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 777, in _execute_search
raise e from e
File "/usr/local/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py", line 766, in _execute_search
check_status(response.status)
File "/usr/local/lib/python3.11/site-packages/pymilvus/client/utils.py", line 64, in check_status
raise MilvusException(status.code, status.reason, status.error_code)
pymilvus.exceptions.MilvusException: <MilvusException: (code=2000, message=vector dimension mismatch, expected vector size(byte) 4096, actual 1024.: segcore error)>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/deep-searcher/mytest.py", line 26, in <module>
result = query("出差住亲戚家可以报销多少") # Your question here
^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/online_query.py", line 10, in query
return default_searcher.query(original_query, max_iter=max_iter)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/rag_router.py", line 69, in query
answer, retrieved_results, n_token_retrieval = agent.query(query, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 196, in query
all_retrieved_results, n_token_retrieval, additional_info = self.retrieve(query, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 178, in retrieve
intermediate_answer, retrieved_results, n_token1 = self._retrieve_and_answer(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 124, in _retrieve_and_answer
retrieved_results = self.vector_db.search_data(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/deep-searcher/deepsearcher/vector_db/milvus.py", line 133, in search_data
log.critical(f"fail to search data, error info: {e}")
File "/deep-searcher/deepsearcher/tools/log.py", line 89, in critical
raise RuntimeError(message)
RuntimeError: fail to search data, error info: <MilvusException: (code=2000, message=vector dimension mismatch, expected vector size(byte) 4096, actual 1024.: segcore error)>
You can try clearing all collections in Milvus. The main cause of this error is that Milvus contains other collections whose dimensions differ.
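A rough sketch of clearing all collections, assuming a pymilvus MilvusClient (or any object exposing list_collections and drop_collection). The function name drop_all_collections is hypothetical, and the URI in the commented usage is an assumption that must match your own Milvus setup. This deletes data, so only run it on a test instance.

```python
def drop_all_collections(client) -> list:
    """Drop every collection visible to `client` and return the dropped names."""
    dropped = []
    for name in client.list_collections():
        client.drop_collection(collection_name=name)
        dropped.append(name)
    return dropped


# Usage (assumed URI; adjust to your deployment):
# from pymilvus import MilvusClient
# client = MilvusClient(uri="http://localhost:19530")
# print(drop_all_collections(client))
```

After clearing, reload the files so the collection is recreated with the dimension of the current embedding model.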
config.set_provider_config("embedding", "OpenAIEmbedding", {"model": "bge-m3", "base_url": "http://192.168.23.10/v1","dimension": 1024})
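It is best to double-check which dimensions your embedding model supports. I ran into the same problem and eventually found that if the configured dimension exceeds what the model supports, this error appears and the dimension of the returned embedding is doubled. Setting mine to 768 fixed it.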
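One way to check this is to compare the length of a vector the embedding service actually returns with the configured dimension. The sketch below uses hypothetical helper names and assumes the collections store float32 vectors (4 bytes per value). Under that assumption, the "expected vector size(byte)" values in the logs above, 6144 and 4096, would correspond to 1536 and 1024 dimensions.

```python
FLOAT32_BYTES = 4  # assumption: vectors are stored as float32


def bytes_to_dim(byte_size: int) -> int:
    """Convert a vector size in bytes to a float32 dimension count."""
    if byte_size % FLOAT32_BYTES:
        raise ValueError(f"{byte_size} is not a multiple of {FLOAT32_BYTES}")
    return byte_size // FLOAT32_BYTES


def check_dimension(vector, configured_dim: int) -> int:
    """Raise if the returned embedding length differs from the config."""
    actual = len(vector)
    if actual != configured_dim:
        raise ValueError(
            f"embedding has {actual} dimensions, config says {configured_dim}"
        )
    return actual


print(bytes_to_dim(6144))  # 1536
print(bytes_to_dim(4096))  # 1024
```

Passing one real embedding from your endpoint to check_dimension before loading any files should surface a mismatch early.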