[Bug]: Chat and query are too slow
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Branch name
main
Commit ID
none
Other environment information
win11
Actual behavior
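Once many files have been uploaded, chat and queries become extremely slow, taking several minutes, which is unbearable and makes normal use impossible.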
Expected behavior
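Once many files have been uploaded, chat and queries become extremely slow, taking several minutes, which is unbearable and makes normal use impossible.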
Steps to reproduce
Chat and queries are very slow, taking several minutes, which is unbearable.
Additional information
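Once many files have been uploaded, chat and queries become extremely slow, taking several minutes, which is unbearable and makes normal use impossible.
Why do chat and query use only a single CPU thread instead of GPU inference? This is a serious problem; the system is simply unusable.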
If there are many documents, please add ES nodes to form a cluster.
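This problem is real. After adding logging, I can also see that ES retrieval is very slow.
On top of that, with rank processing, it takes seven or eight minutes to get a result.
About 1 million records have been uploaded to the knowledge base. Where exactly is the problem?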
3546 results need to be re-ranked, so it will not be fast.
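Since re-ranking cost grows with the number of candidates, one common mitigation is to keep only the highest-scoring retrieval hits before handing them to the re-ranker. The sketch below is illustrative only and is not taken from this project's code: the function name `select_rerank_candidates`, the `score` key, and the hit layout are all assumptions.

```python
import heapq
from typing import Dict, List


def select_rerank_candidates(hits: List[Dict], max_candidates: int) -> List[Dict]:
    """Keep only the max_candidates hits with the highest retrieval score.

    `hits` is a hypothetical list of dicts that each carry a numeric "score";
    the real hit structure in any given system may differ.
    """
    if max_candidates <= 0:
        return []
    # nlargest returns the selected hits sorted from highest to lowest score.
    return heapq.nlargest(max_candidates, hits, key=lambda h: h["score"])


if __name__ == "__main__":
    sample = [
        {"id": "a", "score": 0.2},
        {"id": "b", "score": 0.9},
        {"id": "c", "score": 0.5},
    ]
    # Only the two best hits would be passed on to the (slow) re-ranker.
    print([h["id"] for h in select_rerank_candidates(sample, 2)])
```

With a cap like this, the re-ranker would see a fixed number of candidates regardless of how many chunks the first-stage retrieval matched, at the cost of possibly dropping a relevant low-scored hit.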
The time for ES retrieval itself is quite long. May I ask how to control the number of output results for ES retrieval?
Attached is a sample of the parameters for a retrieval request
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": [ "title_tks^10", "title_sm_tks^5", "important_kwd^30", "important_tks^20", "content_ltks^2", "content_sm_ltks" ],
            "type": "best_fields",
            "query": "((案情)^0.25144159157178586 (案例)^0.24685589599849458",
            "boost": 1,
            "minimum_should_match": "30%"
          }
        }
      ],
      "filter": [
        { "terms": { "kb_id": [ "6b70bacc8a0311efb5c80242ac180006" ] } },
        { "bool": { "must_not": [ { "range": { "available_int": { "lt": 1 } } } ] } }
      ],
      "boost": 0.05
    }
  },
  "from": 0,
  "size": 90,
  "knn": {
    "field": "q_768_vec",
    "k": 3,
    "similarity": 0.1,
    "num_candidates": 6,
    "query_vector": [ 0.028860345482826234, ... -0.01591445878148079 ],
    "filter": {
      "bool": {
        "must": [
          {
            "query_string": {
              "fields": [ "title_tks^10", "title_sm_tks^5", "important_kwd^30", "important_tks^20", "content_ltks^2", "content_sm_ltks" ],
              "type": "best_fields",
              "query": "((案情)^0.25144159157178586 (案例)^0.24685589599849458",
              "boost": 1,
              "minimum_should_match": "30%"
            }
          }
        ],
        "filter": [
          { "terms": { "kb_id": [ "6b70bacc8a0311efb5c80242ac180006" ] } },
          { "bool": { "must_not": [ { "range": { "available_int": { "lt": 1 } } } ] } }
        ],
        "boost": 0.05
      }
    }
  }
}
Refer to this about pagination of ES. The retrieval speed of ES depends on factors such as the number of indexed docs, RAM size, and disk speed.
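In a search request body like the sample above, the number of returned hits is governed by `from` and `size`, and the kNN part by `k` and `num_candidates`. The following is a minimal sketch of adjusting those fields on a copy of a request body. The helper name `limit_results` and its parameters are hypothetical, and how the body is actually built and sent in this project is not shown here.

```python
import copy
from typing import Any, Dict


def limit_results(body: Dict[str, Any], page: int, page_size: int,
                  knn_k: int, knn_candidates: int) -> Dict[str, Any]:
    """Return a copy of an ES search body with pagination and kNN limits set.

    Hypothetical helper: `page` is 1-based. By default ES rejects requests
    where from + size exceeds 10000 (the index.max_result_window setting).
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    if knn_candidates < knn_k:
        raise ValueError("num_candidates must be >= k")

    limited = copy.deepcopy(body)  # leave the caller's body untouched
    limited["from"] = (page - 1) * page_size
    limited["size"] = page_size
    if "knn" in limited:
        limited["knn"]["k"] = knn_k
        limited["knn"]["num_candidates"] = knn_candidates
    return limited


if __name__ == "__main__":
    request = {
        "query": {"match_all": {}},
        "from": 0,
        "size": 90,
        "knn": {"field": "q_768_vec", "k": 3, "num_candidates": 6},
    }
    second_page = limit_results(request, page=2, page_size=10,
                                knn_k=5, knn_candidates=50)
    print(second_page["from"], second_page["size"], second_page["knn"]["k"])
```

A smaller `size` reduces how many hits are fetched and later re-ranked. It does not by itself make the underlying query cheaper, so the hardware factors mentioned above still apply.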
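For the extremely slow parsing problem, I adjusted some ES parameters and got a speedup of tens of times; batch parsing is not affected. The specific parameters I changed are below (give them a try and adjust to your own situation):
- bootstrap.memory_lock=true # changed from false to true; locks memory to avoid spilling to disk
- "ES_JAVA_OPTS=-Xms3556m -Xmx3556m"
- indices.memory.index_buffer_size=35%
- thread_pool.search.size=16 # adjust to your CPU
- thread_pool.search.queue_size=1000
- action.destructive_requires_name=false # allows deleting indices without giving the index name (use only in test environments; not recommended for production)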
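As a rough starting point for "adjust to your own situation", the values can be derived from the machine's resources. The sketch below assumes two commonly cited Elasticsearch guidelines: the default search thread pool size of int((allocated processors * 3) / 2) + 1, and a heap of at most half of RAM, kept under roughly 31 GB. The function names are hypothetical, and the right numbers for a given deployment should be found by measurement.

```python
import os


def suggest_search_pool_size(cpus: int) -> int:
    """Mirror the ES default formula for the search pool: int(cpus * 3 / 2) + 1."""
    if cpus < 1:
        raise ValueError("cpus must be >= 1")
    return (cpus * 3) // 2 + 1


def suggest_heap_mb(total_ram_mb: int) -> int:
    """Half of RAM, capped at 31 GB (an assumed, conservative guideline)."""
    if total_ram_mb < 2:
        raise ValueError("total_ram_mb is too small")
    return min(total_ram_mb // 2, 31 * 1024)


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    print(f"thread_pool.search.size={suggest_search_pool_size(cpus)}")
    # 7112 MB is an illustrative RAM figure, not a value from the thread.
    heap = suggest_heap_mb(7112)
    print(f'"ES_JAVA_OPTS=-Xms{heap}m -Xmx{heap}m"')
```

Setting `-Xms` and `-Xmx` to the same value, as in the list above, avoids heap resizing at runtime.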