[Question]: Retrieval is very slow; time is mainly spent on generating answers and tuning the question.
Describe your problem
ragflow: v0.15.1-slim, chat model: qwen2.5:14b, embedding model: bge-m3
Retrieval is too slow. This is the time spent on each step.
I hope to optimize the response speed, but I don't know where to start. Can I adjust the Docker image startup parameters? If so, which components and parameters should be adjusted?
Same problem for me. It's so frustrating! The retrieval speed is making RAGFlow unusable.
Do you use a re-rank model here?
I'm encountering the same problem. Sometimes the same query is relatively fast, and sometimes it's very slow. Could it have anything to do with the APIs I'm using?
Currently:
chat API: siliconflow.cn (deepseek_v3)
embedding API: siliconflow.cn (BAAI/bge-m3)
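To see whether the remote endpoint is the variable part, you can time a raw embedding call outside of RAGFlow. A minimal Python sketch (the endpoint URL assumes siliconflow's OpenAI-compatible API; the key is a placeholder):

```python
import time
import requests

API_KEY = "sk-..."  # placeholder: your siliconflow.cn API key
URL = "https://api.siliconflow.cn/v1/embeddings"  # assumed OpenAI-compatible endpoint

headers = {"Authorization": f"Bearer {API_KEY}"}
payload = {"model": "BAAI/bge-m3", "input": "test query for latency measurement"}

# Repeat a few times: a large run-to-run spread points at the remote API,
# not at RAGFlow itself.
for i in range(5):
    start = time.perf_counter()
    resp = requests.post(URL, json=payload, headers=headers, timeout=30)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"run {i + 1}: HTTP {resp.status_code}, {elapsed_ms:.1f} ms")
```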
Do you use a re-rank model here?
Not using a re-rank model.
How many chunks do you estimate it has?
Same for me.
Total: 5999.8ms
Check LLM: 6.8ms
Create retriever: 2.2ms
Bind embedding: 68.4ms
Bind LLM: 70.3ms
Tune question: 2462.6ms
Bind reranker: 0.0ms
Generate keyword: 0.0ms
Retrieval: 1501.1ms
Generate answer: 1888.5ms
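A quick Python sketch to compute each step's share of the total (numbers copied from the breakdown above); Tune question alone is about 41% of the request:

```python
# Per-step timings from the breakdown above (milliseconds).
steps = {
    "Check LLM": 6.8,
    "Create retriever": 2.2,
    "Bind embedding": 68.4,
    "Bind LLM": 70.3,
    "Tune question": 2462.6,
    "Bind reranker": 0.0,
    "Generate keyword": 0.0,
    "Retrieval": 1501.1,
    "Generate answer": 1888.5,
}
total = sum(steps.values())
for name, ms in sorted(steps.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {ms:8.1f} ms  {ms / total:6.1%}")
```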
I have two PDF documents that aren't large. Is there a way to reduce the Tune question time?
Disable it. @chat19
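If you'd rather do this over the HTTP API than through the web UI, a sketch like the one below may work. Note that the endpoint path and the `refine_multiturn` field name are assumptions about RAGFlow's chat-assistant update API; verify them against the API reference for your version (the web UI toggle is the safe route):

```python
import requests

BASE_URL = "http://localhost:9380"   # your RAGFlow server
API_KEY = "ragflow-..."              # placeholder: API key from the RAGFlow web UI
CHAT_ID = "your-chat-assistant-id"   # placeholder: the assistant to update

# Turn off multi-turn question refinement ("Tune question") for a
# chat assistant. NOTE: the endpoint and the `refine_multiturn` field
# name are assumptions -- check the HTTP API reference for your
# RAGFlow version before relying on this.
resp = requests.put(
    f"{BASE_URL}/api/v1/chats/{CHAT_ID}",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"prompt": {"refine_multiturn": False}},
    timeout=30,
)
print(resp.status_code, resp.json())
```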
It works now. Thanks!