Chat assistant response slow
Describe your problem
I deployed RAGFlow (infiniflow/ragflow:v0.9.0) on AWS EKS. I have two nodes to run all the dependencies (Redis, MySQL, MinIO, Elasticsearch). Node details: RAM: 64 GB, CPU: 8, GPU: 0, Disk: 100 GB, Physical Processor: Intel Xeon Platinum 8175.
Ollama is hosted locally, with llama3:latest as the chat model and mxbai-embed-large as the embedding model. I have one node for it. Node details: RAM: 64 GB, CPU: 8, GPU: 0, Disk: 100 GB, Physical Processor: Intel Xeon 8375C (Ice Lake), 3.5 GHz.
Document parsing works well; it takes around 5 minutes for a large document. But the chat assistant is very slow: I only say "hi" and it takes 1 minute to search and respond. Do you have any idea why it is so slow, even though my compute resources are large?
It might be caused by searching in ES, whose performance depends heavily on RAM and the number of docs indexed. One PDF can generate thousands of docs in ES.
I have checked ES resource consumption; it uses only 8 GB of RAM and still has a lot of resources available. Do you have any ideas for optimizing the assistant's performance in my case? (One simple question takes 2 minutes, and complex questions get no response.)
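One way to check whether ES is really the bottleneck is to time a search directly and compare the `took` value Elasticsearch reports (milliseconds spent inside ES) with the wall-clock time seen by the client. The following is only a minimal sketch: the URL, the index name `my_ragflow_index`, and the field name `content` are placeholders and not RAGFlow's actual names, and it assumes the cluster is reachable without authentication.

```python
import json
import time
import urllib.request


def summarize_search(response: dict, wall_seconds: float) -> dict:
    """Compare Elasticsearch's own `took` (ms) with client-side wall time."""
    took_ms = response["took"]
    total = response["hits"]["total"]
    # ES 7+ returns {"value": n, "relation": ...}; older versions return an int.
    hits = total["value"] if isinstance(total, dict) else total
    wall_ms = wall_seconds * 1000.0
    return {
        "es_took_ms": took_ms,
        "wall_ms": round(wall_ms, 1),
        # Time spent outside ES itself: network, serialization, queueing.
        "overhead_ms": round(wall_ms - took_ms, 1),
        "total_hits": hits,
        "timed_out": response.get("timed_out", False),
    }


def timed_search(es_url: str, index: str, query: dict) -> dict:
    """POST a query to <es_url>/<index>/_search and time it from the client."""
    req = urllib.request.Request(
        f"{es_url}/{index}/_search",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=120) as resp:
        body = json.load(resp)
    return summarize_search(body, time.perf_counter() - start)


if __name__ == "__main__":
    # Hypothetical endpoint, index, and field; replace with your own.
    query = {"query": {"match": {"content": "hi"}}, "size": 5}
    print(timed_search("http://localhost:9200", "my_ragflow_index", query))
```

If `es_took_ms` is only a few hundred milliseconds while the chat still takes minutes, the delay is probably somewhere other than the ES search.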
Click the lamp and check the time elapsed down there.
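Independently of the UI, the time spent in the LLM itself can be measured by calling Ollama directly. As a rough sketch: a non-streaming request to Ollama's `/api/generate` endpoint returns timing fields in nanoseconds (`total_duration`, `load_duration`, `prompt_eval_duration`, `eval_duration`) together with `eval_count`, which is enough to see whether model loading or CPU-only token generation dominates. The base URL below is an assumption (Ollama's default port); adjust it to your deployment.

```python
import json
import urllib.request

NS_PER_S = 1_000_000_000  # Ollama reports durations in nanoseconds


def breakdown(stats: dict) -> dict:
    """Convert Ollama's timing fields to seconds and tokens per second."""
    eval_s = stats.get("eval_duration", 0) / NS_PER_S
    eval_count = stats.get("eval_count", 0)
    return {
        "total_s": stats.get("total_duration", 0) / NS_PER_S,
        "load_s": stats.get("load_duration", 0) / NS_PER_S,
        # May be missing when the prompt is served from cache, hence .get().
        "prompt_eval_s": stats.get("prompt_eval_duration", 0) / NS_PER_S,
        "generate_s": eval_s,
        "tokens_per_s": eval_count / eval_s if eval_s > 0 else 0.0,
    }


def time_ollama(base_url: str, model: str, prompt: str) -> dict:
    """Send one non-streaming generate request and return the breakdown."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        return breakdown(json.load(resp))


if __name__ == "__main__":
    # Assumed default Ollama address; change to match your node.
    print(time_ollama("http://localhost:11434", "llama3:latest", "hi"))
```

A large `load_s` suggests the model is being reloaded on each request, while a low `tokens_per_s` points at CPU-only generation as the main cost.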
There is no lamp? I am using RAGFlow version 0.15.1. What are the recommended minimum hardware specifications for a smooth chat experience? I know this is subjective, but I am using the minimum specs (4 CPU cores, 16M RAM, no GPU), and the memory runs out before the response can be generated.
Did you find any solution for this slowness?