[Question]: Cannot utilize GPU even with docker-compose-gpu.yml
Describe your problem
During the file-parsing and reranking phases of a chat, the GPU is not utilized (nvidia-smi shows 0% utilization). In other words, none of the models built into the Docker image appear to use the GPU, even with the GPU version of the compose file. How can this be resolved? Or does RAGFlow have strict requirements regarding the CUDA version?
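For reference, this is a sketch of the standard Docker Compose syntax for exposing GPUs to a service via `deploy.resources.reservations.devices` (the service name and image below are placeholders, not RAGFlow's actual configuration):

```yaml
services:
  ragflow:                          # placeholder service name
    image: example/ragflow:latest   # placeholder image
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all            # expose all GPUs; use an integer to limit
              capabilities: [gpu]
```

If the container starts but running nvidia-smi inside it (e.g. `docker exec <container-name> nvidia-smi`, where the container name depends on your setup) shows no GPU, the NVIDIA Container Toolkit may be missing or misconfigured on the host, independent of anything in the compose file.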
Is GPU memory utilization also 0? Are you using a BCE/BGE embedding model? Otherwise, the GPU will not be used.
I am using the BGE embedding model, and I deployed a Llama model locally and connected it to RAGFlow via Xinference. However, apart from Llama inference, no other task utilizes the GPU.