Performance differences between Ollama and Ollm when running bge-m3

Open wyanghu opened this issue 10 months ago • 1 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [x] [FOR CHINESE USERS] Please be sure to submit your issue in English, otherwise it will be closed. Thank you! :)
  • [x] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

I have configured both Ollama and GPUStack in Dify to run the bge-m3 model. The bge-m3 model on GPUStack was also downloaded from Ollama and is served with vLLM. However, when I call bge-m3 through GPUStack, the GPU is underutilized, with utilization below 20%, whereas calling bge-m3 through Ollama pushes GPU utilization above 80%.

I have confirmed that I am calling the same model in both cases.
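
To make the comparison concrete, the same batch of texts could be timed against both backends while GPU utilization is being watched, since a utilization percentage alone does not show which backend embeds faster. The following is only a minimal sketch under stated assumptions: it assumes both servers accept a JSON body of the form `{"model": ..., "input": [...]}` (as Ollama's `/api/embed` and OpenAI-compatible `/v1/embeddings` endpoints do), and the helper names `build_request`, `texts_per_second`, and `time_embedding_call` are hypothetical, not part of Dify, Ollama, GPUStack, or vLLM.

```python
import json
import time
import urllib.request


def build_request(url, model, texts):
    """Build a POST request carrying {"model": ..., "input": [...]} as JSON."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def texts_per_second(n_texts, elapsed_seconds):
    """Convert a batch size and a wall-clock duration into throughput."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return n_texts / elapsed_seconds


def time_embedding_call(url, model, texts, api_key=None):
    """Send one embedding request and return the observed texts per second.

    The URL and the optional bearer token are assumptions about the
    deployment; adjust them to match the actual servers.
    """
    request = build_request(url, model, texts)
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=120) as response:
        response.read()  # drain the body so the full response is timed
    return texts_per_second(len(texts), time.perf_counter() - start)
```

For example, `time_embedding_call("http://localhost:11434/api/embed", "bge-m3", texts)` would time a default local Ollama instance; the GPUStack URL and API key depend on the deployment and are not given in this report. Running both calls with an identical `texts` list, several times each, gives a throughput figure to read alongside the utilization numbers.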

Has anyone encountered this issue before? Could you share your thoughts and solutions with me? I would really appreciate it.

2. Additional context or comments

No response

3. Can you help us with this feature?

  • [ ] I am interested in contributing to this feature.

wyanghu, Mar 02 '25 13:03

@wyanghu Hello there! 👋 I'm here to help you with any bugs, questions, or even becoming a contributor. Let's squash those bugs together!

I found a similar unsolved discussion that might be relevant to your issue: Why is there a significant difference between the conversation using Dify + Ollama and the conversation directly through Ollama, even though the same model is used?. This discussion was last updated on February 21, 2025, and it might provide some insights or lead to a solution for the performance differences you're experiencing with the bge-m3 model.


dosubot[bot], Mar 02 '25 13:03