[Question]: Inconsistent AI Response Lengths between Ragflow and Direct ollama Interactions
Describe your problem
Hello Ragflow team,
I am encountering an issue where the AI responses I receive when using ollama through Ragflow are significantly shorter and less detailed compared to when I interact directly with ollama.
Could you please help me understand why there is such a discrepancy and how it might be resolved?
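One way to check whether a token cap (rather than the model itself) is what shortens the answers is to query Ollama directly with and without a generation limit. Below is a minimal sketch, assuming Ollama is listening on its default localhost:11434 port and a llama3 model is pulled (both assumptions, adjust to your setup):

```python
# Sketch: send the same prompt to Ollama twice, once with a num_predict
# cap and once without, and compare answer lengths. The model name and
# port are assumptions; adjust to your environment.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
PROMPT = "Explain retrieval-augmented generation in detail."

def ask(options=None):
    payload = {"model": "llama3", "prompt": PROMPT, "stream": False}
    if options:
        payload["options"] = options  # per-request generation options
    resp = requests.post(OLLAMA_URL, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()["response"]

capped = ask({"num_predict": 128})  # explicit cap on generated tokens
full = ask()                        # Ollama's default, no explicit cap
print(f"capped: {len(capped)} chars, uncapped: {len(full)} chars")
```

If the capped call reproduces the short answers seen through RAGFlow, the difference is almost certainly a max-token setting rather than the model.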
I'm encountering the same issue. Docker dev installed yesterday. Responses are cut off. I increased output tokens to 2048 but that doesn't make any difference.
Tried with several ollama models (llama3, mistral-openorca, llama3-chatqa) with same truncated results.
Furthermore, and related to this, RAGFlow says that the Knowledge Base does not provide the information, yet it identifies the correct file and even the correct section as the source for the response.
If it retrieves the correct chunks (you can verify this with the retrieval test) but the answer is still empty, it is probably the LLM: it lacks the capability to understand the relevance between the question and the chunks.
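A rough way to test that outside RAGFlow is to paste the retrieved chunks into a prompt yourself and ask the same model directly. A minimal sketch, with placeholder chunk and question text (assumptions, not RAGFlow output):

```python
# Sketch: feed the retrieved chunks to the same Ollama model yourself to
# see whether the model can actually use them. The chunk and question
# strings are placeholders; paste in what the retrieval test returned.
import requests

chunks = ["<paste retrieved chunk 1>", "<paste retrieved chunk 2>"]  # placeholders
question = "Your original question here"  # placeholder

prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n---\n".join(chunks) + "\n\nQuestion: " + question
)

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": prompt, "stream": False},
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```

If the model still cannot answer with the correct chunks in front of it, the limitation is in the model rather than in retrieval.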
What about disabling the max token toggle in the dialog settings?
I disabled the "max token toggle" as mentioned and it worked. Thanks!
Worked for me as well; I now get a full and proper response from the LLM to the same question. Thanks!
max token toggle
The image can't be seen now. Where can I set the 'max token toggle'?
Chat properties -> model settings -> toggle switch at bottom.
By the way, are you running ollama on macOS?
Got it, thank you.
It does work!
Yes, "Chat properties -> model settings -> toggle switch at bottom" helps! Thanks!
Yes, "Chat properties -> model settings -> toggle switch at bottom" helps! Thanks!
What about in the Agent? The same issue occurs there.
Same issue in the Agent: in Chat the response is long, but the Agent's response is limited.
When testing 0.17.2 slim, the Chat response is long, but the response from the agent embedded in a website is limited (truncated).
Note that the LLM is Claude 3.5 and the provider is AWS Bedrock.
I was reading pull request 845. There, Max Tokens is available for deepseek-chat, but I don't have this option for Bedrock.
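For what it's worth, in the Anthropic Messages request format used on Bedrock, max_tokens is part of every request, so some cap is always in effect; the question is what value the agent fills in. A minimal sketch of calling Claude 3.5 on Bedrock directly, assuming boto3 with Bedrock access and that the model below is enabled in your region (both assumptions):

```python
# Sketch: call Claude 3.5 on Bedrock directly to see the effect of max_tokens.
# The model ID and region are assumptions; use whatever is enabled in your account.
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 2048,  # hard cap on the generated answer; low values truncate it
    "messages": [
        {"role": "user", "content": "Explain retrieval-augmented generation in detail."}
    ],
}

resp = client.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed model ID
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body),
)
answer = json.loads(resp["body"].read())
print(answer["content"][0]["text"])
print("stop_reason:", answer.get("stop_reason"))  # "max_tokens" means the answer was cut off
```

If the embedded agent's answers end with stop_reason "max_tokens", the cap being sent with the request is what needs raising.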