Ollama-type model: the conversation response goes missing or gets stuck, and the complete content is never returned.
Self Checks
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to file this report (I have read and agree to the Language Policy).
Dify version
0.5.3
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
Chat with an Ollama-type model: the conversation response goes missing or gets stuck, and the complete content is never returned.
✔️ Expected Behavior
No response
❌ Actual Behavior
No response
🤖
Hey there, @XJALYN! Great to see you back. Hope you're doing well!
I'm sorry for the inconvenience you're experiencing with the ollama-type model. The issue you're facing could have several causes. Here are the most likely ones, along with suggested solutions:
- **API Request Failure**: If the status code of the response is not 200, an `InvokeError` is raised. This could be due to issues with the endpoint URL, the headers, or the payload of the request. Check the status code and the response text for more details.
- **JSON Decoding Error**: If the response from the server cannot be decoded into JSON, a `json.JSONDecodeError` is raised. This could be due to the server returning a non-JSON response. Check the server logs for more details.
- **Stream Ended Prematurely**: If the stream ends before the `done` flag is set to `True`, the generator will yield a final `LLMResultChunk` with an empty message and a finish reason of "Non-JSON encountered." This could be due to network issues or issues with the server (see the sketch after this list).
- **Model Parameters**: Incorrect or inappropriate model parameters could also cause the model to get stuck. Check the values of parameters like `temperature`, `top_p`, `top_k`, `repeat_penalty`, `num_predict`, `mirostat`, etc.
- **Model Credentials**: If the model credentials are not valid, a `CredentialsValidateFailedError` is raised. Check the credentials for the model.
- **Model Mode**: If the model mode is not correctly set to either `LLMMode.CHAT` or `LLMMode.GENERATE`, it could cause issues. Check the mode of the model.
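To narrow down which of these is happening, it can help to take Dify out of the loop and stream a prompt straight from the Ollama `/api/generate` endpoint. Below is a minimal sketch of that check, assuming Ollama is reachable at `http://localhost:11434` and that you swap in a model name you have actually pulled; it surfaces the same three failure modes (non-200 status, non-JSON chunks, and a stream that ends before `done` is `True`).

```python
# Minimal sketch (not Dify code): stream a prompt directly from Ollama to see
# whether truncation happens on the Ollama side or inside Dify.
# Host, port, model name, and prompt below are placeholders to adjust.
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # adjust host/port if needed

resp = requests.post(
    OLLAMA_URL,
    json={"model": "llama2", "prompt": "Write three sentences about the sea.", "stream": True},
    stream=True,
    timeout=120,
)
resp.raise_for_status()  # a non-200 here corresponds to the InvokeError case above

finished = False
for line in resp.iter_lines():
    if not line:
        continue
    try:
        chunk = json.loads(line)
    except json.JSONDecodeError:
        # mirrors the json.JSONDecodeError case above
        print("Non-JSON chunk received:", line[:200])
        break
    print(chunk.get("response", ""), end="", flush=True)
    if chunk.get("done"):
        finished = True
        print("\n-- stream finished normally (done=True) --")
        break

if not finished:
    print("\n-- stream ended before done=True (premature end) --")
```

If this direct call returns the full content, the truncation is more likely happening inside Dify; if it shows the same behavior, look first at the Ollama server and its model parameters.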
To debug the issue, you can add logging statements in the `_invoke`, `_generate`, `_handle_generate_response`, and `_handle_generate_stream_response` methods to log the request and response details. This will help you identify where exactly the issue is occurring. You can find these methods in the ollama-type model file.
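As a rough illustration only (not the actual Dify implementation), the kind of logging that helps looks like the sketch below; the method body is heavily simplified, and the real code in `api/core/model_runtime/model_providers/ollama/llm/llm.py` does considerably more.

```python
# Simplified stand-in for a stream handler with debug logging added, so you
# can see the last raw chunk received before the stream stops.
import json
import logging

logger = logging.getLogger(__name__)

def handle_stream_with_logging(response):
    """Hypothetical, simplified version of _handle_generate_stream_response."""
    for line in response.iter_lines():
        if not line:
            continue
        logger.debug("raw chunk: %r", line[:500])  # log before decoding
        try:
            chunk = json.loads(line)
        except json.JSONDecodeError:
            logger.warning("non-JSON chunk encountered, aborting stream: %r", line[:200])
            return
        yield chunk.get("response", "")
        if chunk.get("done"):
            logger.debug("stream finished: done=True, eval_count=%s", chunk.get("eval_count"))
            return
    logger.warning("stream ended before done=True was received")
```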
Unfortunately, I'm unable to view the image you've attached. Could you please provide more details about the error message or the behavior you're experiencing? This will help me provide a more accurate solution.
Sources
- api/core/model_runtime/model_providers/ollama/llm/llm.py
- api/core/model_runtime/model_providers/ollama/ollama.yaml
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
How about setting the `max_tokens` parameter to 1000 or a bit larger and giving it a try?
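If you want to test that suggestion against Ollama directly, the corresponding knob on the raw Ollama API is the `num_predict` option (how Dify labels the same output-length limit in its parameter panel may differ). A minimal sketch, assuming a local Ollama instance and a placeholder model name:

```python
# Check whether a small output-length cap explains the "cut off" responses by
# raising num_predict on a direct, non-streaming Ollama call.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",   # adjust host/port for your setup
    json={
        "model": "llama2",                    # placeholder model name
        "prompt": "Summarize the plot of Hamlet.",
        "stream": False,
        "options": {"num_predict": 1000},     # allow up to ~1000 output tokens
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json().get("response", ""))
```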