
Ollama-type model: the conversation response is incomplete; it gets stuck and cannot return the complete content.

Open XJALYN opened this issue 1 year ago • 1 comment


Dify version

0.5.3

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

[screenshot attached]

Ollama-type model: the conversation response is incomplete; it gets stuck and cannot return the complete content.

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

XJALYN avatar Feb 05 '24 10:02 XJALYN

🤖

Hey there, @XJALYN! Great to see you back. Hope you're doing well!

I'm sorry for the inconvenience you're experiencing with the ollama-type model. The issue you're facing could be due to several reasons. Here are some potential causes and solutions:

  1. API Request Failure: If the status code of the response is not 200, an InvokeError is raised. This could be due to issues with the endpoint URL, the headers, or the payload of the request. Check the status code and the response text for more details.

  2. JSON Decoding Error: If the response from the server cannot be decoded into JSON, a json.JSONDecodeError is raised. This could be due to the server returning a non-JSON response. Check the server logs for more details.

  3. Stream Ended Prematurely: If the stream ends before the done flag is set to True, the generator yields a final LLMResultChunk with an empty message and a finish reason of "Non-JSON encountered." This could be due to network issues or to problems on the server side (see the sketch after this list for what the raw stream should look like).

  4. Model Parameters: Incorrect or inappropriate model parameters could also cause the model to get stuck. Check the values of the parameters like 'temperature', 'top_p', 'top_k', 'repeat_penalty', 'num_predict', 'mirostat', etc.

  5. Model Credentials: If the model credentials are not valid, a CredentialsValidateFailedError is raised. Check the credentials for the model.

  6. Model Mode: If the model mode is not correctly set to either LLMMode.CHAT or LLMMode.GENERATE, it could cause issues. Check the mode of the model.
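To see what items 2 and 3 look like in practice, one way to check is to call the Ollama server directly and inspect the raw stream that Dify parses. The sketch below is illustrative only (host, model, and prompt are placeholders): it prints each NDJSON line, flags any non-JSON line, and reports whether the stream closed before a chunk with done set to true arrived.

```python
import json
import requests

# Illustrative only: call a local Ollama server directly to inspect the raw
# NDJSON stream that Dify's Ollama runtime parses line by line.
# Host, model, and prompt are placeholders; adjust to your setup.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Hello", "stream": True},
    stream=True,
    timeout=120,
)
resp.raise_for_status()  # a non-200 status is what surfaces as an InvokeError (item 1)

done_seen = False
for line in resp.iter_lines():
    if not line:
        continue
    try:
        chunk = json.loads(line)  # each line should be a standalone JSON object
    except json.JSONDecodeError:
        print("Non-JSON line from server:", line[:200])  # item 2 / "Non-JSON encountered"
        continue
    print(chunk.get("response", ""), end="", flush=True)
    if chunk.get("done"):
        done_seen = True

if not done_seen:
    print("\nStream ended before done=true, so the reply was truncated (item 3)")
```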

To debug the issue, you can add logging statements in the _invoke, _generate, _handle_generate_response, and _handle_generate_stream_response methods to log the request and response details. This will help you identify where exactly the issue is occurring. You can find these methods in the ollama-type model file.
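As a minimal sketch of that kind of logging (the wrapper below and the place you hook it in are assumptions, not Dify's actual code; check the real method bodies in the Ollama runtime file), a pass-through generator can record every raw line before it is JSON-decoded, so the last chunk seen before a hang shows up in the container logs:

```python
import logging
from typing import Iterable, Iterator

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("ollama_debug")

def log_stream(lines: Iterable[bytes]) -> Iterator[bytes]:
    """Pass-through wrapper that logs every raw streamed line before decoding."""
    count = 0
    for line in lines:
        count += 1
        logger.debug("chunk %d: %r", count, line[:300])
        yield line
    logger.debug("stream closed after %d chunks", count)

# Hypothetical hook point inside _handle_generate_stream_response:
#     for line in log_stream(response.iter_lines()):
#         ...  # existing JSON decoding / chunk yielding logic
```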

Unfortunately, I'm unable to view the image you've attached. Could you please provide more details about the error message or the behavior you're experiencing? This will help me provide a more accurate solution.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] avatar Feb 05 '24 10:02 dosubot[bot]

How about setting the max_tokens parameter to 1000 or a bit larger and giving it a try?
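For reference, in the raw Ollama API the generation cap is the options.num_predict field, which is what a max_tokens-style setting controls. A minimal sketch of raising it directly against a local Ollama server (model name, prompt, and the value 1000 are examples only):

```python
import requests

# Example only: ask Ollama directly for a longer completion by raising
# options.num_predict, the knob a "max_tokens"-style setting maps onto.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "Summarize the plot of Hamlet.",
        "stream": False,
        "options": {"num_predict": 1000},  # allow up to ~1000 generated tokens
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```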

takatost avatar Feb 07 '24 08:02 takatost