
Inconsistent Timeout Handling During Batch Inference with CSV in Dify

Open · guyfar opened this issue 1 year ago · 1 comment

Self Checks

Dify version

0.4.8

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Navigate to the Dify batch inference interface.
  2. Upload a CSV file to run batch inference tasks.
  3. Initiate the batch inference process.
  4. Observe the error message indicating a potential timeout issue.
  5. Check the logs and annotation page for inference results despite the error.

✔️ Expected Behavior

I expected the batch inference tasks to complete without timeout errors. The inference process should finish within the configured timeout limits and, when it succeeds, should not raise any error messages.

❌ Actual Behavior

An error message suggesting a timeout issue is raised during the batch inference process. However, despite this error, the inference tasks seem to complete normally, and the results are visible in the logs and annotation pages. This inconsistency suggests that the timeout error may be incorrectly reported or that the timeout settings are not aligned with the actual processing time required for inference tasks.

(Screenshots attached: bug_tip1, bug_tip2)

guyfar · Jan 18, 2024

🤖

Hello @guyfar! I'm Dosu, a friendly bot here to help you out with dify. I can assist with troubleshooting, answer questions, and even guide you on how to contribute to the project. Consider me your sidekick while you wait for a human maintainer. I see you have an issue you'd like to discuss, let's dive into it!

Thank you for your detailed report. It seems like you're experiencing an inconsistency between the timeout error message and the actual completion of the batch inference tasks.

Based on the information you provided and the context I have, the timeout setting for tasks in the Dify codebase is defined in the Timeout object within the _to_credential_kwargs method of the _CommonAzureOpenAI class. The total timeout is set to 315.0 seconds, with a read timeout of 300.0 seconds, a write timeout of 10.0 seconds, and a connect timeout of 5.0 seconds. You can find this in the relevant code.
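For reference, here is a minimal sketch of a timeout configuration using those quoted values with `httpx.Timeout` (the timeout type the OpenAI SDK accepts). The credential key names and the overall shape of the method are assumptions for illustration only, not Dify's actual `_to_credential_kwargs` implementation:

```python
# Minimal sketch only: illustrates the quoted timeout values
# (315s total, 300s read, 10s write, 5s connect).
# Credential key names and structure are assumptions, not Dify's code.
from httpx import Timeout


def _to_credential_kwargs(credentials: dict) -> dict:
    return {
        "api_key": credentials["openai_api_key"],          # hypothetical key name
        "azure_endpoint": credentials["openai_api_base"],  # hypothetical key name
        "timeout": Timeout(315.0, read=300.0, write=10.0, connect=5.0),
    }
```

If a single model response takes longer than the read timeout, the request fails with a timeout even though the rest of the batch may continue, which would match the behavior you describe.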

The process flow for batch inference tasks in the Dify codebase, specifically in the ReplicateEmbeddingModel class, involves several steps including invoking the model, generating embeddings, calculating the number of tokens, and calculating usage. You can find this in the relevant code.
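To make that flow concrete, here is a simplified, self-contained outline of the invoke → embed → count tokens → compute usage sequence. The function name and the callable parameters are placeholders for illustration, not the actual `ReplicateEmbeddingModel` methods:

```python
# Simplified outline of the described flow; not Dify's actual code.
# `embed_fn` and `count_tokens_fn` are placeholder callables.
from typing import Callable, List, Tuple


def batch_embed(
    texts: List[str],
    embed_fn: Callable[[str], List[float]],
    count_tokens_fn: Callable[[str], int],
) -> Tuple[List[List[float]], dict]:
    embeddings = [embed_fn(t) for t in texts]              # 1. invoke the model per text
    total_tokens = sum(count_tokens_fn(t) for t in texts)  # 2. count tokens
    usage = {"total_tokens": total_tokens}                  # 3. compute usage (simplified)
    return embeddings, usage
```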

The Dify codebase handles errors during the invocation of the large language model through a mapping of errors. This mapping is defined in the _invoke_error_mapping property of the TongyiLargeLanguageModel class. For timeout issues during batch inference tasks, the relevant error types would be InvokeConnectionError and InvokeServerUnavailableError. You can find this in the relevant code.
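As a rough illustration of that pattern, the sketch below shows how such an error mapping typically translates raw provider exceptions into unified invoke errors. The exception classes and the helper function are stand-ins, not the actual `TongyiLargeLanguageModel` code:

```python
# Self-contained sketch of the error-mapping pattern described above.
# The provider-side exception classes are stand-ins, not a real SDK's types.
class InvokeError(Exception): ...
class InvokeConnectionError(InvokeError): ...
class InvokeServerUnavailableError(InvokeError): ...


class ProviderTimeoutError(Exception): ...     # stand-in for an SDK timeout error
class ProviderConnectionError(Exception): ...  # stand-in for an SDK connection error


# Timeouts and dropped connections surface as InvokeConnectionError;
# server-side failures surface as InvokeServerUnavailableError.
INVOKE_ERROR_MAPPING = {
    InvokeConnectionError: [ProviderTimeoutError, ProviderConnectionError],
    InvokeServerUnavailableError: [ConnectionResetError],
}


def map_invoke_error(error: Exception, mapping: dict) -> Exception:
    """Translate a raw provider exception into a unified InvokeError type."""
    for invoke_error_cls, raw_error_classes in mapping.items():
        if isinstance(error, tuple(raw_error_classes)):
            return invoke_error_cls(str(error))
    return InvokeError(str(error))
```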

Given this information, it's possible that the timeout error you're seeing is a false positive, triggered by a delay in the network or server response, but not actually affecting the completion of the batch inference tasks. However, without more specific information about the error message and the conditions under which it occurs, it's difficult to say for certain.

If you could provide more details about the error message, such as the exact wording and any associated error codes, as well as any patterns you've noticed in when it occurs (e.g., only for certain CSV files, only when the server is under heavy load, etc.), that would be very helpful in diagnosing the issue. Additionally, if you could share the logs from when the error occurs, that could provide valuable clues.

I hope this information is helpful, and I look forward to your response so we can get to the bottom of this issue.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] · Jan 18, 2024