
Parallel branches are wrongly displayed in the Tracing UI and one step keeps running forever

Open fniu opened this issue 1 year ago • 7 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.9.2

Cloud or Self Hosted

Cloud

Steps to reproduce

A set of parallel branches (1-B, 1-C and 1-D) converges at a Template node. Everything works fine except for the UI: for example, the Template node is missing for Branch-1-B, and the Template node step keeps running forever.

Screenshot 2024-10-21 114536

✔️ Expected Behavior

Every branch should end with the node "TEMPLATE", and the node should stop showing "Running" once the workflow has ended.

❌ Actual Behavior

Explained above and with a screenshot.

fniu avatar Oct 21 '24 09:10 fniu

I found a similar unsolved discussion regarding issues with parallel execution in the workflow. In version 0.8.0, there was a problem where the issue classifier was treated as a parallel execution branch, which might be related to the problem you're experiencing [1].

Regarding the Tracing UI, the structure and logic for displaying parallel branches are managed by the TracingPanel component. It uses a function buildLogTree to construct a tree of nodes from a list of NodeTracing objects. This tree structure is used to render the UI, and parallel nodes are displayed with a collapsible UI element [2].
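
The idea of folding a flat trace list into a tree can be illustrated with a minimal sketch. This is not the actual buildLogTree code (which lives in the TypeScript frontend); the names TraceNode, ParallelGroup, parallel_id, branch_id and build_log_tree_sketch are all hypothetical and only serve to show the grouping step.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TraceNode:
    # Hypothetical fields; the real NodeTracing type is larger and shaped differently.
    node_id: str
    title: str
    parallel_id: Optional[str] = None  # set when the node ran inside a parallel group
    branch_id: Optional[str] = None    # which branch of that group the node belongs to


@dataclass
class ParallelGroup:
    # One collapsible element in the UI: branch id -> nodes executed in that branch.
    parallel_id: str
    branches: dict = field(default_factory=dict)


def build_log_tree_sketch(traces):
    """Fold an ordered, flat trace list into top-level items.

    Nodes without a parallel_id stay at the top level; nodes with one are
    collected into a single ParallelGroup, bucketed per branch.
    """
    tree = []
    groups = {}
    for trace in traces:
        if trace.parallel_id is None:
            tree.append(trace)
            continue
        group = groups.get(trace.parallel_id)
        if group is None:
            # First node of this parallel group: place the group where it starts.
            group = ParallelGroup(trace.parallel_id)
            groups[trace.parallel_id] = group
            tree.append(group)
        group.branches.setdefault(trace.branch_id or "", []).append(trace)
    return tree


# Example input: two branches between a start node and a shared TEMPLATE node.
example = build_log_tree_sketch([
    TraceNode("start", "Start"),
    TraceNode("llm_b", "LLM B", parallel_id="p1", branch_id="1-B"),
    TraceNode("llm_c", "LLM C", parallel_id="p1", branch_id="1-C"),
    TraceNode("tpl", "TEMPLATE"),
])
```

In a scheme like this, a node shared by several branches (such as the Template node in the report) has to be attributed either to one branch or to the top level, which is one plausible place for a display glitch to arise.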

For the issue of a node remaining in the 'Running' state indefinitely, I couldn't find specific conditions or errors that cause this. However, it might be related to how the platform handles the execution and termination of parallel branches, using events and a queue system to manage execution and status updates [3].

If the problem persists, it might be worth checking if there are any updates or patches available for version 0.9.2 that address these issues.

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Oct 21 '24 09:10 dosubot[bot]

Can you provide a minimal DSL for us to reproduce this issue?

crazywoola avatar Oct 23 '24 01:10 crazywoola

After upgrading to 0.10.0, the same workflows run into another error:

Variable ans0 not found in variable pool

I suspect it has been fixed by #9646. Could you confirm? (Or I can re-check this issue when 0.10.1 is available.)

fniu avatar Oct 23 '24 10:10 fniu

With 0.10.1, the error "Variable ans0 not found in variable pool" still exists. Could it be a synchronization issue at node "TEMPLATE", where ans0 and ans2 are not yet available (still running at the previous LLM steps) when ans1 has arrived? If this is the case, this issue could be closed because the UI seems OK.
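
To make the suspected race concrete, here is a small sketch of the hypothesis, not of Dify's actual engine. The functions run_branch, run_branches_and_join and render_template are hypothetical; only the variable names ans0, ans1, ans2 and the error text come from this thread. The template step fails if it runs while some branch outputs are still missing, and succeeds once every branch has been joined.

```python
from concurrent.futures import ThreadPoolExecutor


def run_branch(name):
    # Stand-in for one branch (e.g. knowledge retrieval followed by an LLM step).
    return f"answer from {name}"


def run_branches_and_join(branches):
    """Run all branches concurrently and return only after every one has finished."""
    variable_pool = {}
    with ThreadPoolExecutor() as executor:
        futures = {var: executor.submit(run_branch, name) for var, name in branches.items()}
        for var, future in futures.items():
            # result() blocks until that branch is done: this loop is the join point.
            variable_pool[var] = future.result()
    return variable_pool


def render_template(required, variable_pool):
    """Render the template, failing if any required variable is absent."""
    missing = [name for name in required if name not in variable_pool]
    if missing:
        raise KeyError(f"Variable {missing[0]} not found in variable pool")
    return "\n".join(f"{name}: {variable_pool[name]}" for name in required)


required = ["ans0", "ans1", "ans2"]

# Premature run: only ans1 has arrived, so rendering fails on ans0.
try:
    render_template(required, {"ans1": "answer from 1-C"})
except KeyError as exc:
    premature_error = str(exc)

# Joined run: all three branches completed before the template step.
pool = run_branches_and_join({"ans0": "1-B", "ans1": "1-C", "ans2": "1-D"})
rendered = render_template(required, pool)
```

If the hypothesis holds, the fix belongs at the join point (waiting for all incoming branches) rather than in the template itself.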

Screenshot 2024-10-23 215647

Screenshot 2024-10-23 220145

fniu avatar Oct 23 '24 20:10 fniu

image

I can't reproduce this problem. Can you try again to see if it can work properly?

laipz8200 avatar Oct 24 '24 11:10 laipz8200

@laipz8200

I can't reproduce this problem. Can you try again to see if it can work properly?

The error shows up if you add a knowledge retrieval node to each branch before the LLM step. At least this is what happened to me. FYI, I see someone else has reported the same issue in #9844.

fniu avatar Oct 26 '24 16:10 fniu

image

Currently, the situation shown in the image is not supported. Deleting this branch can resolve the issue.

laipz8200 avatar Oct 26 '24 17:10 laipz8200