
Debug - Step Over a Task

Open jkaipa opened this issue 5 years ago • 8 comments

@apanicker-nflx @kishorebanala I have a workflow made up entirely of event tasks, several of which sit inside a fork. An event is published as soon as an event task is scheduled, so when execution reaches the fork, the events for all of its tasks are published at the same time. Finer control would help debug each task before the scheduling happens:

-> Execute workflow with a debug flag
-> Just before the next task is to be scheduled, if the debug flag is set, then wait for a trigger
-> To implement the trigger, construct a new rest endpoint.

Do we already have a way to figure this out? If not, what are your thoughts?

jkaipa avatar Mar 28 '19 21:03 jkaipa

@jkaipa Interesting. If we understand correctly, you're looking for a debugging system with breakpoints.

How about dry runs? We don't see the value breakpoints would provide in workflow execution, i.e. the state in between and at the end of the workflow wouldn't differ much, because of the way inputs and outputs are wired. But if we can have a dry-run system with expected and actual values provided upfront, that could make a much better case for evaluating and regression testing the workflows.

Thoughts?

kishorebanala avatar Apr 02 '19 00:04 kishorebanala

@Ismaley Is this related to the debug workflow implementation you mentioned? If so, and if you think your implementation can be contributed to OSS, do you mind adding the implementation details to this thread?

kishorebanala avatar Jun 17 '19 23:06 kishorebanala

@kishorebanala Yes, this illustrates well a use case for the debug feature we've built. We achieved that with the help of external applications we developed here.

But relating the feature to what @jkaipa mentioned, and keeping it within Conductor's domain: we implemented the "debug flag" as a copy of the current workflowDef, which is then used for the debuggable execution. By adding wait tasks to this copied workflow, we achieved behavior similar to "breakpoints", since a wait task also prevents the next task from being scheduled. Once the signal is sent to the wait task, the execution continues normally.
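To make the idea concrete, here is a minimal sketch of the "copy the workflowDef and add wait tasks" step. It is not the implementation described in this thread. It assumes the usual shape of a Conductor workflow definition (`tasks`, `taskReferenceName`, `type`, `forkTasks`); the `_debug` name suffix, the `breakpoint_before_` prefix, and the choice to skip `JOIN` tasks are assumptions made for illustration.

```python
import copy

# Assumption: no breakpoint is placed before a JOIN, so the JOIN stays
# directly after its FORK_JOIN in the task list.
SKIP_TYPES = {"JOIN"}


def add_breakpoints(tasks):
    """Return a new task list with a WAIT task placed before each task."""
    result = []
    for original in tasks:
        task = copy.deepcopy(original)
        if task.get("type") not in SKIP_TYPES:
            ref = "breakpoint_before_" + task["taskReferenceName"]  # hypothetical naming
            result.append({"name": ref, "taskReferenceName": ref, "type": "WAIT"})
        # Recurse into fork branches so every forked task gets its own breakpoint.
        if task.get("type") == "FORK_JOIN":
            task["forkTasks"] = [add_breakpoints(b) for b in task.get("forkTasks", [])]
        result.append(task)
    return result


def make_debug_copy(workflow_def):
    """Clone a workflow definition and insert breakpoints; the input is not modified."""
    debug_def = copy.deepcopy(workflow_def)
    debug_def["name"] = workflow_def["name"] + "_debug"  # hypothetical suffix
    debug_def["tasks"] = add_breakpoints(workflow_def["tasks"])
    return debug_def
```

With a fork, each branch receives its own wait task, so the forked event tasks can be released one at a time instead of all at once.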

@emiteze correct me if I missed anything in the explanation.

My personal desire is that the implementations we've made could be contributed to oss, however I don't own the repositories nor the copyrights, so we still gotta discuss our collaboration's further steps and see what can be achieved. :)

Ismaley avatar Jun 18 '19 20:06 Ismaley

Any update on this feature @Ismaley @kishorebanala ?

jkaipa avatar Sep 04 '19 19:09 jkaipa

@jkaipa I believe we'll end up making our features open source as well; right now we're testing them only within the company. But we could discuss how to implement this in Conductor without the need for an external application. I'd be glad to help with that :)

Ismaley avatar Sep 16 '19 17:09 Ismaley

Appreciate it @Ismaley. Are you cloning the workflow and adding a wait task for each of the workflow's tasks? Or have you considered the approach below?

-> Execute workflow with a debug flag
-> Just before the next task is to be scheduled, if the debug flag is set, then wait for a trigger
-> To implement the trigger, construct a new rest endpoint. 
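
The three steps above can be modelled with a toy scheduler. This is only a sketch of the proposed behavior, not Conductor's actual scheduling code; the `DebugRun` class and its method names are hypothetical, and `trigger()` stands in for whatever the new REST endpoint would call.

```python
from collections import deque


class DebugRun:
    """Toy model: in debug mode, each task needs one trigger before it is scheduled."""

    def __init__(self, task_names, debug=False):
        self.pending = deque(task_names)
        self.debug = debug
        self.triggers = 0    # step-over signals received but not yet consumed
        self.scheduled = []  # tasks that have been handed to workers

    def trigger(self):
        """Stand-in for the proposed REST endpoint: release exactly one task."""
        self.triggers += 1
        self.advance()

    def advance(self):
        """Schedule tasks until none are left or a breakpoint pauses the run."""
        while self.pending:
            if self.debug:
                if self.triggers == 0:
                    return  # paused just before scheduling the next task
                self.triggers -= 1
            self.scheduled.append(self.pending.popleft())
```

Without the debug flag, `advance()` schedules everything in one pass; with it, each call to `trigger()` lets exactly one more task through.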

jkaipa avatar Sep 16 '19 18:09 jkaipa

In the external application, yes: the workflow def is cloned with a snapshot flag, because once we push the workflow def to Conductor we make it read-only in our application. Then we add wait tasks as "breakpoints" before the next task to be executed. The trigger to proceed past the wait task is done using Conductor's /tasks endpoint, updating the wait task's status to COMPLETED.

Making the workflow def read-only in our application was the main reason we built the debug feature: we noticed that people usually create several versions of a workflow before it is stable and working as they expect.

In Conductor we could implement a similar approach, but maybe implement the trigger on a separate endpoint. We would also need to consider the persistence of the snapshot execution's data. Maybe it could be optional to retain the execution data after it's completed, or to keep the data only in Elasticsearch.
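
As a rough sketch, releasing such a breakpoint could look like the request built below. It assumes the body follows the shape of Conductor's task result (`workflowInstanceId`, `taskId`, `status`) and that the server's API root is passed in as `base_url`; the function name, the URL, and the IDs in the usage note are placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_resume_request(base_url, workflow_id, task_id):
    """Build a POST to <base_url>/tasks that marks a wait task as COMPLETED."""
    payload = {
        "workflowInstanceId": workflow_id,  # assumed field name
        "taskId": task_id,                  # assumed field name
        "status": "COMPLETED",
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

For example, `build_resume_request("http://localhost:8080/api", "wf-1", "task-1")` yields a request that could be passed to `urllib.request.urlopen` against a running server.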

Ismaley avatar Sep 18 '19 18:09 Ismaley

Did anyone make any progress on this feature? Thanks

jkaipa avatar Apr 01 '23 02:04 jkaipa