specification icon indicating copy to clipboard operation
specification copied to clipboard

Backward Compatibility: Parallel Branch does not support direct use of states definitions

Open yzhao244 opened this issue 4 years ago • 13 comments
trafficstars

What is the question:

Hi, I have this issue opened and would like to discuss the spec regarding parallel branch does not support direct use of states definitions when comparing spec 0.1.

If my understanding is correct, currently, the parallel state branch supports actions definition which can use subFlowRef to support a more complex flow such as switch and so on.

However, using subFlowRef in practice requires that end-user has to successfully define a subflow first for getting a workflowId. After obtaining the workflow ID, end-user can start to use the ID to define the parallel state. I think it brings more complexity if end-user only builds a simple parallel workflow which branch only contains a switch, or a operation state and end-user may not need a subFlowRef ID for reusable purpose.

I am sorry to bring back the example of AWS Step function as my reference lol, which allows directly defining states in branch. I feel our specs and theirs have things in common lol..

What would you like to be added: Is it okay to add back the states array to branch of parallel state definition.

Why is this needed:

  1. It ensures backward compatibility.
  2. It is straightforward to build workflow which allows end-user to directly define whatever the states in branch

yzhao244 avatar Jun 21 '21 16:06 yzhao244

I can definitely understand your point of view on this. The changes from v0.1 for this were:

  1. We first removed the ability to use states in branches and allowed for either actions or subflow state only. The reason for this was that you can build very complex nested structures like that, for example you can nest parallel states inside parallel states and define layers of error handling...using SubFlow states at that point seems as a reasonable alternative as runtime can look up the control flow logic via id and also you can make it reusable across multiple branches
  2. We then removed SubFlow State and made it into an action...that to me is something we should look at again honestly. I think that making it into an action did bring some benefits but it introduced some others (like forced looping via control flow logic and not the user friendly "repeat" we had on subflow states).

Can you show me an example of ASL where you find the use of states inside branches useful ?

tsurdilo avatar Jun 22 '21 00:06 tsurdilo

The following example is from Amazon step function where end-user can directly define states in branch. It is a very simple parallel state use case which has no nested structures and end-user does not require to get a subflow id when building a simple workflow... However, building the same workflow in current spec, it requires end-user to build subflow first and get an ID. After obtaining a subflow Id, end-user can start to define parallel state.

image The parallel state should look like this

"MyParallelState": {
  "Type": "Parallel",
  "InputPath": "$",
  "OutputPath": "$",
  "ResultPath": "$.ParallelResultPath",
  "Next": "SetCartCompleteStatusState",
  "Branches": [
    {
      "StartAt": "UpdateMonthlyUsageState",
      "States": {
        "UpdateMonthlyUsageState": {
          "Type": "Task",
          "InputPath": "$",
          "OutputPath": "$",
          "ResultPath": "$.UpdateMonthlyUsageResultPath",
          "Resource": "LambdaARN",
          "Retry": [ {
            "ErrorEquals": ["HandledError"],
            "IntervalSeconds": 1,
            "MaxAttempts": 2,
            "BackoffRate": 2.0
         } 
          "End": true
        }
      }
    },
    {
      "StartAt": "QueueTaxInvoiceState",
      "States": {
        "QueueTaxInvoiceState": {
          "Type": "Task",
          "InputPath": "$",
          "OutputPath": "$",
          "ResultPath": "$.QueueTaxInvoiceResultPath",
          "Resource": "LambdaARN",
          "Retry": [ {
            "ErrorEquals": ["HandledError"],
            "IntervalSeconds": 1,
            "MaxAttempts": 2,
            "BackoffRate": 2.0
         } 
          "End": true
        }
      }
    }

yzhao244 avatar Jun 28 '21 16:06 yzhao244

@yzhao244 thanks for the example! May I offer a counter-point :) If you look UpdateMonthlyUsageState and QueueTaxInvoiceState they are just actions, meaning they just invoke a service. For this example using our DSL actions makes more sense. This is the exact same as if you had in each branch an operation state with a single action :) so using actions in this case (and not dummy operation states) not only seems more intuitive, but also reduces the number of lines of json/yaml you have to write.

I was looking for an example where there is actual control-flow logic inside each branches. Off of your example I would actually argue that what we are doing atm is indeed better :)

tsurdilo avatar Jun 28 '21 16:06 tsurdilo

@tsurdilo though I mostly agree with your last comment, I think @yzhao244 has a point, too. What if I want to execute two switch cases in parallel? Let's imagine you have a package, and you want to publish it in parallel in both staging and prod. The publishing processes would switch on the package type to determine on which app you actually want to publish it. AFAIK, this cannot be done now without subflows or complex jq expressions.

Now, I've had a similar concern with my a client, and we first ended using very complex jq expressions to do so, until we realized it was extremely hard to read and understand with no advanced knowledge of jq and ended up using a sublow, for a single condition check, basically.

In my opinion, we should think of a way to do so. Probably not using sub-states, as it looks confusing and goes against the actual state concept, but maybe by adding, for example, a condition property on actions.

@yzhao244 @tsurdilo WDYT?

cdavernas avatar Jul 03 '21 11:07 cdavernas

I was not dissageeing with this issue just wanted to find use case and you provided :)

i wonder if for it it would make more sense to add branch data filter instead, where you can inject / overwrite state data field for that branch (one sets prod the other dev or whatever) .

my thinking is that if we allow explicit control flow logic in branches it will end up creating more issues than bring value eapecially for large processes (imagine 50+ states in branches and that maintenance)

On Sat, Jul 3, 2021 at 7:05 AM cdavernas @.***> wrote:

@tsurdilo https://github.com/tsurdilo though I agree with your last comment, I think @yzhao244 https://github.com/yzhao244 has a point. What if I want to execute two switch cases in parallel? Let's imagine you have a package, and you want to publish it in both staging and prod. The publishing process switches on the package type to determine on which app you actually want to publish it.

Now, I've had a similar concern with my a client, and we first ended using very complex jq expressions to do so, until we realized it was extremely hard to read and understand with no advanced knowledge of jq and ended up using a sublow, for a single condition check, basically.

In my opinion, we should think of a way to do so. Probably not using sub-states, as it looks confusing and goes against the actual state concept, but maybe by adding, for example, a condition property on actions.

@yzhao244 https://github.com/yzhao244 @tsurdilo https://github.com/tsurdilo WDYT?

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/serverlessworkflow/specification/issues/413#issuecomment-873390593, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAA5E7WG2QHZGWEOM5Y5ZUTTV3VIZANCNFSM47B6WW7Q .

tsurdilo avatar Jul 03 '21 12:07 tsurdilo

@tsurdilo @cdavernas Thanks for the replies, guys. :) . Actually, the above AWS DSL example allows UI Console to easily render a graphical model in real-time since it allows end-user directly defines control logic which I believe Google Workflow, Ali Serverless Workflow both allow explicit control flow as well for the purpose of easily rendering graphical model.

In our spec, since using subFlowRef in practice requires a workflowId, it requires that end-user has to create subFlow first and then gets a workflowId. Finally, end-user has to put that workflowId in the branches. My personal feeling is a bit complex process and makes UI render graphical model a bit more complicated comparing to allowing explicit control flow.

@tsurdilo I looked at your counter-point... it is a very good point of using actions lol.. However, I updated my example with adding retries which our actions do not support retry. :) .. Actually, I totally agree that for better maintenance for large flows, definitely, using subworkflow ID is way better than explicit control flow logic. However, allowing explicit control flow logic in branches shouldn't be a problem because it sounds like a common practice that end-user would want to do. :)

Is it possible to support both explicit control flow in branches and subworkflowId which end-user can choose for better maintenance for large flows.

yzhao244 avatar Jul 19 '21 18:07 yzhao244

@tsurdilo @cdavernas For better supporting explicit control flow, I am thinking to set a limit on number of nested layers if end-user chose to define explicit control flow in branches so that our spec can ensure backward compatibility of supporting explicit flow. The following is a suggested example spec of Branch definition for supporting both actions and states definitions. :)

Parallel State: Branch

Parameter Description Type Required
name Branch name string yes
actions Actions to be executed in this branch array no
maxLayers Limits to max number of nested layers integer no
states States to be executed in this branch array no
timeouts Branch specific timeout settings object no

yzhao244 avatar Aug 04 '21 17:08 yzhao244

@yzhao244

I updated my example with adding retries which our actions do not support retry. :)

they do now (will be hopefully added to 0.7 release :) ) - https://github.com/serverlessworkflow/specification/pull/435

For better supporting explicit control flow, I am thinking to set a limit on number of nested layers

Overall I do think that supporting both is in the end what we want to do. The way I'd like to do it however is a little different for 3 reasons:

  1. states inside a branch have to conform to the "workflow control flow logic" meaning they have to have a starting state and define one or more end definitions.
  2. states inside the branches should not be able to be transitioned to from a) other branches in parallel state b) other states in the main workflow control flow logic.
  3. states inside a branch should not be able to transition to 1) other branch states b) other core workflow states

I would like to create a "grouping" of states, with preferably a lable/id attached to them. From the parallel state branch then you can then simply reference this group of states instead of having them hard-coded. States in the core workflow definition would all be in the "main" group. And for tooling then it would also make it easier as then validation can check if a state in the "main" group tries to transition to a state which is not in that same group...and raise validation error(s) and vice versa.

WDYT?

tsurdilo avatar Aug 04 '21 17:08 tsurdilo

@qjl1988

yzhao244 avatar Aug 18 '21 21:08 yzhao244

@tsurdilo I read the ActionRetries in 0.7 release. Good Work :) ..
"grouping" of state sounds an interesting idea. However, since parallel branch state still needs to reference the group, then how is it different from referencing to a subworkflow Id in current spec. Maybe it is just me lol having difficulty of visualizing grouping of states or maybe I misunderstood something here. lol

yzhao244 avatar Aug 18 '21 21:08 yzhao244

@yzhao244 yeah, we can/should definitely talk about it to figure it out. Let's plan on that.

tsurdilo avatar Aug 18 '21 23:08 tsurdilo

@tsurdilo created this one on discussion board :) . https://github.com/serverlessworkflow/specification/discussions/466

yzhao244 avatar Sep 02 '21 19:09 yzhao244

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Oct 18 '21 00:10 github-actions[bot]