specification icon indicating copy to clipboard operation
specification copied to clipboard

Find a way to name processed correlation keys

Open cdavernas opened this issue 3 years ago • 14 comments
trafficstars

What would you like to be added:

To allow for a way to name processed correlation keys.

What I propose is to add a new key or keyName or whatever property to the correlationDef object, so that multiple events can address different correlation keys:

...
events:
- name: ProductCreated
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
    correlationKey: product  # the processed key will be stored as 'product'
- name: ProductSold
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
    correlationKey: product # the processed key will be stored as 'product'
- name: OrderCreated
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
    correlationKey: order # the processed key will be stored as 'order'
...

Why is this needed:

Consider the following definition:

...
events:
- name: MyEvent1
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
- name: MyEvent2
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
 - name: MyOtherEventThatHasNothingToDoWithTheTwoPrevious
  kind: consumed
  source: ...
  type: ...
  correlation:
  - contextAttributeName: subject
...

Image we have a sequential flow such as the following:

  • Consume MyEvent1 with subject 'foo'
  • Consume MyEvent2 with subject 'foo'
  • Consume MyOtherEventThatHasNothingToDoWithTheTwoPrevious with subject 'bar'

What would happen on Synapse is:

  1. Check for a match for a correlation mapping with the defined correlations for MyEvent1 (i.e. subject)
  2. No correlation mapping exist, so the event is consumed, and the mapping is set to 'subject: foo'
  3. Check for a match for a correlation mapping with the defined correlations for MyEvent2 (i.e. subject)
  4. Match exists, and is set to 'foo', so the event is correlated and therefore consumed
  5. Check for a match for a correlation mapping with the defined correlations for MyOtherEventThatHasNothingToDoWithTheTwoPrevious(i.e. subject)
  6. Match exists, but the value is not set to 'foo', so the event is consumed, and the mapping is set to 'subject: bar' ===> BOOOOOOOOM

I could very easily find a dirty work around that problem, but like I said, it would be dirty. This is, IMO, up to the workflow designer to decide how a newly created correlation key should be named.

Note: not setting the contextAttributeValue (and that should be better explained in the spec) basically says that first event to come is gonna set the correlation key based on the defined context attribute (i.e. subject).

cdavernas avatar Sep 14 '22 16:09 cdavernas

We could also possibly introduce a new $CORRELATION jq named parameter, which would allow workflows to retrieve and play with those keys.

cdavernas avatar Sep 14 '22 18:09 cdavernas

@ricardozanini @tsurdilo May I proceed with a PR to address that? It's a minor though very important feature for correlation.

cdavernas avatar Oct 27 '22 15:10 cdavernas

Ideally, we could also take the advantage of that feature to revamp correlation and add support for (discouraged but sometimes necessary) payload based (vs context attributes) correlation:

Correlate incoming event of drfined type and source using both (or maybe exclusive too?) event context attributes and payload:

events:
  - name: correlatedEvent
    kind: consumed
    type: com.test/cloudevents/test
    correlate:
      - byContextAttribute:
          name: Subject
          value: lights-.*
          key: lightId
      - byPayloadProperty:
          property: 'zone' #could also be a runtime expression used to retrieve the property value
          key: zoneId

cdavernas avatar Oct 27 '22 15:10 cdavernas

Finally, to support all use cases, we could also add a condition property on eventDef, which would allow to define whether or not to consume/produce events based on their attributes and or payload.

Only consume defined event when the condition matches:

events:
  - name: onTemperatureChanged
    kind: consumed
    type: com.test/cloudevents/test
    condition: '${ .data.heater.isPoweredOn == false }'

We would therefore leverage the full power of what we already have and of cloudevents, allowing for extremely complex correlations, which is IMHO one of the core features of the spec. Hell, I came across the spec back in the days because/thanks to cloudevents!!!

cdavernas avatar Oct 27 '22 15:10 cdavernas

@cdavernas I was thinking a lot about correlation in the last months and Im not sure it is a good idea to try to simulate a correlation engine within the spec. Wont it be better to actually get rid of correlation and simplify the spec? I know the answer is no, but I need to try ;) The reason Im proposing this apparently radical idea is because I feel we are trying to do the job of a correlation engine, which should be typically done by another process (Im thinking on Napoleon and his divide&conquer strategy) and because I think we should focus on clarifying the way a flow is started through events (I know Tiho is going to write something on that regard). Anyway, if we are getting deep into correlation (as Im afraid we are going to do ;)), then my first impression is that rather than use byContextAttribute or byPayloadAttribute properties we should just use JQ expression to select the attributes that should be used for the correlation. If we are interested in a payload atttribute we should just write .data.<payload attribute> (basically what you proposed in your last post) Also, I think we need to rewrite some bits of the spec to clarify many tricky scenarios related with correlations, the one you describe is just one of them, but there are certainly more (and some of them are related with start event stated vs non start event states and how to identify an existing flow that needs to be notified) Finally, I think this topic deserve a face to face preliminary meeting to discuss all the angles and possibilities, wdyt?

fjtirado avatar Oct 27 '22 19:10 fjtirado

Complex event processing scenarios are a widespread use case and a notable research topic called Complex Event Processing. So I'd say we must be aware of that and stay away if possible. Otherwise, we end up writing a specification inside a specification.

I'd say that we need some correlation features, but they should be limited.

ricardozanini avatar Nov 14 '22 19:11 ricardozanini

@ricardozanini What I'm proposing is, I believe, reasonable and easily implemtable. I agree with @fjtirado, however, that the feature should be optional. It is imho a requirement for (complex, that is more than two even states) event based workflows.

cdavernas avatar Nov 14 '22 19:11 cdavernas

Ill work on a PR proposal that shapes my ideas as optional add ins to the spec if that ok with you guys

cdavernas avatar Nov 14 '22 19:11 cdavernas

@cdavernas, absolutely! Nothing against this proposal; we just need to be aware that if we keep adding use cases and features on top of correlations we will build another beast. :D

ricardozanini avatar Nov 14 '22 19:11 ricardozanini

@ricardozanini yeah you are right. I'll keep it as concise as possible!

cdavernas avatar Nov 14 '22 19:11 cdavernas

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Dec 30 '22 00:12 github-actions[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Feb 17 '23 00:02 github-actions[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Apr 04 '23 00:04 github-actions[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jul 10 '23 00:07 github-actions[bot]

This has been addressed in 1.0.0-alpha1, and is closed as part of https://github.com/serverlessworkflow/specification/issues/843

cdavernas avatar May 17 '24 08:05 cdavernas