Process hash calculation

Open antho1404 opened this issue 4 years ago • 2 comments

The hash of a process depends on the order of its nodes and edges. That order is not part of what defines the process: any ordering of the nodes or edges describes exactly the same graph.

The following 2 graphs have the same behavior but different hashes:

nodes:
  - eventA
  - taskA
edges:
  from: eventA, to: taskA

nodes:
  - taskA
  - eventA
edges:
  from: eventA, to: taskA

One way to solve this is to sort the nodes and edges, so that we rely not on the order given by the user but on a deterministic order that we define (based on the node key, for example).
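
To make the idea concrete, here is a minimal sketch in Python of hashing after sorting. It is not the engine's actual serialization: representing nodes as plain key strings, encoding with JSON, hashing with SHA-256, and the name `canonical_hash` are all assumptions made for illustration.

```python
import hashlib
import json


def canonical_hash(process: dict) -> str:
    """Hash a process after putting its nodes and edges in a fixed order.

    Hypothetical helper: the JSON encoding and SHA-256 are stand-ins,
    not the engine's real hash serializer.
    """
    canonical = {
        # Sort nodes by key so the order given by the user no longer matters.
        "nodes": sorted(process["nodes"]),
        # Sort edges by (from, to) for the same reason.
        "edges": sorted(process["edges"], key=lambda e: (e["from"], e["to"])),
    }
    encoded = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


# The two graphs from the example above: same edges, nodes listed in a different order.
graph_a = {"nodes": ["eventA", "taskA"], "edges": [{"from": "eventA", "to": "taskA"}]}
graph_b = {"nodes": ["taskA", "eventA"], "edges": [{"from": "eventA", "to": "taskA"}]}

same_hash = canonical_hash(graph_a) == canonical_hash(graph_b)
```

With this normalization `same_hash` is true, since both graphs reduce to the same canonical form before hashing.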

Advantages of considering these 2 processes as the same:

  • Doesn't rely on the user's input, so the result will always be the same whatever compiler is used
  • Gives us control over the order of the edges/nodes, so we can rearrange the graph to make it more efficient

Disadvantages:

  • No optimization possible from the user, reordering some nodes might result in a tiny micro-optimization...

Originally posted by @antho1404 in https://github.com/mesg-foundation/engine/timeline

antho1404 avatar Mar 06 '20 02:03 antho1404

So I like the idea of keeping the same hash for the same logical data.

Option to consider: instead of storing an array, we can change the type from array to map. In that case, we get sorting for free, plus it gives us the possibility to extend the node in the future.

nodes:
  eventA:
  taskA:
edges:
  from: eventA, to: taskA
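
A rough sketch of what this could look like, assuming the nodes become a map from node key to an (empty for now) node body, and that the serializer emits map entries in key order. The JSON encoding, SHA-256, and the name `hash_map_process` are illustrative assumptions, not the engine's code.

```python
import hashlib
import json


def hash_map_process(process: dict) -> str:
    """Hash a process whose nodes are stored as a map keyed by node key.

    Hypothetical helper: sort_keys=True makes the encoder write map
    entries in key order, whatever order they were inserted in.
    """
    encoded = json.dumps(process, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


edges = [{"from": "eventA", "to": "taskA"}]

# Same nodes, declared in a different order. The empty dict is a placeholder
# for future per-node fields (an assumption, not an existing schema).
process_a = {"nodes": {"eventA": {}, "taskA": {}}, "edges": edges}
process_b = {"nodes": {"taskA": {}, "eventA": {}}, "edges": edges}

same_hash = hash_map_process(process_a) == hash_map_process(process_b)
```

Note that the edges are still a list in this sketch, so their order would still affect the hash.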

No optimization possible from the user, reordering some nodes might result in a tiny micro-optimization...

Hum I don't think it is worth it... :)

krhubert avatar Mar 06 '20 05:03 krhubert

I'm not sure this proposal really makes sense from a data structure point of view. The process uses an array, so structurally the order is significant. If it's not, then a map should be used. Maybe the order currently doesn't matter (actually, the first node must be a trigger, mustn't it?), but the data structure says it does. I would not impose a custom order on this array just because it could make sense for this specific case. What if we change the key system? What if the order becomes important later? I think the hash serialization should not be built with exceptions. The compiler should order the nodes by key if it wants to, not the hash serializer.
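
As a sketch of that separation (in Python, with hypothetical names `compile_process` and `hash_process`, and JSON plus SHA-256 standing in for the real hash serializer), the hashing step keeps arrays exactly as given, and any ordering is an explicit choice made earlier by the compiler:

```python
import hashlib
import json


def compile_process(process: dict) -> dict:
    """Hypothetical compiler step: it may choose to order nodes and edges by key."""
    return {
        "nodes": sorted(process["nodes"]),
        "edges": sorted(process["edges"], key=lambda e: (e["from"], e["to"])),
    }


def hash_process(process: dict) -> str:
    """Order-preserving hash: arrays are serialized in the order they are given.

    sort_keys=True only fixes the order of field names, not of list items.
    """
    encoded = json.dumps(process, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


edges = [{"from": "eventA", "to": "taskA"}]
graph_a = {"nodes": ["eventA", "taskA"], "edges": edges}
graph_b = {"nodes": ["taskA", "eventA"], "edges": edges}

# The serializer alone treats the two arrays as different data...
raw_differs = hash_process(graph_a) != hash_process(graph_b)
# ...while compiler output hashes identically, with no special case in the serializer.
compiled_same = hash_process(compile_process(graph_a)) == hash_process(compile_process(graph_b))
```

A plain sort by key is only one possible compiler policy; if the first node has to be a trigger, the compiler would need an ordering that respects that.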

NicolasMahe avatar Mar 09 '20 04:03 NicolasMahe