pm4py-core draft: Perf: optimize petri net replay

draft: Perf: optimize petri net replay

Open EduardoGoulart1 opened this issue 1 year ago • 3 comments

This was worked on during the Celonis Impact Day on the 29th of September 2023

This PR suggests a few changes to the Petri net semantics class to optimize the Petri net replay. The changes go from less to more radical, so it's up for discussion about which changes should be accepted. The optimizations is guided by the benchmarks. Given a Petri net (the 5 added), it measures the total runtime to compute the reachable markings of the Petri net. The results can be found below (runtime in seconds):

Model	# States	Initial Time	Decorators	Slots	Faster Hashing	Cache The ID	Iterate and Unpack Items	Native Copy	Marking as Dictionary	Compiled Semantics	Use Place IDs
M1	144	0.0076	0.0077	0.0072	0.0070	0.0065	0.0061	0.0054	0.0042	0.0019	0.0015
M2	230	0.013	0.013	0.012	0.012	0.011	0.010	0.0090	0.0067	0.0038	0.0028
M7	36740	4.69	4.83	4.56	4.24	3.83	3.48	2.99	2.18	1.30	0.94
ML2	82939	16.63	16.75	15.91	14.87	13.46	12.79	12.06	10.33	2.45	1.83
ML4	1154	0.076	0.076	0.074	0.069	0.063	0.059	0.051	0.036	0.023	0.018

If you ask me, only slots are not worth the trouble, because it will potentially break existing code (if the users have been misusing the class). The change from counter to dict would also require some rework before deploying. In the end, we would end up reimplementing the Counter Python class, but that is alright because the counter class is just very slow. The usage of place IDs instead of places for the markings could be hidden behind some nice abstractions (example: a "CompiledMarking" class)

Attention Do not merge the PR with the Petri nets (and the benchmark script)

Oct 04 '23 07:10 EduardoGoulart1

pm4py-core pm4py-core copied to clipboard

draft: Perf: optimize petri net replay

pm4py-core
pm4py-core copied to clipboard