Fine grained CPU profiling
There have been earlier efforts to solve this, such as https://github.com/paritytech/polkadot/pull/4871, but the results did not provide enough insight because of the low sampling rate and the storage design (https://pyroscope.io/docs/storage-design/). We should continue the effort and implement something that works better for our use case. We need CPU profiling that is as fine-grained as possible (100us), with visualisation tooling, to increase accuracy and narrow the scope of debugging when dealing with node performance issues or optimization work.
The solution should also work easily with Zombienet, so that we can test for performance regressions in the CI pipeline.
Just to clarify, the profiler frequency used in pyroscope is configurable. The problem is that it accumulates the profiling info into 10s segments, and that segment length is hardcoded in many places in their server (see https://github.com/pyroscope-io/pyroscope/issues/901). As a result, the critical section we want to profile gets lost in the noise of other tasks such as networking.
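To make the aggregation problem concrete, here is a rough sketch with made-up numbers. It assumes a 5ms critical section somewhere inside a 10s segment that is otherwise spent in networking, sampled every 100us. The labels, the start offset and the 5ms duration are hypothetical and are not taken from any real trace or from pyroscope's code.

```python
from collections import Counter

SAMPLE_PERIOD_US = 100          # sampling interval we are asking for (100us)
SEGMENT_US = 10_000_000         # 10s aggregation segment
CRITICAL_START_US = 4_000_000   # hypothetical start of the critical section
CRITICAL_LEN_US = 5_000         # hypothetical duration: 5ms


def label_at(t_us):
    """Return what the (simulated) node is doing at time t_us."""
    if CRITICAL_START_US <= t_us < CRITICAL_START_US + CRITICAL_LEN_US:
        return "critical_section"
    return "networking"


def aggregate(start_us, end_us):
    """Count samples per label in [start_us, end_us), one sample per period."""
    return Counter(label_at(t) for t in range(start_us, end_us, SAMPLE_PERIOD_US))


def share(counts, label):
    """Fraction of all samples in `counts` that carry `label`."""
    return counts[label] / sum(counts.values())


# One 10s segment, as a server with fixed 10s buckets would store it.
coarse = aggregate(0, SEGMENT_US)
# A 10ms window that starts where the critical section starts.
fine = aggregate(CRITICAL_START_US, CRITICAL_START_US + 10_000)

print(f"10s segment: {share(coarse, 'critical_section'):.2%} critical_section")
print(f"10ms window: {share(fine, 'critical_section'):.2%} critical_section")
```

With these assumed numbers the critical section is 50 out of 100,000 samples (0.05%) in the 10s segment, but 50 out of 100 samples (50%) in the 10ms window. The sampling rate is the same in both cases; only the aggregation window changes.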
Also worth mentioning that we don't need to run it on every node. One validator and one collator per parachain would be fine.
We could try another tool in the same category of continuous profilers or try to fork their server.