substrate icon indicating copy to clipboard operation
substrate copied to clipboard

Warn validators with slow hardware

Open ggwpez opened this issue 3 years ago • 8 comments

Upon starting a node we currently print some brief hardware metrics.
This could be extended to include a warning in case the node has bad performance and was started with --validator.

The logic for comparing results and having a baseline is currently in utils/frame/benchmarking-cli, some of this would probably move to sc-sysinfo. sc-sysinfo would profit from this since it can use Throughput types (and other) instead of just u64 in some places.

ggwpez avatar Aug 12 '22 14:08 ggwpez

@ordian Maybe this can be helpful in view of the disputes slashing roll-out as well?

dvdplm avatar Aug 17 '22 07:08 dvdplm

We already got the host performance check in Polkadot for this: https://github.com/paritytech/polkadot/blob/master/cli/src/host_perf_check.rs

bkchr avatar Aug 17 '22 08:08 bkchr

We already got the host performance check in Polkadot for this …

Do you mean that we should not add it to Polkadot then, but still to Substrate? Or not at all?
I keep seeing people in the Kusama 1KV channel actually using the benchmark machine command, so they seem to like it.

ggwpez avatar Aug 17 '22 09:08 ggwpez

We can add it to Substrate, if it is done in a generic way. Aka chain implementors like Polkadot provide their own expected numbers that are treated as "good".

My comment was more an answer to @dvdplm.

But if we have your stuff, we can deprecate the host perf check.

bkchr avatar Aug 17 '22 11:08 bkchr

@ggwpez I am gonna pick this up :)

Szegoo avatar Sep 27 '22 15:09 Szegoo

@ggwpez You mentioned in https://github.com/paritytech/substrate/pull/12368 that we could move Metric to sc-sysinfo also, I will open a small PR for that if you didn't change your mind.

Szegoo avatar Oct 15 '22 12:10 Szegoo

Please try to do it in one. I dont want to increase the review overhead for such a rather simple issue @Szegoo

ggwpez avatar Oct 15 '22 16:10 ggwpez

@ggwpez Ok, I will close this issue and move Metric in the same PR.

Szegoo avatar Oct 15 '22 16:10 Szegoo

@ggwpez Should we deprecate the host perf check in polkadot then and have it in substrate implemented in a generic way?

Szegoo avatar Nov 05 '22 19:11 Szegoo

I have to look into the PVF check command again and chat with who created it. For now it is still fine to have both IMO.

ggwpez avatar Nov 05 '22 22:11 ggwpez