Benchmark `System::set_code`
Currently we hard-code weights for `System::set_code` and `System::set_code_without_checks`:

```rust
#[pallet::call_index(2)]
#[pallet::weight((T::BlockWeights::get().max_block, DispatchClass::Operational))]
pub fn set_code(origin: OriginFor<T>, code: Vec<u8>) -> DispatchResultWithPostInfo {
```
This is inflexible and effectively makes runtime upgrades un-schedulable, since the call is weighted as a full block. We should benchmark these calls instead.
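For reference, a minimal sketch of what a benchmark for this call could look like, written as if it lived in `frame/system/src/benchmarking.rs` and using the v1 `benchmarks!` macro. The wasm blob path is a placeholder; picking a suitable blob is exactly the open question of this issue.

```rust
// Sketch only; `Config`, `Event`, `Call` and `Pallet` are frame_system's own items.
use frame_benchmarking::benchmarks;
use frame_system::{Call, Config, Event, Pallet, RawOrigin};

benchmarks! {
    set_code {
        // Placeholder: a valid (compressed) runtime blob whose spec_name matches
        // the benchmarking runtime and whose spec_version is higher.
        let code = include_bytes!("placeholder_runtime.compact.compressed.wasm").to_vec();
    }: _(RawOrigin::Root, code)
    verify {
        // `set_code` deposits `CodeUpdated` once the new code has been accepted.
        Pallet::<T>::assert_last_event(Event::<T>::CodeUpdated.into());
    }
}
```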
cc @gavofyork
@shawntabrizi said that he had issues back in the days when he tried it, but he didn't really remember them.
Can I try it? So we need a codesize-dependent benchmark here?
> Can I try it? So we need a codesize-dependent benchmark here?
Yes.
@shawntabrizi left this comment. I doubt that I can set up a benchmark if even Shawn failed.
But the comment is from 04/2020, maybe something changed and it's easier now.
You should try some different code sizes of runtimes, but it should not change the behavior of the benchmark. We don't iterate the wasm code or anything like that, so the size should not change the time it takes to execute the function. For sure there is some impact, but I don't think it is measurable or really changes the numbers.
> it is pretty hard to come up with a real Wasm runtime to test the upgrade with
The `set_code` parses the version and then puts it into storage. Can you not mock that and use a byte-block with a prepended `RuntimeVersion`?
> The `set_code` parses the version and then puts it into storage.
It is not doing that :P It is reading the version to compare it against the current version. The `LastRuntimeUpgrade` version is changed when the new runtime runs the first time, so not really important for this function. In the past we still needed to instantiate the runtime to fetch the version, but now we put the runtime version into a section of the wasm binary and reading this should be almost constant time (the only thing here being different is the decompression time of the wasm binary).
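For illustration, this is roughly what that version lookup amounts to from the runtime's point of view (a sketch; the helper name is made up, `sp_io::misc::runtime_version` returns the SCALE-encoded version extracted from the blob):

```rust
use codec::Decode;
use sp_version::RuntimeVersion;

/// Read the `RuntimeVersion` of a candidate wasm blob via the host.
/// Returns `None` if no version could be extracted.
fn embedded_version(code: &[u8]) -> Option<RuntimeVersion> {
    sp_io::misc::runtime_version(code)
        .and_then(|encoded| RuntimeVersion::decode(&mut &encoded[..]).ok())
}
```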
And said decompression time, is it dependent on the content? So if I use a vec with constant data, does it have a different runtime characteristic than a vec with random data?
I would propose you just collect some runtimes. We have 2 runtimes in Substrate itself, 4 in Polkadot and even more in Cumulus. If you compile all of them and use them for the benchmarking you are probably good. Maybe you could also use Acala and/or Moonbeam, as I remember that they had relatively big runtimes.
> So if I use a vec with constant data, does it have a different runtime characteristic than a vec with random data?
Probably, but I'm no compression expert. :D
I don't understand. I should compile and run the benchmarks on my system and just add my determined weight-fn for set_code?
Or should I collect the runtimes somewhere (commit the blobs to git 🤔) and let the benchmarks run on CI hardware like a normal benchmark?
Is it really possible to create a good linear regression model with these few datapoints?
> Or should I collect the runtimes somewhere (commit the blobs to git 🤔) and let the benchmarks run on CI hardware like a normal benchmark?
This one.
> Is it really possible to create a good linear regression model with these few datapoints?
I meant this as some sort of starting point to get a "feeling" on how the size of the data influences the weight. Later we can still think of adding some fake data to the wasm file or something like that to increase the size.
Ok, what would be a good place in the repo for the runtimes?
How about including them in the binary with `include_bytes!` when compiling with the `runtime-benchmarks` feature?
> How about including them in the binary with `include_bytes!` when compiling with the `runtime-benchmarks` feature?
Sounds good.
> Ok, what would be a good place in the repo for the runtimes?
Somewhere in the frame-system crate, but we first should find out if we really need them by doing some experiments. We can still come up with a good place for them later.
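A sketch of how the gated blobs could look (file names and the constant are made up for illustration):

```rust
// Prebuilt, compressed runtime blobs committed next to the crate; they are only
// compiled into the binary when the benchmarking feature is enabled.
#[cfg(feature = "runtime-benchmarks")]
pub const RUNTIMES: &[&[u8]] = &[
    include_bytes!("../res/westend_runtime.compact.compressed.wasm"),
    include_bytes!("../res/polkadot_runtime.compact.compressed.wasm"),
    include_bytes!("../res/kusama_runtime.compact.compressed.wasm"),
    include_bytes!("../res/kitchensink_runtime.compact.compressed.wasm"),
];
```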
I ran into a problem. When I want to use a prebuilt binary for the benchmark, I get the error:
`Input("Benchmark frame_system::set_code failed: InvalidSpecName")`
This error originates from `Pallet::can_set_code`, where the `spec_name` and `spec_version` are checked.
Is there a tool to manipulate these values in the binary files? If not, would it make sense to develop such a tool?
If you just test them one-by-one, wouldn't it be enough to change your runtime version to match it in the mock.rs? https://github.com/paritytech/substrate/blob/3e71d606b7d1e91d9c1701c0a443530eefca1a39/frame/system/src/mock.rs#L49
Or disable the check when the `runtime-benchmarks` feature flag is set.
> If you just test them one-by-one, wouldn't it be enough to change your runtime version to match it in the mock.rs? https://github.com/paritytech/substrate/blob/3e71d606b7d1e91d9c1701c0a443530eefca1a39/frame/system/src/mock.rs#L49
In the tests it seems to be `system-test`, but when I run the benchmarks from the substrate binary, the `impl_name` is `substrate-node`.
> Or disable the check when the `runtime-benchmarks` feature flag is set.
Which slightly skews the benchmark because of the skipped code.
If you run the benchmarks in the node you would have to change it in the node runtime then: https://github.com/paritytech/substrate/blob/2dbf62591fd35725c6e39f8ddfd27c3695dab594/bin/node/runtime/src/lib.rs#L128
Is it possible to change that at runtime? Isn't it a compiled-in constant?
> Is it possible to change that at runtime? Isn't it a compiled-in constant?
Why do you need to change it in the runtime? I thought you were doing one-off testing with different WASM blobs.
> Is it possible to change that at runtime? Isn't it a compiled-in constant?
> Why do you need to change it in the runtime? I thought you were doing one-off testing with different WASM blobs.
Ah, I misunderstood you. You meant I should change it for crafting my test wasm runtimes?
One issue is that the names differ depending on whether you run the benchmarks or the benchmark tests, so one of them will fail.
I think the "slowest part" of `set_code` is calling into the runtime to get the runtime version of the new wasm blob. As long as we do this, we can probably disable the checks when the `runtime-benchmarks` feature is enabled.
> I think the "slowest part" of `set_code` is calling into the runtime to get the runtime version of the new wasm blob. As long as we do this, we can probably disable the checks when the `runtime-benchmarks` feature is enabled.
But isn't this a possible footgun? If someone compiles production code with the `runtime-benchmarks` feature, they accidentally disable these validity checks.
This check is just there to prevent human error; it is not a crucial check. People can also use `set_storage` or `set_code_without_checks` to circumvent these checks.
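A sketch of what that feature-gating could look like inside frame_system's `can_set_code` (simplified; the error variants match the ones the earlier error message pointed at, but this is not a verbatim copy of the upstream code):

```rust
impl<T: Config> Pallet<T> {
    /// Simplified rendering of `can_set_code` with the sanity checks compiled out
    /// for benchmark builds, so arbitrary prebuilt blobs can be passed to `set_code`.
    pub fn can_set_code(code: &[u8]) -> Result<(), sp_runtime::DispatchError> {
        #[cfg(not(feature = "runtime-benchmarks"))]
        {
            use codec::Decode;

            let current_version = T::Version::get();
            let new_version = sp_io::misc::runtime_version(code)
                .and_then(|v| sp_version::RuntimeVersion::decode(&mut &v[..]).ok())
                .ok_or(Error::<T>::FailedToExtractRuntimeVersion)?;

            if new_version.spec_name != current_version.spec_name {
                return Err(Error::<T>::InvalidSpecName.into());
            }
            if new_version.spec_version <= current_version.spec_version {
                return Err(Error::<T>::SpecVersionNeedsToIncrease.into());
            }
        }

        // The parameter is unused when the checks are compiled out.
        let _ = code;
        Ok(())
    }
}
```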
Now I have an array of runtime blobs which I iterate like this in the benchmark setup fn:
```rust
let i in 0 .. (RUNTIMES.len()-1) as u32;
```
But `i` is not the correct complexity parameter for the weight function; it's `RUNTIMES[i].len()`.
Is it possible to iterate and remap `i`?
The first manually evaluated results (on my weak notebook):
Repeat: 20, Steps: 50

| runtime | size (KB) | mean (µs) | sigma (µs) | sigma/mean (%) |
|---|---|---|---|---|
| kitchensink | 4152 | 65970 | 556.5 | 0.8% |
| kusama | 1376 | 43110 | 155.1 | 0.3% |
| polkadot | 1232 | 38140 | 528.7 | 1.3% |
| westend | 1132 | 35970 | 55.72 | 0.1% |
| rococo | 1120 | 37060 | 74.91 | 0.2% |
| moonbeam | 1252 | 43510 | 66.85 | 0.1% |
| moonriver | 1304 | 45930 | 61.84 | 0.1% |
| moonbase | 1332 | 46950 | 35.26 | 0.0% |

Okay, thanks for the results. 65 ms is really fast compared to the 1.5 s that we currently estimate.
If the difference between a small and a large runtime is just 40%, then we can probably use the large one and don't benchmark with a component.
It is also very fortunate that the kitchensink is the largest, so we don't need to pull in external deps.
> But `i` is not the correct complexity parameter for the weight function; it's `RUNTIMES[i].len()`.
Hm, no it's not, to my knowledge… but you could map `i` to the runtime with the closest size.
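Something like this could express that mapping (a sketch only; the bounds and the `RUNTIMES` constant are the hypothetical ones from above, and the imports are as in the earlier sketch):

```rust
benchmarks! {
    set_code {
        // Interpret the component as a target code size in KB and benchmark with
        // the prebuilt blob whose actual size is closest to it.
        let s in 1_000 .. 4_200;
        let code = RUNTIMES
            .iter()
            .min_by_key(|blob| (blob.len() as i64 / 1024 - s as i64).unsigned_abs())
            .expect("at least one runtime blob is included")
            .to_vec();
    }: _(RawOrigin::Root, code)
}
```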
So I'd change my bench to just use the kitchensink runtime and we assume that's the maximum size?
BTW: Wouldn't it be nice to have more control over the components than just using a range of values? How about adding an additional iterator interface to manually define sample points? Or adding a mapping function to iterate and remap sample points?
So if I only benchmark the kitchensink, I could also take the wasm binary from the target dir and we don't have to include an extra runtime just for testing.
What's the canonical way to find the filesystem path for the runtime? Is there some function to query it, or should I use cargo environment variables?
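One possibility (assuming the runtime crate can be pulled in as a benchmark-only dependency; the crate name here is illustrative) is to avoid the filesystem entirely, since wasm-builder already exposes the compiled blob as a constant:

```rust
#[cfg(feature = "runtime-benchmarks")]
fn benchmark_code() -> Vec<u8> {
    // `WASM_BINARY` is `Option<&[u8]>`: it is `None` when the runtime crate was
    // built with the wasm build skipped (e.g. `SKIP_WASM_BUILD=1`).
    kitchensink_runtime::WASM_BINARY
        .expect("runtime wasm must be built for benchmarking")
        .to_vec()
}
```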