wasmtime
Stabilize perf results
/bench_x64
/bench_x64
Change factor shows patch effect on x64 if merged compared to current head for main.
Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / Main_CT. A negative change factor means clockticks are expected to be reduced by the patch.
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Compilation | 0.001 |
| benchmarks/blake3-simd | x86_64 | Compilation | -0.008 |
| benchmarks/bz2 | x86_64 | Compilation | -0.006 |
| benchmarks/intgemm-simd | x86_64 | Compilation | 0.004 |
| benchmarks/meshoptimizer | x86_64 | Compilation | 0.008 |
| benchmarks/noop | x86_64 | Compilation | -0.009 |
| benchmarks/pulldown-cmark | x86_64 | Compilation | -0.009 |
| benchmarks/shootout-ackermann | x86_64 | Compilation | 0.007 |
| benchmarks/shootout-base64 | x86_64 | Compilation | 0.022 |
| benchmarks/shootout-ctype | x86_64 | Compilation | -0.006 |
| benchmarks/shootout-ed25519 | x86_64 | Compilation | -0.001 |
| benchmarks/shootout-fib2 | x86_64 | Compilation | 0.018 |
| benchmarks/shootout-gimli | x86_64 | Compilation | 0.002 |
| benchmarks/shootout-heapsort | x86_64 | Compilation | 0.013 |
| benchmarks/shootout-keccak | x86_64 | Compilation | 0.002 |
| benchmarks/shootout-matrix | x86_64 | Compilation | -0.007 |
| benchmarks/shootout-memmove | x86_64 | Compilation | -0.027 |
| benchmarks/shootout-minicsv | x86_64 | Compilation | -0.011 |
| benchmarks/shootout-nestedloop | x86_64 | Compilation | -0.043 |
| benchmarks/shootout-random | x86_64 | Compilation | 0.003 |
| benchmarks/shootout-ratelimit | x86_64 | Compilation | 0.013 |
| benchmarks/shootout-seqhash | x86_64 | Compilation | -0.004 |
| benchmarks/shootout-sieve | x86_64 | Compilation | 0.011 |
| benchmarks/shootout-switch | x86_64 | Compilation | -0.006 |
| benchmarks/shootout-xblabla20 | x86_64 | Compilation | 0.051 |
| benchmarks/shootout-xchacha20 | x86_64 | Compilation | 0.018 |
| benchmarks/spidermonkey | x86_64 | Compilation | -0.002 |
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Instantiation | 0.013 |
| benchmarks/blake3-simd | x86_64 | Instantiation | 0.035 |
| benchmarks/bz2 | x86_64 | Instantiation | 0.012 |
| benchmarks/intgemm-simd | x86_64 | Instantiation | -0.065 |
| benchmarks/meshoptimizer | x86_64 | Instantiation | 0.017 |
| benchmarks/noop | x86_64 | Instantiation | 0.020 |
| benchmarks/pulldown-cmark | x86_64 | Instantiation | 0.062 |
| benchmarks/shootout-ackermann | x86_64 | Instantiation | 0.022 |
| benchmarks/shootout-base64 | x86_64 | Instantiation | -0.003 |
| benchmarks/shootout-ctype | x86_64 | Instantiation | -0.057 |
| benchmarks/shootout-ed25519 | x86_64 | Instantiation | 0.006 |
| benchmarks/shootout-fib2 | x86_64 | Instantiation | 0.030 |
| benchmarks/shootout-gimli | x86_64 | Instantiation | 0.001 |
| benchmarks/shootout-heapsort | x86_64 | Instantiation | 0.021 |
| benchmarks/shootout-keccak | x86_64 | Instantiation | -0.034 |
| benchmarks/shootout-matrix | x86_64 | Instantiation | -0.083 |
| benchmarks/shootout-memmove | x86_64 | Instantiation | 0.036 |
| benchmarks/shootout-minicsv | x86_64 | Instantiation | -0.019 |
| benchmarks/shootout-nestedloop | x86_64 | Instantiation | -0.013 |
| benchmarks/shootout-random | x86_64 | Instantiation | 0.007 |
| benchmarks/shootout-ratelimit | x86_64 | Instantiation | 0.042 |
| benchmarks/shootout-seqhash | x86_64 | Instantiation | 0.001 |
| benchmarks/shootout-sieve | x86_64 | Instantiation | -0.021 |
| benchmarks/shootout-switch | x86_64 | Instantiation | -0.040 |
| benchmarks/shootout-xblabla20 | x86_64 | Instantiation | 0.025 |
| benchmarks/shootout-xchacha20 | x86_64 | Instantiation | 0.023 |
| benchmarks/spidermonkey | x86_64 | Instantiation | -0.002 |
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Execution | 0.003 |
| benchmarks/blake3-simd | x86_64 | Execution | -0.014 |
| benchmarks/bz2 | x86_64 | Execution | -0.003 |
| benchmarks/intgemm-simd | x86_64 | Execution | -0.000 |
| benchmarks/meshoptimizer | x86_64 | Execution | -0.001 |
| benchmarks/noop | x86_64 | Execution | 0.008 |
| benchmarks/pulldown-cmark | x86_64 | Execution | -0.002 |
| benchmarks/shootout-ackermann | x86_64 | Execution | 0.110 |
| benchmarks/shootout-base64 | x86_64 | Execution | 0.001 |
| benchmarks/shootout-ctype | x86_64 | Execution | -0.001 |
| benchmarks/shootout-ed25519 | x86_64 | Execution | 0.003 |
| benchmarks/shootout-fib2 | x86_64 | Execution | 0.000 |
| benchmarks/shootout-gimli | x86_64 | Execution | -0.014 |
| benchmarks/shootout-heapsort | x86_64 | Execution | 0.000 |
| benchmarks/shootout-keccak | x86_64 | Execution | -0.001 |
| benchmarks/shootout-matrix | x86_64 | Execution | 0.001 |
| benchmarks/shootout-memmove | x86_64 | Execution | 0.001 |
| benchmarks/shootout-minicsv | x86_64 | Execution | 0.000 |
| benchmarks/shootout-nestedloop | x86_64 | Execution | -0.003 |
| benchmarks/shootout-random | x86_64 | Execution | 0.001 |
| benchmarks/shootout-ratelimit | x86_64 | Execution | 0.003 |
| benchmarks/shootout-seqhash | x86_64 | Execution | -0.014 |
| benchmarks/shootout-sieve | x86_64 | Execution | 0.000 |
| benchmarks/shootout-switch | x86_64 | Execution | -0.000 |
| benchmarks/shootout-xblabla20 | x86_64 | Execution | 0.037 |
| benchmarks/shootout-xchacha20 | x86_64 | Execution | -0.015 |
| benchmarks/spidermonkey | x86_64 | Execution | 0.000 |
Averages (x64):
| phase | change_factor |
|---|---|
| Compilation | 0.001 |
| Execution | 0.004 |
| Instantiation | 0.001 |
/bench_x64
/bench_x64
/bench_x64
Two ideas for bounding stability:
- We might want to exclude instantiation time altogether from these runs. I'd prefer not to, from first principles, but the instantiation results seem to have significantly more variance than the other categories. I suspect this is because instantiation is (usually) so much faster than compilation or execution. It may just be that the platform is not noise-free enough to measure instantiation accurately, and we'll need to benchmark it locally when working to improve it. Curious what others think, though (@fitzgen, @alexcrichton, @abrown?).
- Could we run a "no-change test" as a control on every run? Basically, run the baseline twice, and show (i) the delta between the two baselines, and (ii) the delta between the baseline (either one) and the PR's change. In a perfect world, we expect to see zero change in the control (the baseline-to-baseline comparison) and whatever the actual change is in the diff run. If we see similar swings in both, then we can conclude it's more likely noise. Thoughts?
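The control idea above can be sketched roughly as follows. This is only an illustration, not how the bot or sightglass actually works: the function names, the sample clocktick counts, and the decision rule (treat the patch delta as likely noise when it is no larger than the baseline-to-baseline delta) are all assumptions made for the example.

```python
def change_factor(new_ct: float, base_ct: float) -> float:
    """(new - base) / base, the same definition the bot reports."""
    if base_ct <= 0:
        raise ValueError("baseline clockticks must be positive")
    return (new_ct - base_ct) / base_ct


def classify(baseline_a: float, baseline_b: float, patched: float) -> dict:
    """Compare the control delta (baseline vs. baseline) with the patch delta."""
    control = change_factor(baseline_b, baseline_a)
    patch = change_factor(patched, baseline_a)
    # Hypothetical rule: a patch swing no larger than the control swing
    # cannot be distinguished from noise.
    likely_noise = abs(patch) <= abs(control)
    return {"control": control, "patch": patch, "likely_noise": likely_noise}


# Made-up clocktick counts, purely for illustration.
result = classify(baseline_a=1000.0, baseline_b=1030.0, patched=1020.0)
print(result)
```

A single pair of baseline runs gives only one sample of the noise, so a real implementation would presumably want several runs and a proper statistical test rather than this simple comparison.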
Change factor shows patch effect on x64 if merged compared to current head for main.
Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / Main_CT. A negative change factor means clockticks are expected to be reduced by the patch.
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Compilation | 0.002 |
| benchmarks/blake3-simd | x86_64 | Compilation | 0.002 |
| benchmarks/bz2 | x86_64 | Compilation | 0.001 |
| benchmarks/intgemm-simd | x86_64 | Compilation | 0.002 |
| benchmarks/meshoptimizer | x86_64 | Compilation | 0.002 |
| benchmarks/noop | x86_64 | Compilation | -0.013 |
| benchmarks/pulldown-cmark | x86_64 | Compilation | 0.006 |
| benchmarks/shootout-ackermann | x86_64 | Compilation | 0.002 |
| benchmarks/shootout-base64 | x86_64 | Compilation | 0.000 |
| benchmarks/shootout-ctype | x86_64 | Compilation | -0.001 |
| benchmarks/shootout-ed25519 | x86_64 | Compilation | -0.004 |
| benchmarks/shootout-fib2 | x86_64 | Compilation | 0.004 |
| benchmarks/shootout-gimli | x86_64 | Compilation | -0.000 |
| benchmarks/shootout-heapsort | x86_64 | Compilation | 0.016 |
| benchmarks/shootout-keccak | x86_64 | Compilation | -0.001 |
| benchmarks/shootout-matrix | x86_64 | Compilation | -0.010 |
| benchmarks/shootout-memmove | x86_64 | Compilation | 0.007 |
| benchmarks/shootout-minicsv | x86_64 | Compilation | -0.005 |
| benchmarks/shootout-nestedloop | x86_64 | Compilation | 0.006 |
| benchmarks/shootout-random | x86_64 | Compilation | -0.000 |
| benchmarks/shootout-ratelimit | x86_64 | Compilation | 0.002 |
| benchmarks/shootout-seqhash | x86_64 | Compilation | 0.023 |
| benchmarks/shootout-sieve | x86_64 | Compilation | 0.005 |
| benchmarks/shootout-switch | x86_64 | Compilation | 0.005 |
| benchmarks/shootout-xblabla20 | x86_64 | Compilation | 0.006 |
| benchmarks/shootout-xchacha20 | x86_64 | Compilation | -0.015 |
| benchmarks/spidermonkey | x86_64 | Compilation | -0.003 |
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Instantiation | 0.002 |
| benchmarks/blake3-simd | x86_64 | Instantiation | -0.008 |
| benchmarks/bz2 | x86_64 | Instantiation | -0.026 |
| benchmarks/intgemm-simd | x86_64 | Instantiation | -0.033 |
| benchmarks/meshoptimizer | x86_64 | Instantiation | -0.002 |
| benchmarks/noop | x86_64 | Instantiation | -0.014 |
| benchmarks/pulldown-cmark | x86_64 | Instantiation | 0.011 |
| benchmarks/shootout-ackermann | x86_64 | Instantiation | 0.017 |
| benchmarks/shootout-base64 | x86_64 | Instantiation | -0.015 |
| benchmarks/shootout-ctype | x86_64 | Instantiation | -0.011 |
| benchmarks/shootout-ed25519 | x86_64 | Instantiation | -0.017 |
| benchmarks/shootout-fib2 | x86_64 | Instantiation | 0.085 |
| benchmarks/shootout-gimli | x86_64 | Instantiation | -0.024 |
| benchmarks/shootout-heapsort | x86_64 | Instantiation | 0.007 |
| benchmarks/shootout-keccak | x86_64 | Instantiation | 0.029 |
| benchmarks/shootout-matrix | x86_64 | Instantiation | 0.010 |
| benchmarks/shootout-memmove | x86_64 | Instantiation | 0.071 |
| benchmarks/shootout-minicsv | x86_64 | Instantiation | 0.044 |
| benchmarks/shootout-nestedloop | x86_64 | Instantiation | -0.017 |
| benchmarks/shootout-random | x86_64 | Instantiation | 0.015 |
| benchmarks/shootout-ratelimit | x86_64 | Instantiation | 0.022 |
| benchmarks/shootout-seqhash | x86_64 | Instantiation | -0.036 |
| benchmarks/shootout-sieve | x86_64 | Instantiation | 0.030 |
| benchmarks/shootout-switch | x86_64 | Instantiation | -0.005 |
| benchmarks/shootout-xblabla20 | x86_64 | Instantiation | -0.024 |
| benchmarks/shootout-xchacha20 | x86_64 | Instantiation | 0.035 |
| benchmarks/spidermonkey | x86_64 | Instantiation | 0.059 |
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Execution | -0.004 |
| benchmarks/blake3-simd | x86_64 | Execution | -0.023 |
| benchmarks/bz2 | x86_64 | Execution | 0.014 |
| benchmarks/intgemm-simd | x86_64 | Execution | 0.001 |
| benchmarks/meshoptimizer | x86_64 | Execution | -0.000 |
| benchmarks/noop | x86_64 | Execution | 0.049 |
| benchmarks/pulldown-cmark | x86_64 | Execution | 0.005 |
| benchmarks/shootout-ackermann | x86_64 | Execution | 0.026 |
| benchmarks/shootout-base64 | x86_64 | Execution | -0.001 |
| benchmarks/shootout-ctype | x86_64 | Execution | -0.001 |
| benchmarks/shootout-ed25519 | x86_64 | Execution | 0.002 |
| benchmarks/shootout-fib2 | x86_64 | Execution | -0.000 |
| benchmarks/shootout-gimli | x86_64 | Execution | -0.003 |
| benchmarks/shootout-heapsort | x86_64 | Execution | 0.000 |
| benchmarks/shootout-keccak | x86_64 | Execution | 0.001 |
| benchmarks/shootout-matrix | x86_64 | Execution | -0.004 |
| benchmarks/shootout-memmove | x86_64 | Execution | -0.000 |
| benchmarks/shootout-minicsv | x86_64 | Execution | -0.000 |
| benchmarks/shootout-nestedloop | x86_64 | Execution | 0.005 |
| benchmarks/shootout-random | x86_64 | Execution | -0.001 |
| benchmarks/shootout-ratelimit | x86_64 | Execution | -0.009 |
| benchmarks/shootout-seqhash | x86_64 | Execution | -0.003 |
| benchmarks/shootout-sieve | x86_64 | Execution | 0.000 |
| benchmarks/shootout-switch | x86_64 | Execution | 0.000 |
| benchmarks/shootout-xblabla20 | x86_64 | Execution | 0.003 |
| benchmarks/shootout-xchacha20 | x86_64 | Execution | -0.007 |
| benchmarks/spidermonkey | x86_64 | Execution | -0.003 |
Averages (x64):
| phase | change_factor |
|---|---|
| Compilation | 0.001 |
| Execution | 0.002 |
| Instantiation | 0.008 |
In my experience, even with dedicated hardware I've always had a lot of noise in time-based measurements. So for long-term regression testing, which is what this is intended for, would it be possible to measure instructions retired instead of wall time (which I think clock cycles are more or less equivalent to)? That's what rust-lang/rust uses by default, and instruction counts are typically quite stable (although still not 100%).
Also, as a minor thing, would it be possible to print the changes as %-based changes instead of factor-based changes?
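For reference, the conversion between the two presentations is small. A minimal sketch, assuming the change factor keeps the definition the bot reports; the helper name `as_percent` is made up for this example and is not part of the bot or sightglass:

```python
def as_percent(change_factor: float, digits: int = 1) -> str:
    """Format a change factor (e.g. 0.051) as a signed percentage string."""
    return f"{change_factor * 100:+.{digits}f}%"


# Change factors taken from the tables in this thread.
for factor in (0.051, -0.043, 0.110):
    print(as_percent(factor))
```

The explicit sign keeps regressions (`+`) and improvements (`-`) easy to tell apart at a glance.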
Change factor shows patch effect on x64 if merged compared to current head for main.
Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / Main_CT. A negative change factor means clockticks are expected to be reduced by the patch.
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Compilation | -0.001 |
| benchmarks/blake3-simd | x86_64 | Compilation | 0.022 |
| benchmarks/bz2 | x86_64 | Compilation | -0.008 |
| benchmarks/intgemm-simd | x86_64 | Compilation | -0.005 |
| benchmarks/meshoptimizer | x86_64 | Compilation | 0.001 |
| benchmarks/noop | x86_64 | Compilation | 0.027 |
| benchmarks/pulldown-cmark | x86_64 | Compilation | -0.000 |
| benchmarks/shootout-ackermann | x86_64 | Compilation | 0.005 |
| benchmarks/shootout-base64 | x86_64 | Compilation | -0.000 |
| benchmarks/shootout-ctype | x86_64 | Compilation | 0.018 |
| benchmarks/shootout-ed25519 | x86_64 | Compilation | -0.007 |
| benchmarks/shootout-fib2 | x86_64 | Compilation | 0.006 |
| benchmarks/shootout-gimli | x86_64 | Compilation | -0.012 |
| benchmarks/shootout-heapsort | x86_64 | Compilation | 0.006 |
| benchmarks/shootout-keccak | x86_64 | Compilation | -0.000 |
| benchmarks/shootout-matrix | x86_64 | Compilation | -0.021 |
| benchmarks/shootout-memmove | x86_64 | Compilation | 0.012 |
| benchmarks/shootout-minicsv | x86_64 | Compilation | -0.028 |
| benchmarks/shootout-nestedloop | x86_64 | Compilation | -0.007 |
| benchmarks/shootout-random | x86_64 | Compilation | 0.007 |
| benchmarks/shootout-ratelimit | x86_64 | Compilation | 0.011 |
| benchmarks/shootout-seqhash | x86_64 | Compilation | -0.011 |
| benchmarks/shootout-sieve | x86_64 | Compilation | 0.005 |
| benchmarks/shootout-switch | x86_64 | Compilation | 0.001 |
| benchmarks/shootout-xblabla20 | x86_64 | Compilation | -0.005 |
| benchmarks/shootout-xchacha20 | x86_64 | Compilation | 0.004 |
| benchmarks/spidermonkey | x86_64 | Compilation | -0.005 |
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Instantiation | -0.014 |
| benchmarks/blake3-simd | x86_64 | Instantiation | 0.010 |
| benchmarks/bz2 | x86_64 | Instantiation | 0.038 |
| benchmarks/intgemm-simd | x86_64 | Instantiation | -0.035 |
| benchmarks/meshoptimizer | x86_64 | Instantiation | 0.017 |
| benchmarks/noop | x86_64 | Instantiation | -0.013 |
| benchmarks/pulldown-cmark | x86_64 | Instantiation | 0.039 |
| benchmarks/shootout-ackermann | x86_64 | Instantiation | -0.004 |
| benchmarks/shootout-base64 | x86_64 | Instantiation | -0.010 |
| benchmarks/shootout-ctype | x86_64 | Instantiation | 0.031 |
| benchmarks/shootout-ed25519 | x86_64 | Instantiation | 0.031 |
| benchmarks/shootout-fib2 | x86_64 | Instantiation | 0.028 |
| benchmarks/shootout-gimli | x86_64 | Instantiation | -0.102 |
| benchmarks/shootout-heapsort | x86_64 | Instantiation | -0.040 |
| benchmarks/shootout-keccak | x86_64 | Instantiation | 0.012 |
| benchmarks/shootout-matrix | x86_64 | Instantiation | 0.045 |
| benchmarks/shootout-memmove | x86_64 | Instantiation | -0.025 |
| benchmarks/shootout-minicsv | x86_64 | Instantiation | 0.085 |
| benchmarks/shootout-nestedloop | x86_64 | Instantiation | 0.042 |
| benchmarks/shootout-random | x86_64 | Instantiation | 0.031 |
| benchmarks/shootout-ratelimit | x86_64 | Instantiation | 0.037 |
| benchmarks/shootout-seqhash | x86_64 | Instantiation | 0.008 |
| benchmarks/shootout-sieve | x86_64 | Instantiation | 0.005 |
| benchmarks/shootout-switch | x86_64 | Instantiation | 0.050 |
| benchmarks/shootout-xblabla20 | x86_64 | Instantiation | -0.015 |
| benchmarks/shootout-xchacha20 | x86_64 | Instantiation | -0.020 |
| benchmarks/spidermonkey | x86_64 | Instantiation | -0.033 |
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Execution | -0.014 |
| benchmarks/blake3-simd | x86_64 | Execution | 0.006 |
| benchmarks/bz2 | x86_64 | Execution | 0.017 |
| benchmarks/intgemm-simd | x86_64 | Execution | 0.000 |
| benchmarks/meshoptimizer | x86_64 | Execution | 0.002 |
| benchmarks/noop | x86_64 | Execution | 0.067 |
| benchmarks/pulldown-cmark | x86_64 | Execution | 0.003 |
| benchmarks/shootout-ackermann | x86_64 | Execution | -0.191 |
| benchmarks/shootout-base64 | x86_64 | Execution | -0.002 |
| benchmarks/shootout-ctype | x86_64 | Execution | -0.000 |
| benchmarks/shootout-ed25519 | x86_64 | Execution | -0.001 |
| benchmarks/shootout-fib2 | x86_64 | Execution | 0.000 |
| benchmarks/shootout-gimli | x86_64 | Execution | -0.057 |
| benchmarks/shootout-heapsort | x86_64 | Execution | 0.000 |
| benchmarks/shootout-keccak | x86_64 | Execution | -0.001 |
| benchmarks/shootout-matrix | x86_64 | Execution | 0.000 |
| benchmarks/shootout-memmove | x86_64 | Execution | -0.000 |
| benchmarks/shootout-minicsv | x86_64 | Execution | 0.000 |
| benchmarks/shootout-nestedloop | x86_64 | Execution | -0.000 |
| benchmarks/shootout-random | x86_64 | Execution | 0.000 |
| benchmarks/shootout-ratelimit | x86_64 | Execution | 0.003 |
| benchmarks/shootout-seqhash | x86_64 | Execution | -0.000 |
| benchmarks/shootout-sieve | x86_64 | Execution | 0.001 |
| benchmarks/shootout-switch | x86_64 | Execution | -0.000 |
| benchmarks/shootout-xblabla20 | x86_64 | Execution | -0.009 |
| benchmarks/shootout-xchacha20 | x86_64 | Execution | -0.003 |
| benchmarks/spidermonkey | x86_64 | Execution | -0.000 |
Averages (x64):
| phase | change_factor |
|---|---|
| Compilation | 0.001 |
| Execution | -0.007 |
| Instantiation | 0.007 |
Change factor shows patch effect on x64 if merged compared to current head for main.
Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / Main_CT. A negative change factor means clockticks are expected to be reduced by the patch.
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Compilation | 0.012 |
| benchmarks/blake3-simd | x86_64 | Compilation | -0.011 |
| benchmarks/bz2 | x86_64 | Compilation | -0.008 |
| benchmarks/intgemm-simd | x86_64 | Compilation | 0.001 |
| benchmarks/meshoptimizer | x86_64 | Compilation | 0.008 |
| benchmarks/noop | x86_64 | Compilation | 0.004 |
| benchmarks/pulldown-cmark | x86_64 | Compilation | 0.022 |
| benchmarks/shootout-ackermann | x86_64 | Compilation | -0.029 |
| benchmarks/shootout-base64 | x86_64 | Compilation | -0.009 |
| benchmarks/shootout-ctype | x86_64 | Compilation | -0.001 |
| benchmarks/shootout-ed25519 | x86_64 | Compilation | -0.003 |
| benchmarks/shootout-fib2 | x86_64 | Compilation | 0.003 |
| benchmarks/shootout-gimli | x86_64 | Compilation | 0.039 |
| benchmarks/shootout-heapsort | x86_64 | Compilation | -0.031 |
| benchmarks/shootout-keccak | x86_64 | Compilation | 0.003 |
| benchmarks/shootout-matrix | x86_64 | Compilation | 0.009 |
| benchmarks/shootout-memmove | x86_64 | Compilation | -0.003 |
| benchmarks/shootout-minicsv | x86_64 | Compilation | -0.002 |
| benchmarks/shootout-nestedloop | x86_64 | Compilation | 0.000 |
| benchmarks/shootout-random | x86_64 | Compilation | 0.006 |
| benchmarks/shootout-ratelimit | x86_64 | Compilation | 0.001 |
| benchmarks/shootout-seqhash | x86_64 | Compilation | -0.006 |
| benchmarks/shootout-sieve | x86_64 | Compilation | -0.004 |
| benchmarks/shootout-switch | x86_64 | Compilation | 0.012 |
| benchmarks/shootout-xblabla20 | x86_64 | Compilation | 0.006 |
| benchmarks/shootout-xchacha20 | x86_64 | Compilation | -0.003 |
| benchmarks/spidermonkey | x86_64 | Compilation | -0.005 |
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Instantiation | 0.009 |
| benchmarks/blake3-simd | x86_64 | Instantiation | 0.070 |
| benchmarks/bz2 | x86_64 | Instantiation | 0.006 |
| benchmarks/intgemm-simd | x86_64 | Instantiation | 0.047 |
| benchmarks/meshoptimizer | x86_64 | Instantiation | 0.031 |
| benchmarks/noop | x86_64 | Instantiation | 0.030 |
| benchmarks/pulldown-cmark | x86_64 | Instantiation | -0.037 |
| benchmarks/shootout-ackermann | x86_64 | Instantiation | 0.010 |
| benchmarks/shootout-base64 | x86_64 | Instantiation | 0.035 |
| benchmarks/shootout-ctype | x86_64 | Instantiation | -0.015 |
| benchmarks/shootout-ed25519 | x86_64 | Instantiation | -0.025 |
| benchmarks/shootout-fib2 | x86_64 | Instantiation | -0.027 |
| benchmarks/shootout-gimli | x86_64 | Instantiation | 0.034 |
| benchmarks/shootout-heapsort | x86_64 | Instantiation | 0.044 |
| benchmarks/shootout-keccak | x86_64 | Instantiation | -0.023 |
| benchmarks/shootout-matrix | x86_64 | Instantiation | 0.036 |
| benchmarks/shootout-memmove | x86_64 | Instantiation | 0.043 |
| benchmarks/shootout-minicsv | x86_64 | Instantiation | -0.064 |
| benchmarks/shootout-nestedloop | x86_64 | Instantiation | -0.015 |
| benchmarks/shootout-random | x86_64 | Instantiation | -0.025 |
| benchmarks/shootout-ratelimit | x86_64 | Instantiation | 0.014 |
| benchmarks/shootout-seqhash | x86_64 | Instantiation | 0.012 |
| benchmarks/shootout-sieve | x86_64 | Instantiation | 0.025 |
| benchmarks/shootout-switch | x86_64 | Instantiation | -0.070 |
| benchmarks/shootout-xblabla20 | x86_64 | Instantiation | -0.021 |
| benchmarks/shootout-xchacha20 | x86_64 | Instantiation | -0.018 |
| benchmarks/spidermonkey | x86_64 | Instantiation | -0.059 |
| wasm | arch | phase | change_factor |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Execution | -0.023 |
| benchmarks/blake3-simd | x86_64 | Execution | 0.001 |
| benchmarks/bz2 | x86_64 | Execution | -0.004 |
| benchmarks/intgemm-simd | x86_64 | Execution | -0.002 |
| benchmarks/meshoptimizer | x86_64 | Execution | 0.001 |
| benchmarks/noop | x86_64 | Execution | 0.100 |
| benchmarks/pulldown-cmark | x86_64 | Execution | -0.003 |
| benchmarks/shootout-ackermann | x86_64 | Execution | -0.080 |
| benchmarks/shootout-base64 | x86_64 | Execution | -0.001 |
| benchmarks/shootout-ctype | x86_64 | Execution | -0.000 |
| benchmarks/shootout-ed25519 | x86_64 | Execution | -0.000 |
| benchmarks/shootout-fib2 | x86_64 | Execution | -0.000 |
| benchmarks/shootout-gimli | x86_64 | Execution | -0.015 |
| benchmarks/shootout-heapsort | x86_64 | Execution | -0.000 |
| benchmarks/shootout-keccak | x86_64 | Execution | -0.004 |
| benchmarks/shootout-matrix | x86_64 | Execution | -0.001 |
| benchmarks/shootout-memmove | x86_64 | Execution | 0.000 |
| benchmarks/shootout-minicsv | x86_64 | Execution | -0.001 |
| benchmarks/shootout-nestedloop | x86_64 | Execution | 0.003 |
| benchmarks/shootout-random | x86_64 | Execution | -0.000 |
| benchmarks/shootout-ratelimit | x86_64 | Execution | 0.003 |
| benchmarks/shootout-seqhash | x86_64 | Execution | 0.009 |
| benchmarks/shootout-sieve | x86_64 | Execution | 0.000 |
| benchmarks/shootout-switch | x86_64 | Execution | -0.001 |
| benchmarks/shootout-xblabla20 | x86_64 | Execution | 0.003 |
| benchmarks/shootout-xchacha20 | x86_64 | Execution | 0.003 |
| benchmarks/spidermonkey | x86_64 | Execution | 0.002 |
Averages (x64):
| phase | change_factor |
|---|---|
| Compilation | 0.001 |
| Execution | -0.000 |
| Instantiation | 0.002 |
> Two ideas for bounding stability:
> - We might want to exclude instantiation time altogether from these runs. I'd prefer not to, from first principles, but they seem to have significantly more variance than the other categories. I suspect this is because instantiation is so much faster (usually) than compilation or execution. It may just be that the platform is not noise-free enough to accurately measure instantiation, and we'll need to benchmark this locally if working to improve it. Curious what others think though (@fitzgen, @alexcrichton, @abrown?).
Seems fine to exclude instantiation. We have decent instantiation benchmarks in criterion anyways.
> - Could we run a "no-change test" as a control on every run? Basically, run the baseline twice, and show (i) the delta between the two baselines, and (ii) the delta between the baseline (either one) and the PR's change. We expect to see (in a perfect world) zero change in the control (baseline-to-baseline comparison) and whatever actual change in the diff run. If we see similar swings in both then we can conclude it's more likely noise. Thoughts?
This is more something for the sightglass-analysis crate than the github bot, IMO. The github bot shouldn't be growing anything other than what is needed to run sightglass on the server, authenticate who is allowed to do that, and report the results back. All the details of actually running benchmarks and doing analysis on them should be in sightglass itself.
Yeah, that's a good point actually; I agree. My main concern was that we have trustworthy results, and actually using the confidence-interval computation is the best way of doing that.
(And following on from that a bit more, I guess what I really want is to build up trust in the tool from first principles; that's what I was trying to get at with the null-diff control. So perhaps this is a way we can validate the confidence-interval reporting once we get it integrated. If we submit an empty PR and benchmark it, we should see "no statistical difference" everywhere, or else we have a stats or configuration/setup bug.)
(Note that the probability of a false positive is 1% (due to our default significance level), but this is per test, and we do 3 tests per Wasm input, so we only need ~33 Wasm inputs to expect one false positive per benchmark run. One of the many reasons to choose our Wasm inputs carefully.)
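The arithmetic behind that estimate can be checked directly. A small sketch, assuming the tests are independent and that each of the 3 tests per Wasm input uses the 1% significance level; the names below are only for illustration:

```python
ALPHA = 0.01          # per-test false-positive probability (significance level)
TESTS_PER_INPUT = 3   # tests run for each Wasm input


def expected_false_positives(n_inputs: int) -> float:
    """Expected number of false positives in one benchmark run."""
    return ALPHA * TESTS_PER_INPUT * n_inputs


def prob_at_least_one(n_inputs: int) -> float:
    """Probability of seeing one or more false positives, assuming independence."""
    return 1 - (1 - ALPHA) ** (TESTS_PER_INPUT * n_inputs)


# 27 is the number of Wasm inputs in the tables in this thread; 33 is the
# figure quoted in the note.
for n in (27, 33):
    print(n, round(expected_false_positives(n), 2), round(prob_at_least_one(n), 2))
```

With 33 inputs the expected count is 0.01 × 3 × 33 = 0.99, which is where the "~33" comes from. With the 27 inputs benchmarked in this thread it is already 0.81.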