wasmtime
wasmtime copied to clipboard
Testing .. do not merge
I think this secret may not be on this repo yet, do you have access to add it or would you like me to add one?
/bench_x64
I think this secret may not be on this repo yet, do you have access to add it or would you like me to add one?
@alexcrichton Forked branches don't have access to the secrets. Looks like a road block as neither clone or pushing will be authorized.
https://github.com/actions/checkout/issues/298 I am wondering if the issue_comment trigger will allow reading the secret but the problem is the PR associated with the issue_comment is not made available via the event payload .. which is why we switched to using the pull_request_review event.
I'm not sure I understand, why would we want forked branches to have access to anything? I think what's in the repo works, but https://github.com/bytecodealliance/wasmtime/pull/4919 needs to merge so the main branch reflects correct secret configuration.
@alexcrichton I stumbled onto that link as the workflow/action was failing with the same error and I thought the conversation there made sense to apply here. However maybe I am wrong and that policy being referred to has nothing to do with the situation here. This is the line that I thought was relevant:
Secrets are not passed to workflows that are triggered by a pull request from a fork. Learn more.
It is written here: https://docs.github.com/en/actions/security-guides/encrypted-secrets#using-encrypted-secrets-in-a-workflow
Our workflow is triggered by pull_request_review and and the issue that I am seeing is that it appears that when the workflow, triggered that the secret is in fact not available. When I tested this I had to test on my own repo (forked from wasmtime), but the PR was not from a branch on a fork .. it was from a branch from the same repository that I was trying to merge as a PR into the default branch, so the secret was available. The patch is failing to run due to errors trying to access the private repo. Errors such as:
fatal: could not read Username for 'https://github.com/': No such device or address
Cloning into 'wasmtime-sightglass-benchmarking'... fatal: could not read Password for 'https://***@github.com': No such device or address Error: Process completed with exit code 128.
Cloning into 'wasmtime-sightglass-benchmarking'... fatal: could not read Username for 'https://github.com/': No such device or address Error: Process completed with exit code 128.
are seen when the token/secret is empty which is what I was seeing, or I am seeing: remote: Invalid username or password. fatal: Authentication failed for 'https://github.com/bytecodealliance/wasmtime-sightglass-benchmarking.git/'
Which error depends on the various things I tried where I've lost track of which error is from which change I tried. So, I can't remember which one, but I believe one of these is consistent with the token/secret string being empty. The bottom line is we currently can't clone or push to the performance repo due to lack of permission. The secret is either not available or not working. I need to put some test in to confirm that it is simply not available, but the line "Secrets are not passed to workflows that are triggered by a pull request from a fork." I think makes sense as the culprit.
I think this "just" needs to merge https://github.com/bytecodealliance/wasmtime/pull/4919, rebase this PR, and try again.
The pull_request_review trigger runs the workflow as-defined on main. As-defined on main the workflow is accessing SIGHTGLASS_BENCHMARKING_TOKEN which is not defined by the Wasmtime repo at this time. I believe this is why it's appearing that you don't have access to secrets.
If my suspicions are correct then after merging #4919 this should work.
I think this "just" needs to merge #4919, rebase this PR, and try again.
The
pull_request_reviewtrigger runs the workflow as-defined onmain. As-defined onmainthe workflow is accessingSIGHTGLASS_BENCHMARKING_TOKENwhich is not defined by the Wasmtime repo at this time. I believe this is why it's appearing that you don't have access to secrets.If my suspicions are correct then after merging #4919 this should work.
I wish it were more convenient to test workflows and actions before merging but merging and just trying is probably the most efficient way to confirm. At least this doesn't run unless manually triggered so it shouldn't cause disruption even if it fails again.
I used an equivalent token from the account that PERSONAL_ACCESS_TOKEN is connected to and I locally ran git clone ... and it worked, so I don't think it's token permissions at this point.
That the run is still failing seems to mean that pull_request_review actions don't have access to secrets. I... guess? That seems to confirm your hypothesis above which would unfortunately render most of this moot... To confirm that though, do you want to try sending a PR to this repo, from this repo, and see if the benchmarking works there?
I used an equivalent token from the account that
PERSONAL_ACCESS_TOKENis connected to and I locally rangit clone ...and it worked, so I don't think it's token permissions at this point.That the run is still failing seems to mean that
pull_request_reviewactions don't have access to secrets. I... guess? That seems to confirm your hypothesis above which would unfortunately render most of this moot... To confirm that though, do you want to try sending a PR to this repo, from this repo, and see if the benchmarking works there?
You mean to just push a branch from origin https://github.com/bytecodealliance/wasmtime.git and try from a PR on that branch? I don't have permission to push new branches .. maybe you could try? That would confirm things. If it is confirmed there are a couple of other ideas we can maybe try.
I've opened https://github.com/bytecodealliance/wasmtime/pull/4928 to test.
While waiting for the CI runner it says:
Job defined at: bytecodealliance/wasmtime/.github/workflows/performance.yml@refs/pull/4928/merge
so if it's using the performance.yml from the merge commit instead of main then it definitely isn't going to have access to secrets. (since PRs could arbitrarily put whatever they want in workflow configuration files).
Other events like issue_comment may guarantee a run from main which would probably have access to secrets. Figuring out the PR context will be a bit nontrivial though.
While waiting for the CI runner it says:
Job defined at: bytecodealliance/wasmtime/.github/workflows/performance.yml@refs/pull/4928/mergeso if it's using the
performance.ymlfrom the merge commit instead ofmainthen it definitely isn't going to have access to secrets. (since PRs could arbitrarily put whatever they want in workflow configuration files).Other events like
issue_commentmay guarantee a run frommainwhich would probably have access to secrets. Figuring out the PR context will be a bit nontrivial though.
Looks like it ran (https://github.com/bytecodealliance/wasmtime/pull/4928) .. so that confirms things. Yes, it makes sense that forks should not have access to those secrets. I was thinking either issue_comment and using github-script https://github.com/actions/github-script (such as in this example https://lightrun.com/answers/actions-checkout-any-way-to-checkout-pr-from-issue_comment-event) to figure it out, could work. Or setting up a workflow dispatch trigger may work. https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/ where the actor has to input their token or password into the gui and they'd need to have permission at the performance repo. I think the first option is the most desired.
/bench_x64
/bench_x64
/bench_x64
@alexcrichton This needed to be rebased in order to use the latest yml. I've done that and submitted another test trigger.
Lol .. and accidently submitted another in my reply.
Shows pct_change on x64 for the patch if merged compared to current head for main.
Pct_change is based on clocktick event cycles where the benchmarks are run with Sightglass. A negative pct_change means clockticks are expected to be reduced for the benchmark, for that phase, and by that factor, if the patch were merged (i.e. negative is good).
| wasm | arch | phase | pct_change |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Compilation | -0.034287 |
| benchmarks/blake3-scalar | x86_64 | Execution | -0.011487 |
| benchmarks/blake3-scalar | x86_64 | Instantiation | 0.076832 |
| benchmarks/blake3-simd | x86_64 | Compilation | -0.022895 |
| benchmarks/blake3-simd | x86_64 | Execution | 0.001343 |
| benchmarks/blake3-simd | x86_64 | Instantiation | -0.134462 |
| benchmarks/bz2 | x86_64 | Compilation | -0.051869 |
| benchmarks/bz2 | x86_64 | Execution | 0.008466 |
| benchmarks/bz2 | x86_64 | Instantiation | -0.053237 |
| benchmarks/intgemm-simd | x86_64 | Compilation | 0.000813 |
| benchmarks/intgemm-simd | x86_64 | Execution | 0.000096 |
| benchmarks/intgemm-simd | x86_64 | Instantiation | 0.028721 |
| benchmarks/meshoptimizer | x86_64 | Compilation | -0.007054 |
| benchmarks/meshoptimizer | x86_64 | Execution | -0.001786 |
| benchmarks/meshoptimizer | x86_64 | Instantiation | 0.095982 |
| benchmarks/noop | x86_64 | Compilation | -0.009973 |
| benchmarks/noop | x86_64 | Execution | 0.403306 |
| benchmarks/noop | x86_64 | Instantiation | 0.007373 |
| benchmarks/pulldown-cmark | x86_64 | Compilation | 0.027008 |
| benchmarks/pulldown-cmark | x86_64 | Execution | -0.016214 |
| benchmarks/pulldown-cmark | x86_64 | Instantiation | -0.013706 |
| benchmarks/shootout-ackermann | x86_64 | Compilation | 0.035487 |
| benchmarks/shootout-ackermann | x86_64 | Execution | -0.053356 |
| benchmarks/shootout-ackermann | x86_64 | Instantiation | 0.001452 |
| benchmarks/shootout-base64 | x86_64 | Compilation | 0.013889 |
| benchmarks/shootout-base64 | x86_64 | Execution | 0.000187 |
| benchmarks/shootout-base64 | x86_64 | Instantiation | -0.006239 |
| benchmarks/shootout-ctype | x86_64 | Compilation | -0.034853 |
| benchmarks/shootout-ctype | x86_64 | Execution | -0.001521 |
| benchmarks/shootout-ctype | x86_64 | Instantiation | -0.014821 |
| benchmarks/shootout-ed25519 | x86_64 | Compilation | -0.007662 |
| benchmarks/shootout-ed25519 | x86_64 | Execution | -0.001859 |
| benchmarks/shootout-ed25519 | x86_64 | Instantiation | -0.125123 |
| benchmarks/shootout-fib2 | x86_64 | Compilation | 0.045457 |
| benchmarks/shootout-fib2 | x86_64 | Execution | 0.000295 |
| benchmarks/shootout-fib2 | x86_64 | Instantiation | -0.023764 |
| benchmarks/shootout-gimli | x86_64 | Compilation | 0.002144 |
| benchmarks/shootout-gimli | x86_64 | Execution | -0.018011 |
| benchmarks/shootout-gimli | x86_64 | Instantiation | -0.042722 |
| benchmarks/shootout-heapsort | x86_64 | Compilation | 0.019075 |
| benchmarks/shootout-heapsort | x86_64 | Execution | 0.000060 |
| benchmarks/shootout-heapsort | x86_64 | Instantiation | -0.061197 |
| benchmarks/shootout-keccak | x86_64 | Compilation | 0.003209 |
| benchmarks/shootout-keccak | x86_64 | Execution | 0.001870 |
| benchmarks/shootout-keccak | x86_64 | Instantiation | 0.020665 |
| benchmarks/shootout-matrix | x86_64 | Compilation | 0.012603 |
| benchmarks/shootout-matrix | x86_64 | Execution | 0.005769 |
| benchmarks/shootout-matrix | x86_64 | Instantiation | 0.004154 |
| benchmarks/shootout-memmove | x86_64 | Compilation | -0.020977 |
| benchmarks/shootout-memmove | x86_64 | Execution | 0.000839 |
| benchmarks/shootout-memmove | x86_64 | Instantiation | 0.007037 |
| benchmarks/shootout-minicsv | x86_64 | Compilation | 0.003152 |
| benchmarks/shootout-minicsv | x86_64 | Execution | 0.001506 |
| benchmarks/shootout-minicsv | x86_64 | Instantiation | -0.053419 |
| benchmarks/shootout-nestedloop | x86_64 | Compilation | -0.009232 |
| benchmarks/shootout-nestedloop | x86_64 | Execution | -0.023647 |
| benchmarks/shootout-nestedloop | x86_64 | Instantiation | -0.015899 |
| benchmarks/shootout-random | x86_64 | Compilation | -0.009856 |
| benchmarks/shootout-random | x86_64 | Execution | -0.000872 |
| benchmarks/shootout-random | x86_64 | Instantiation | -0.078992 |
| benchmarks/shootout-ratelimit | x86_64 | Compilation | -0.023974 |
| benchmarks/shootout-ratelimit | x86_64 | Execution | 0.012502 |
| benchmarks/shootout-ratelimit | x86_64 | Instantiation | -0.022845 |
| benchmarks/shootout-seqhash | x86_64 | Compilation | 0.004471 |
| benchmarks/shootout-seqhash | x86_64 | Execution | -0.000565 |
| benchmarks/shootout-seqhash | x86_64 | Instantiation | 0.000801 |
| benchmarks/shootout-sieve | x86_64 | Compilation | -0.065953 |
| benchmarks/shootout-sieve | x86_64 | Execution | 0.000828 |
| benchmarks/shootout-sieve | x86_64 | Instantiation | 0.030104 |
| benchmarks/shootout-switch | x86_64 | Compilation | -0.011299 |
| benchmarks/shootout-switch | x86_64 | Execution | 0.000224 |
| benchmarks/shootout-switch | x86_64 | Instantiation | 0.002284 |
| benchmarks/shootout-xblabla20 | x86_64 | Compilation | 0.034352 |
| benchmarks/shootout-xblabla20 | x86_64 | Execution | -0.015639 |
| benchmarks/shootout-xblabla20 | x86_64 | Instantiation | -0.158237 |
| benchmarks/shootout-xchacha20 | x86_64 | Compilation | 0.003882 |
| benchmarks/shootout-xchacha20 | x86_64 | Execution | -0.010573 |
| benchmarks/shootout-xchacha20 | x86_64 | Instantiation | 0.051459 |
| benchmarks/spidermonkey | x86_64 | Compilation | -0.005556 |
| benchmarks/spidermonkey | x86_64 | Execution | -0.001852 |
| benchmarks/spidermonkey | x86_64 | Instantiation | 0.004107 |
Oh .. guess it did work
Shows pct_change on x64 for the patch if merged compared to current head for main.
Pct_change is based on clocktick event cycles where the benchmarks are run with Sightglass. A negative pct_change means clockticks are expected to be reduced for the benchmark, for that phase, and by that factor, if the patch were merged (i.e. negative is good).
| wasm | arch | phase | pct_change |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Compilation | -0.030310 |
| benchmarks/blake3-scalar | x86_64 | Execution | 0.023858 |
| benchmarks/blake3-scalar | x86_64 | Instantiation | 0.032645 |
| benchmarks/blake3-simd | x86_64 | Compilation | 0.251867 |
| benchmarks/blake3-simd | x86_64 | Execution | 0.111039 |
| benchmarks/blake3-simd | x86_64 | Instantiation | 0.096104 |
| benchmarks/bz2 | x86_64 | Compilation | -0.044554 |
| benchmarks/bz2 | x86_64 | Execution | -0.001013 |
| benchmarks/bz2 | x86_64 | Instantiation | 0.058163 |
| benchmarks/intgemm-simd | x86_64 | Compilation | -0.007175 |
| benchmarks/intgemm-simd | x86_64 | Execution | 0.001357 |
| benchmarks/intgemm-simd | x86_64 | Instantiation | -0.005574 |
| benchmarks/meshoptimizer | x86_64 | Compilation | 0.004291 |
| benchmarks/meshoptimizer | x86_64 | Execution | -0.000236 |
| benchmarks/meshoptimizer | x86_64 | Instantiation | -0.076082 |
| benchmarks/noop | x86_64 | Compilation | 0.051348 |
| benchmarks/noop | x86_64 | Execution | 0.074605 |
| benchmarks/noop | x86_64 | Instantiation | 0.104645 |
| benchmarks/pulldown-cmark | x86_64 | Compilation | 0.001547 |
| benchmarks/pulldown-cmark | x86_64 | Execution | -0.002676 |
| benchmarks/pulldown-cmark | x86_64 | Instantiation | 0.016475 |
| benchmarks/shootout-ackermann | x86_64 | Compilation | 0.077409 |
| benchmarks/shootout-ackermann | x86_64 | Execution | 0.164519 |
| benchmarks/shootout-ackermann | x86_64 | Instantiation | 0.038244 |
| benchmarks/shootout-base64 | x86_64 | Compilation | 0.018951 |
| benchmarks/shootout-base64 | x86_64 | Execution | -0.003764 |
| benchmarks/shootout-base64 | x86_64 | Instantiation | 0.059158 |
| benchmarks/shootout-ctype | x86_64 | Compilation | -0.044444 |
| benchmarks/shootout-ctype | x86_64 | Execution | 0.001686 |
| benchmarks/shootout-ctype | x86_64 | Instantiation | 0.019625 |
| benchmarks/shootout-ed25519 | x86_64 | Compilation | -0.007561 |
| benchmarks/shootout-ed25519 | x86_64 | Execution | -0.000823 |
| benchmarks/shootout-ed25519 | x86_64 | Instantiation | -0.032482 |
| benchmarks/shootout-fib2 | x86_64 | Compilation | 0.004821 |
| benchmarks/shootout-fib2 | x86_64 | Execution | 0.001347 |
| benchmarks/shootout-fib2 | x86_64 | Instantiation | 0.001049 |
| benchmarks/shootout-gimli | x86_64 | Compilation | 0.063835 |
| benchmarks/shootout-gimli | x86_64 | Execution | 0.004247 |
| benchmarks/shootout-gimli | x86_64 | Instantiation | 0.082987 |
| benchmarks/shootout-heapsort | x86_64 | Compilation | -0.032375 |
| benchmarks/shootout-heapsort | x86_64 | Execution | -0.000662 |
| benchmarks/shootout-heapsort | x86_64 | Instantiation | -0.028302 |
| benchmarks/shootout-keccak | x86_64 | Compilation | 0.012330 |
| benchmarks/shootout-keccak | x86_64 | Execution | 0.003932 |
| benchmarks/shootout-keccak | x86_64 | Instantiation | -0.046790 |
| benchmarks/shootout-matrix | x86_64 | Compilation | -0.029146 |
| benchmarks/shootout-matrix | x86_64 | Execution | -0.001596 |
| benchmarks/shootout-matrix | x86_64 | Instantiation | 0.022471 |
| benchmarks/shootout-memmove | x86_64 | Compilation | -0.004290 |
| benchmarks/shootout-memmove | x86_64 | Execution | -0.000002 |
| benchmarks/shootout-memmove | x86_64 | Instantiation | 0.006008 |
| benchmarks/shootout-minicsv | x86_64 | Compilation | 0.096753 |
| benchmarks/shootout-minicsv | x86_64 | Execution | -0.000926 |
| benchmarks/shootout-minicsv | x86_64 | Instantiation | 0.026605 |
| benchmarks/shootout-nestedloop | x86_64 | Compilation | 0.008242 |
| benchmarks/shootout-nestedloop | x86_64 | Execution | 0.017152 |
| benchmarks/shootout-nestedloop | x86_64 | Instantiation | 0.008496 |
| benchmarks/shootout-random | x86_64 | Compilation | -0.017752 |
| benchmarks/shootout-random | x86_64 | Execution | -0.000730 |
| benchmarks/shootout-random | x86_64 | Instantiation | -0.012865 |
| benchmarks/shootout-ratelimit | x86_64 | Compilation | -0.002059 |
| benchmarks/shootout-ratelimit | x86_64 | Execution | 0.002869 |
| benchmarks/shootout-ratelimit | x86_64 | Instantiation | 0.019441 |
| benchmarks/shootout-seqhash | x86_64 | Compilation | 0.030591 |
| benchmarks/shootout-seqhash | x86_64 | Execution | -0.002861 |
| benchmarks/shootout-seqhash | x86_64 | Instantiation | 0.002807 |
| benchmarks/shootout-sieve | x86_64 | Compilation | 0.001555 |
| benchmarks/shootout-sieve | x86_64 | Execution | -0.001753 |
| benchmarks/shootout-sieve | x86_64 | Instantiation | 0.049312 |
| benchmarks/shootout-switch | x86_64 | Compilation | -0.030542 |
| benchmarks/shootout-switch | x86_64 | Execution | -0.003779 |
| benchmarks/shootout-switch | x86_64 | Instantiation | 0.046580 |
| benchmarks/shootout-xblabla20 | x86_64 | Compilation | 0.026218 |
| benchmarks/shootout-xblabla20 | x86_64 | Execution | 0.001472 |
| benchmarks/shootout-xblabla20 | x86_64 | Instantiation | -0.101582 |
| benchmarks/shootout-xchacha20 | x86_64 | Compilation | 0.017953 |
| benchmarks/shootout-xchacha20 | x86_64 | Execution | 0.014455 |
| benchmarks/shootout-xchacha20 | x86_64 | Instantiation | 0.045538 |
| benchmarks/spidermonkey | x86_64 | Compilation | -0.025875 |
| benchmarks/spidermonkey | x86_64 | Execution | -0.000559 |
| benchmarks/spidermonkey | x86_64 | Instantiation | -0.007227 |
Shows pct_change on x64 for the patch if merged compared to current head for main.
Pct_change is based on clocktick event cycles where the benchmarks are run with Sightglass. A negative pct_change means clockticks are expected to be reduced for the benchmark, for that phase, and by that factor, if the patch were merged (i.e. negative is good).
| wasm | arch | phase | pct_change |
|---|---|---|---|
| benchmarks/blake3-scalar | x86_64 | Compilation | -0.050030 |
| benchmarks/blake3-scalar | x86_64 | Execution | 0.010691 |
| benchmarks/blake3-scalar | x86_64 | Instantiation | -0.113561 |
| benchmarks/blake3-simd | x86_64 | Compilation | -0.018848 |
| benchmarks/blake3-simd | x86_64 | Execution | 0.024774 |
| benchmarks/blake3-simd | x86_64 | Instantiation | -0.038650 |
| benchmarks/bz2 | x86_64 | Compilation | -0.063704 |
| benchmarks/bz2 | x86_64 | Execution | -0.021275 |
| benchmarks/bz2 | x86_64 | Instantiation | 0.001148 |
| benchmarks/intgemm-simd | x86_64 | Compilation | -0.019306 |
| benchmarks/intgemm-simd | x86_64 | Execution | -0.006286 |
| benchmarks/intgemm-simd | x86_64 | Instantiation | -0.066674 |
| benchmarks/meshoptimizer | x86_64 | Compilation | -0.026582 |
| benchmarks/meshoptimizer | x86_64 | Execution | 0.000714 |
| benchmarks/meshoptimizer | x86_64 | Instantiation | 0.116023 |
| benchmarks/noop | x86_64 | Compilation | -0.042627 |
| benchmarks/noop | x86_64 | Execution | 0.083770 |
| benchmarks/noop | x86_64 | Instantiation | 0.012709 |
| benchmarks/pulldown-cmark | x86_64 | Compilation | -0.066239 |
| benchmarks/pulldown-cmark | x86_64 | Execution | 0.005557 |
| benchmarks/pulldown-cmark | x86_64 | Instantiation | -0.011457 |
| benchmarks/shootout-ackermann | x86_64 | Compilation | -0.041132 |
| benchmarks/shootout-ackermann | x86_64 | Execution | 0.003190 |
| benchmarks/shootout-ackermann | x86_64 | Instantiation | -0.049969 |
| benchmarks/shootout-base64 | x86_64 | Compilation | 0.013453 |
| benchmarks/shootout-base64 | x86_64 | Execution | 0.003252 |
| benchmarks/shootout-base64 | x86_64 | Instantiation | 0.022696 |
| benchmarks/shootout-ctype | x86_64 | Compilation | -0.037390 |
| benchmarks/shootout-ctype | x86_64 | Execution | 0.000918 |
| benchmarks/shootout-ctype | x86_64 | Instantiation | 0.027234 |
| benchmarks/shootout-ed25519 | x86_64 | Compilation | -0.012902 |
| benchmarks/shootout-ed25519 | x86_64 | Execution | -0.000566 |
| benchmarks/shootout-ed25519 | x86_64 | Instantiation | 0.003339 |
| benchmarks/shootout-fib2 | x86_64 | Compilation | 0.021018 |
| benchmarks/shootout-fib2 | x86_64 | Execution | 0.000560 |
| benchmarks/shootout-fib2 | x86_64 | Instantiation | -0.012342 |
| benchmarks/shootout-gimli | x86_64 | Compilation | 0.049840 |
| benchmarks/shootout-gimli | x86_64 | Execution | 0.021725 |
| benchmarks/shootout-gimli | x86_64 | Instantiation | 0.049550 |
| benchmarks/shootout-heapsort | x86_64 | Compilation | -0.008319 |
| benchmarks/shootout-heapsort | x86_64 | Execution | 0.001982 |
| benchmarks/shootout-heapsort | x86_64 | Instantiation | 0.043471 |
| benchmarks/shootout-keccak | x86_64 | Compilation | 0.011576 |
| benchmarks/shootout-keccak | x86_64 | Execution | 0.008654 |
| benchmarks/shootout-keccak | x86_64 | Instantiation | 0.003275 |
| benchmarks/shootout-matrix | x86_64 | Compilation | -0.062111 |
| benchmarks/shootout-matrix | x86_64 | Execution | 0.011016 |
| benchmarks/shootout-matrix | x86_64 | Instantiation | 0.012293 |
| benchmarks/shootout-memmove | x86_64 | Compilation | -0.045281 |
| benchmarks/shootout-memmove | x86_64 | Execution | 0.001623 |
| benchmarks/shootout-memmove | x86_64 | Instantiation | 0.027841 |
| benchmarks/shootout-minicsv | x86_64 | Compilation | -0.029276 |
| benchmarks/shootout-minicsv | x86_64 | Execution | 0.000702 |
| benchmarks/shootout-minicsv | x86_64 | Instantiation | 0.020804 |
| benchmarks/shootout-nestedloop | x86_64 | Compilation | 0.016071 |
| benchmarks/shootout-nestedloop | x86_64 | Execution | 0.063317 |
| benchmarks/shootout-nestedloop | x86_64 | Instantiation | 0.055417 |
| benchmarks/shootout-random | x86_64 | Compilation | 0.027010 |
| benchmarks/shootout-random | x86_64 | Execution | 0.002454 |
| benchmarks/shootout-random | x86_64 | Instantiation | 0.036361 |
| benchmarks/shootout-ratelimit | x86_64 | Compilation | 0.039516 |
| benchmarks/shootout-ratelimit | x86_64 | Execution | 0.006016 |
| benchmarks/shootout-ratelimit | x86_64 | Instantiation | 0.001117 |
| benchmarks/shootout-seqhash | x86_64 | Compilation | 0.009882 |
| benchmarks/shootout-seqhash | x86_64 | Execution | 0.000350 |
| benchmarks/shootout-seqhash | x86_64 | Instantiation | -0.009995 |
| benchmarks/shootout-sieve | x86_64 | Compilation | -0.005880 |
| benchmarks/shootout-sieve | x86_64 | Execution | 0.001220 |
| benchmarks/shootout-sieve | x86_64 | Instantiation | 0.033825 |
| benchmarks/shootout-switch | x86_64 | Compilation | -0.067979 |
| benchmarks/shootout-switch | x86_64 | Execution | 0.002203 |
| benchmarks/shootout-switch | x86_64 | Instantiation | -0.064441 |
| benchmarks/shootout-xblabla20 | x86_64 | Compilation | -0.049738 |
| benchmarks/shootout-xblabla20 | x86_64 | Execution | 0.000380 |
| benchmarks/shootout-xblabla20 | x86_64 | Instantiation | -0.002152 |
| benchmarks/shootout-xchacha20 | x86_64 | Compilation | -0.005208 |
| benchmarks/shootout-xchacha20 | x86_64 | Execution | 0.005069 |
| benchmarks/shootout-xchacha20 | x86_64 | Instantiation | -0.004924 |
| benchmarks/spidermonkey | x86_64 | Compilation | -0.021401 |
| benchmarks/spidermonkey | x86_64 | Execution | 0.002470 |
| benchmarks/spidermonkey | x86_64 | Instantiation | -0.004928 |
@jlb6740 a question for you: does pct_change indicate a percent, as the column name implies, so e.g. 0.002470 means 0.00247% (i.e., 24.7 * 10^-6)? Or is it a raw ratio, so 0.247%?
A few other comments on report formatting:
- It would be nice to sort overall by phase, so we have all the instantiation numbers together, and compilation together, and execution together. Generally if I'm testing a PR that e.g. updates the way instantiation is done, I want to look just at those numbers; I don't want to have to visually skip to every third row.
- Could we have a mean (geomean probably, rather than arithmetic) for each category?
These would make the report a little easier to grok at a glance than the sea of numbers we currently have!
@jlb6740 a question for you: does
pct_changeindicate a percent, as the column name implies, so e.g.0.002470means 0.00247% (i.e., 24.7 * 10^-6)? Or is it a raw ratio, so 0.247%?
Pct_change is being calculated by pandas: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pct_change.html See the examples there. Pct can be confusing because you don't know if you are supposed to factor in the 100x or not but here pandas is printing out what is really a factor. Maybe we should call it factor change to avoid confusion even though they call it pct_change.
- Note, I cut down on the number of runs so there might be more variability here that we can reduce at the expense of time to execute. Currently we have:
--processes 1 \ --iterations-per-process 2 \
while the defaults are 10 process and 10 iterations per process.
A few other comments on report formatting:
- It would be nice to sort overall by phase, so we have all the instantiation numbers together, and compilation together, and execution together. Generally if I'm testing a PR that e.g. updates the way instantiation is done, I want to look just at those numbers; I don't want to have to visually skip to every third row.
Yeah, I agree. .. will be easy to do.
- Could we have a mean (geomean probably, rather than arithmetic) for each category?
These would make the report a little easier to grok at a glance than the sea of numbers we currently have!
Yes .. that is a good idea. I have some concepts in an internal benchmark that we have called wasmbench that is based on sightglass. It does things like calculate geomean as you describe and a ratio to native performance to produce a final score. Eventually I'd like to donate these to sightglass potentially a separate benchmark to sightglass but I am digressing .. yes I agree we should apply geomeans here!
Maybe we should call it factor change to avoid confusion even though they call it pct_change.
Yes, I'd favor that; or multiply all the numbers by 100 so it truly is a percentage. I would argue that calling this value as-is pct_change is a bug: percent literally means "per hundred" and so if it's missing that factor of a hundred, it is incorrect.
Given that, I'm now seeing these numbers in a new light. I see a bunch of variation on the order of 3-5%... for a PR with no changes. That's a very concerning level of variability: if our PRs sometimes make optimizations with effects on the order of 1% or less (but stable, measurable effects), this variability will simply swamp any such change and so will make the measurements unusable.
In order to trust these numbers to evaluate changes, I'd want to see a much lower noise floor -- something like +/- 0.1% at most. Or else we should filter any results out with a delta less than our known variability (i.e., not statistically significant). Could we try turning up the iteration count and see how that affects things?
Yes, agreed we should do both .. crank up processes and iterations and print geo means.
|Test|Test| |1|1|
| Test | Test |
|---|---|
| 1 | 1 |
| Test | Test |
|---|---|
| 1 | 1 |
| Test2 | Test2 |
|---|---|
| 1 | 1 |
⬆⬇