rust-playground
Add support for cargo bench
It would be great to support benchmark tests through the command cargo bench available on nightly. Possible use cases include making it easier for others to help optimize code snippets.
Benchmarking on a cloud service provider is inherently a flaky proposition. There's no guarantee that the underlying CPU will be at a constant speed during a single benchmark run, much less between two benchmarks run sequentially or two completely separate executions.
That's true. While discrepancy between separate benchmarks doesn't matter for the intended usage, variance in proportional performance sucks. Could running the generated WebAssembly of the benchmark on the client be a viable option? I do recognize that it is infinitely more work.
running the generated WebAssembly of the benchmark on the client
That sounds like #374. It's certainly plausible, although I don't know enough of the particulars around what is needed (a high-resolution time source, for example) to know what availability there is.
discrepancy between separate benchmarks doesn't matter for the intended usage
Wouldn't you expect to see things like
#[bench] fn slow_fn(b: &mut Bencher) { /* ... */ }
#[bench] fn fast_fn(b: &mut Bencher) { /* ... */ }
And then people would get surprised when fast_fn is slower than slow_fn? To me, benchmarking is inherently a comparative endeavor: "did it get faster or slower than what it was".
And then people would get surprised when fast_fn is slower than slow_fn? To me, benchmarking is inherently a comparative endeavor: "did it get faster or slower than what it was".
Yes, sorry I wasn't clear enough. I meant that it's important that the percentages stay the same but the actual times may differ.
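To make the "percentages, not absolute times" point concrete, here is a minimal sketch on stable Rust that reports one function's time as a fraction of another's. It does not use cargo bench or the test crate; the bodies of slow_fn and fast_fn, the helper names time_it and relative_cost, and the sample and iteration counts are all placeholders chosen for illustration. On a shared machine the ratio can still drift between runs, which is exactly the concern raised above.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical stand-ins for the two snippets being compared.
fn slow_fn(n: u64) -> u64 {
    (1..=n).fold(0, |acc, x| acc + x)
}

fn fast_fn(n: u64) -> u64 {
    n * (n + 1) / 2
}

/// Times `iters` calls of `f`, repeats that `samples` times,
/// and returns the fastest sample to damp scheduling noise.
fn time_it<F: FnMut() -> u64>(mut f: F, samples: u32, iters: u32) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..samples {
        let start = Instant::now();
        for _ in 0..iters {
            // black_box keeps the optimizer from discarding the call.
            black_box(f());
        }
        best = best.min(start.elapsed());
    }
    best
}

/// Time of `other` expressed as a fraction of `base`.
fn relative_cost(base: Duration, other: Duration) -> f64 {
    other.as_secs_f64() / base.as_secs_f64()
}

fn main() {
    let n = 10_000;
    // Both functions must agree before comparing their speed makes sense.
    assert_eq!(slow_fn(n), fast_fn(n));

    let slow = time_it(|| slow_fn(black_box(n)), 5, 1_000);
    let fast = time_it(|| fast_fn(black_box(n)), 5, 1_000);

    println!("slow_fn: {:?}, fast_fn: {:?}", slow, fast);
    println!(
        "fast_fn takes {:.1}% of slow_fn's time",
        100.0 * relative_cost(slow, fast)
    );
}
```

The absolute durations printed here mean little on a cloud host; only the ratio is intended to be compared, and even that only loosely.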
That sounds like #374.
Definitely, provided that the test crate gets support.
Just as an idea, profiling might be more stable than running the benchmarks, but provide similar feedback. Assuming execution time and things like cache hits are unstable, the total number of assembly instructions executed for the program is a very useful metric.
I am not sure what exactly the backend to this website does, but valgrind --tool=callgrind --dump-instr=yes --collect-jumps=yes --simulate-cache=yes ./target/release/... runs the executable and collects this information. It could be parsed and used to annotate the assembly display. Execution times would be somewhat meaningless (as with benches), but with few exceptions, a program with fewer total executed instructions will be faster than one with more. And you can easily see where operations are concentrated.
perf can do something similar but it's kind of a huge pain to get working right compared to valgrind, imo.
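As a rough sketch of how the instruction-count idea could be consumed, the following pulls the totals out of a callgrind output file and compares two runs. It assumes the file contains an events: line naming the counters (Ir being callgrind's name for instructions executed) and a totals: line with one count per counter; the helper names parse_totals and instructions are hypothetical, and the two inline excerpts are made-up inputs, not real measurements.

```rust
/// Pairs each counter name on the `events:` line with its value on the
/// `totals:` line. Assumes both lines exist in the callgrind output.
fn parse_totals(callgrind_out: &str) -> Vec<(String, u64)> {
    let mut events: Vec<&str> = Vec::new();
    let mut totals: Vec<u64> = Vec::new();
    for line in callgrind_out.lines() {
        if let Some(rest) = line.strip_prefix("events:") {
            events = rest.split_whitespace().collect();
        } else if let Some(rest) = line.strip_prefix("totals:") {
            // Unparseable fields become 0 so names and values stay aligned.
            totals = rest
                .split_whitespace()
                .map(|t| t.parse().unwrap_or(0))
                .collect();
        }
    }
    events.into_iter().map(String::from).zip(totals).collect()
}

/// Total executed instructions (the `Ir` counter), if present.
fn instructions(callgrind_out: &str) -> Option<u64> {
    parse_totals(callgrind_out)
        .into_iter()
        .find(|(name, _)| name.as_str() == "Ir")
        .map(|(_, count)| count)
}

fn main() {
    // Made-up excerpts for illustration only.
    let before = "events: Ir Dr Dw\nfn=main\n1 10 2 1\ntotals: 120000 30000 9000\n";
    let after = "events: Ir Dr Dw\nfn=main\n1 10 2 1\ntotals: 90000 28000 9000\n";

    match (instructions(before), instructions(after)) {
        (Some(b), Some(a)) => {
            println!("Ir before: {}, after: {}", b, a);
            println!("after executes {:.1}% of before's instructions", 100.0 * a as f64 / b as f64);
        }
        _ => println!("no Ir totals found"),
    }
}
```

Unlike wall-clock timings, these counts should be largely repeatable for a deterministic program, which is what makes them attractive on shared hardware.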