Chairmarks.jl icon indicating copy to clipboard operation
Chairmarks.jl copied to clipboard

Add a method to `Statistics.std` for Benchmarks

Open asinghvi17 opened this issue 1 year ago • 4 comments

Pretty simple change, just what it says on the tin.

asinghvi17 avatar Apr 25 '24 15:04 asinghvi17

Out of curiosity, what's the usecase?

I'm happy to pass through the corrected keyword argument, but the mean keyword argument, as written, must be a real number which doesn't make sense. I would prefer to just not accept the mean keyword argument, though it's also possible to take a Sample as a mean, but that would require special handling.

Also needs tests.

LilithHafner avatar Apr 25 '24 16:04 LilithHafner

The primary reason I haven't merged this as-is is that it will have strange behavior when displaying/interpreting a sample. Even more strange than computing mean or median on a benchmark.

julia> x = @be 1+1
Benchmark: 6326 samples with 10735 evaluations
min    1.133 ns
median 1.292 ns
mean   1.381 ns
max    6.979 ns

julia> Chairmarks.elementwise(std, x)
0.271 ns (without a warmup)

LilithHafner avatar Sep 13 '24 18:09 LilithHafner

I didn't realize it looks so strange when displayed!

My primary usecase was visualizing error bars of benchmarks, where the standard deviation defines the height of the error bar. BenchmarkTools already has this particular integration, and because you can feed Chairmarks benchmarks into BenchmarkGroups, it would be very easy for the plotting end of my benchmarks to be agnostic between Chairmarks and BenchmarkTools.

julia> using BenchmarkTools

julia> be = @benchmark randn(1000)
using SBenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.692 μs …  4.269 ms  ┊ GC (min … max):  0.00% … 99.87%
 Time  (median):     2.388 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   3.446 μs ± 44.716 μs  ┊ GC (mean ± σ):  23.29% ±  3.28%

  t      ▅▇█▅▁
  ▂▄▃▃▃▆█████▇▅▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂ ▃a
  1.69 μs        Histogram: frequency by time        6.43 μs <

 Memory estimate: 8.00 KiB, allocs estimate: 1.

julia> using Statistics

julia> std(be)
BenchmarkTools.TrialEstimate:
  time:             44.716 μs
  gctime:           44.579 μs (99.69%)
  memory:           8.00 KiB
  allocs:           1

That being said, I should update this and add some tests for sure!

asinghvi17 avatar Sep 13 '24 18:09 asinghvi17

It looks like BenchmarkTools is using std for time and gctime and something line only(unique(x)) for allocs and memory. I imagine Chairmarks should do something similar, at least for fields such as warmup if we were to support std.

LilithHafner avatar Sep 13 '24 19:09 LilithHafner

Feel free to re-open if you want to pick this back up

LilithHafner avatar Nov 23 '24 14:11 LilithHafner