
`@btime` errors because `tune!` does not execute setup

Open · SebastianAment opened this issue · 4 comments

It appears `tune!` does not execute `setup` before every run, leading to errors in certain cases. See below for a MWE:

using BenchmarkTools

function f!(x::AbstractVector)
    length(x) == 2 || error("setup not correctly executed")
    push!(x, randn())
end

Then `@benchmarkable` works:

b = @benchmarkable f!(y) setup=(y=randn(2))
run(b) # works

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   66.000 ns …  32.197 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):      79.000 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   113.119 ns ± 446.383 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ██▄▅▄▃▃▄▄▄▅▅▅▃▂▁                                              ▂
  ███████████████████████████▆▇▇▆▇▆▆▅▄▆▆▅▇▄▅▅▅▄▅▄▅▄▄▆▅▅▅▅▄▄▄▄▃▄ █
  66 ns         Histogram: log(frequency) by time        374 ns <

 Memory estimate: 48 bytes, allocs estimate: 1.
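
For what it's worth, my reading of why `run(b)` succeeds here (an assumption about the defaults, not something stated in the thread): an untuned `@benchmarkable` keeps the default of one evaluation per sample, so `run(b)` re-executes `setup` before every evaluation, whereas `@btime` and `@benchmark` first call `tune!`, which probes increasing evaluation counts within a single sample and therefore reuses the mutated `y`:

julia> b = @benchmarkable f!(y) setup=(y=randn(2));

julia> b.params.evals  # default: one evaluation per sample, so setup runs every time
1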

But neither `@btime` nor `@benchmark` does:

@btime f!(y) setup=(y=randn(2)) # errors
ERROR: setup not correctly executed
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] f!(x::Vector{Float64})
    @ Main ~/Documents/SEA/UpdatableCholeskyFactorizations/doodles/benchmarktools_bug.jl:4
  [3] var"##core#593"(y::Vector{Float64})
    @ Main ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:479
  [4] var"##sample#594"(__params::BenchmarkTools.Parameters)
    @ Main ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:487
  [5] _lineartrial(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; maxevals::Int64, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:160
  [6] _lineartrial(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters)
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:152
  [7] #invokelatest#2
    @ ./essentials.jl:708 [inlined]
  [8] invokelatest
    @ ./essentials.jl:706 [inlined]
  [9] #lineartrial#46
    @ ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:34 [inlined]
 [10] lineartrial
    @ ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:34 [inlined]
 [11] tune!(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; progressid::Nothing, nleaves::Float64, ndone::Float64, verbose::Bool, pad::String, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:250
 [12] tune! (repeats 2 times)
    @ ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:249 [inlined]
 [13] top-level scope
    @ ~/.julia/packages/BenchmarkTools/uq9zP/src/execution.jl:566

`@benchmark` yields a similar stack trace, leading me to believe that `tune!` does not call `setup` in subsequent runs.
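
A minimal sketch of the sample/evaluation distinction (my illustration, not from the thread; the global counter `NCALLS` exists only for demonstration): `setup` runs once per *sample*, while the body runs `evals` times within each sample, so a mutating body sees its own leftovers.

using BenchmarkTools

const NCALLS = Ref(0)
g!(x) = (NCALLS[] += 1; push!(x, 0.0))

b = @benchmarkable g!(y) setup=(y = zeros(2)) evals=5 samples=3
run(b)
NCALLS[]  # expect 3 × 5 = 15: setup ran 3 times, g! ran 15 times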

SebastianAment · Nov 16 '21

I think you just want to specify `evals=1`; otherwise the expression is evaluated several times per `setup`.

julia> function f!(x::AbstractVector)
           # length(x) == 2 || error("setup not correctly executed")
           sleep(length(x)/10)  # 100ms per element
           push!(x, randn())
       end;

julia> @btime f!(y) setup=(y=randn(2);) evals=1;
  min 206.132 ms, mean 206.175 ms (6 allocations, 224 bytes)

julia> @btime f!(y) setup=(y=randn(2);) evals=10;
  min 655.086 ms, mean 655.086 ms (5 allocations, 185 bytes)

julia> using Statistics

julia> mean(2:11)  # average length of x across the 10 evaluations
6.5
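
To spell out the arithmetic above (my note): within one ten-evaluation sample, `y` grows from length 2 to 11, so the sleeps run from 0.2 s to 1.1 s and the per-evaluation average is 0.65 s, matching the reported ~655 ms:

julia> sum(0.2:0.1:1.1) / 10  # mean sleep per evaluation when evals=10
0.65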

mcabbott · Nov 24 '21

I've also run into an issue caused by this behavior. I managed to figure out the `evals=1` solution, but it was a pretty opaque bug to track down. It can't be that uncommon to benchmark functions that destroy a required property of their input, so perhaps the README Quick Start should mention how to deal with them?

My example:

using LinearAlgebra

function randposdef(N)
    A = randn(N, N)               # was hard-coded randn(100, 100), ignoring N
    return Symmetric(A * A' + I)  # symmetric positive definite by construction
end

julia> @btime cholesky!(A) setup=(A = randposdef(100));  # evals=1 needed to make it work
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.
Stacktrace:
  [1] checkpositivedefinite
    @ ~/lib/julia-1.7.2/share/julia/stdlib/v1.7/LinearAlgebra/src/factorization.jl:18 [inlined]
  [2] cholesky!(A::Symmetric{Float64, Matrix{Float64}}, ::Val{false}; check::Bool)
    @ LinearAlgebra ~/lib/julia-1.7.2/share/julia/stdlib/v1.7/LinearAlgebra/src/cholesky.jl:266
  [3] cholesky! (repeats 2 times)
    @ ~/lib/julia-1.7.2/share/julia/stdlib/v1.7/LinearAlgebra/src/cholesky.jl:265 [inlined]
  [4] var"##core#423"(A::Symmetric{Float64, Matrix{Float64}})
    @ Main ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:489
  [5] var"##sample#424"(::Tuple{}, __params::BenchmarkTools.Parameters)
    @ Main ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:497
  [6] _lineartrial(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; maxevals::Int64, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:161
  [7] _lineartrial(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters)
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:153
  [8] #invokelatest#2
    @ ./essentials.jl:716 [inlined]
  [9] invokelatest
    @ ./essentials.jl:714 [inlined]
 [10] #lineartrial#46
    @ ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:35 [inlined]
 [11] lineartrial
    @ ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:35 [inlined]
 [12] tune!(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; progressid::Nothing, nleaves::Float64, ndone::Float64, verbose::Bool, pad::String, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:251
 [13] tune! (repeats 2 times)
    @ ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:250 [inlined]
 [14] top-level scope
    @ ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:576

danielwe · Apr 23 '22

Actually, if you provided a `setup`, you presumably wanted to collect statistics for that particular specification. So even when the benchmark doesn't error out, the results can be skewed if the expression mutates its input and `evals != 1` (see the example below). Maybe the README and docs should recommend `evals=1` for any benchmark that mutates?
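
A hypothetical example of that silent skew (`sort!` chosen purely for illustration): with `evals=1` every timing sorts fresh random data, while with `evals=100` the last 99 evaluations of each sample re-sort an already-sorted vector, which is typically much cheaper, so the reported times understate the cost of sorting random input.

julia> using BenchmarkTools

julia> @btime sort!(x) setup=(x = rand(10_000)) evals=1;    # every evaluation sees fresh data

julia> @btime sort!(x) setup=(x = rand(10_000)) evals=100;  # 99 of 100 re-sort sorted data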

danielwe · Apr 23 '22

My mental model of what happens inside `@btime` or `@benchmark` (with `@benchmarkable` one can skip tuning manually):

  1. The benchmark is defined.
  2. Tuning happens: one sample is taken with many evaluations to choose the evaluation count optimally. Since only one sample is taken, the `setup` stage runs only once.
  3. The benchmark is run.

So setting `evals` manually makes the tuning stage unnecessary, hence it is skipped, which avoids the issue. In the case above, since the data is overwritten, `setup` must run before each evaluation; setting `evals=1` means tuning is skipped and each sample performs a single evaluation (sketched below).
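
Rough pseudocode of one sample under that mental model (my sketch, not the actual BenchmarkTools internals):

function one_sample(setup, core, teardown, evals)
    x = setup()                    # setup runs once per sample
    t0 = time_ns()
    for _ in 1:evals               # the core runs `evals` times per sample,
        core(x)                    # so mutations carry over between evaluations
    end
    elapsed = (time_ns() - t0) / evals
    teardown(x)
    return elapsed                 # per-evaluation time in nanoseconds
end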

Could any developer verify this? It might be useful to put in the documentation.

RoyiAvital · May 28 '22