
stop printing "humanized" time units

Open StefanKarpinski opened this issue 5 years ago • 21 comments

On Slack today, some timings were posted:

julia> @btime sum(a) > 0
  26.323 μs (1 allocation: 16 bytes)
false

julia> @btime any(a .> 0)
  116.147 μs (5 allocations: 16.64 KiB)
true

julia> @btime any(!iszero, a)
  178.574 ns (0 allocations: 0 bytes)
true

The entire ensuing conversation was predicated on the fact that any(!iszero, a) was mysteriously slower than both sum(a) > 0 and any(a .> 0). Nobody noticed that the last time measurement is in nanoseconds while the first two are in microseconds. I've seen this happen quite often. It's happened to me. The reason I knew to be alert to this is because I've missed the very subtle difference between "ms", "μs" and "ns" before. Moreover, this doesn't match how Base prints times anymore:

julia> @time sleep(1)
  1.047126 seconds (114.42 k allocations: 6.038 MiB, 0.90% gc time)

julia> @time sleep(0.001)
  0.006039 seconds (82 allocations: 5.375 KiB)

In fact, Base's time printing was changed for the exact same reason! I propose that we choose one of the following ways to print times instead:

  0.000026323 seconds (1 allocation: 16 bytes)
  0.000116147 seconds (5 allocations: 16.64 KiB)
  0.000000179 seconds (0 allocations: 0 bytes)

In other words, with 9 fixed decimal digits and room for up to three leading digits. That's plenty of digits, and it's very clear what scale the values are on. If the values must be scaled somehow, then consider using engineering notation instead, something like this:

  26.323e-6 seconds (1 allocation: 16 bytes)
 116.147e-6 seconds (5 allocations: 16.64 KiB)
 179.000e-9 seconds (0 allocations: 0 bytes)

But note that this is not much shorter than, and is far less clear than, just printing 9 digits after the decimal point.
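
For concreteness, here is a minimal Julia sketch of the fixed nine-decimal format, assuming the measurement is available as a raw nanosecond count (the helper name format_seconds is made up for illustration):

using Printf

# Hypothetical helper: format a raw nanosecond count as a fixed-point
# number of seconds with nine digits after the decimal point.
format_seconds(t_ns) = @sprintf("%.9f seconds", t_ns / 1e9)

format_seconds(26_323)   # "0.000026323 seconds"
format_seconds(116_147)  # "0.000116147 seconds"
format_seconds(179)      # "0.000000179 seconds"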

StefanKarpinski avatar Oct 26 '18 00:10 StefanKarpinski

Better still would be to do as above by default and to allow the reported time unit to be specified as a BenchmarkTools parameter.

# default
0.000_026_323 seconds (1 allocation: 16 bytes)
0.000_116_147 seconds (5 allocations: 16.64 KiB)
0.000_000_179 seconds (0 allocations: 0 bytes)

# BenchmarkTools.DEFAULT_PARAMETERS.time_unit = "millisecond"
0.026_323 milliseconds (1 allocation: 16 bytes)
0.116_147 milliseconds (5 allocations: 16.64 KiB)
0.000_179 milliseconds (0 allocations: 0 bytes)

# BenchmarkTools.DEFAULT_PARAMETERS.time_unit = "microsecond"
26.323 microseconds (1 allocation: 16 bytes)
116.147 microseconds (5 allocations: 16.64 KiB)
0.179 microseconds (0 allocations: 0 bytes)

# BenchmarkTools.DEFAULT_PARAMETERS.time_unit = "ms"
0.026_323 ms (1 allocation: 16 bytes)
0.116_147 ms (5 allocations: 16.64 KiB)
0.000_179 ms (0 allocations: 0 bytes)

# BenchmarkTools.DEFAULT_PARAMETERS.time_unit = "μs"
26.323 μs (1 allocation: 16 bytes)
116.147 μs (5 allocations: 16.64 KiB)
0.179 μs (0 allocations: 0 bytes)

# BenchmarkTools.DEFAULT_PARAMETERS.time_unit = "ns"
26_323.0 ns (1 allocation: 16 bytes)
116_147.0 ns (5 allocations: 16.64 KiB)
179.0 ns (0 allocations: 0 bytes)

or, if the unit is specified, then use the s, ms, etc. formats.
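
A rough sketch of how such a unit parameter could drive the formatting, with made-up names (UNITS, format_time) that are not part of the actual BenchmarkTools API:

using Printf

# Hypothetical mapping from unit name to (nanoseconds per unit, decimal places shown).
const UNITS = Dict("s" => (1e9, 9), "ms" => (1e6, 6), "μs" => (1e3, 3), "ns" => (1.0, 1))

# Format a time measured in nanoseconds in the requested unit.
function format_time(t_ns; unit = "s")
    scale, digits = UNITS[unit]
    number = Printf.format(Printf.Format("%.$(digits)f"), t_ns / scale)
    string(number, " ", unit)
end

format_time(26_323)              # "0.000026323 s"
format_time(26_323; unit = "μs") # "26.323 μs"
format_time(179; unit = "ns")    # "179.0 ns"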

JeffreySarnoff avatar Oct 26 '18 03:10 JeffreySarnoff

I appreciate the current behaviour. I find it hard to read long decimal numbers and scientific notation. If the default is changed I'd like the ability to return to the current behaviour as a global setting.

In the Slack discussion you posted, someone (Alex) pointed out the correct interpretation within 4 posts / 4 minutes of the benchmark results being posted. I don't think much time was lost to the misinterpretation.

iamed2 avatar Oct 26 '18 03:10 iamed2

I'm curious how "ms", "μs" and "ns" are clearer to you than engineering notation or a fixed point decimal. In my mind, I have to think "micro"... right, that means 10^-6 before I know what it means. And confusing "milli" and "micro" is a perennial danger. Would this format be clearer?

  26.323 × 1/10^6 seconds (1 allocation: 16 bytes)
 116.147 × 1/10^6 seconds (5 allocations: 16.64 KiB)
 179.000 × 1/10^9 seconds (0 allocations: 0 bytes)

I find that this format still makes it very easy to glance at the output and think that the last one is the slowest. What about putting separators in the fixed-point version:

  0.000_026_323 seconds (1 allocation: 16 bytes)
  0.000_116_147 seconds (5 allocations: 16.64 KiB)
  0.000_000_179 seconds (0 allocations: 0 bytes)
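
As an illustration only (not how BenchmarkTools is implemented), the underscore grouping could be produced by post-processing the fixed-point string; group_decimals is a hypothetical helper:

using Printf

# Hypothetical helper: insert '_' between every three decimal digits of a
# fixed-point seconds string, e.g. "0.000026323" -> "0.000_026_323".
function group_decimals(s::AbstractString)
    whole, frac = split(s, '.')
    groups = [frac[i:min(i + 2, end)] for i in 1:3:length(frac)]
    string(whole, '.', join(groups, '_'))
end

group_decimals(@sprintf("%.9f", 26_323 / 1e9))   # "0.000_026_323"
group_decimals(@sprintf("%.9f", 116_147 / 1e9))  # "0.000_116_147"
group_decimals(@sprintf("%.9f", 179 / 1e9))      # "0.000_000_179"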

StefanKarpinski avatar Oct 26 '18 04:10 StefanKarpinski

putting separators in the fixed point version helps

JeffreySarnoff avatar Oct 26 '18 04:10 JeffreySarnoff

I've just memorized: ns is the shortest, ms is the longest, and μs is in the middle.

Yeah the separators do help. I think that plus nanoseconds as the default unit would be good.

iamed2 avatar Oct 26 '18 04:10 iamed2

I like the natural units because I can pronounce them to myself and they make sense to me - "ah, microseconds, I know what those are" - whereas as a non-engineer I tend to glaze over the difference between 10^-6 and 10^-9. I have to stop myself and think "10^-6 s... wait, ah, that's a millionth of a second, so a microsecond". So I think the "easiest" format is very person- (and field-) dependent. It is also likely that one's degree of familiarity with SI units plays a role.

If we were to do this, I'd suggest adopting a smaller unit than seconds. Microseconds (with three decimal places to get the nanoseconds) would give most actual benchmarks a number of a size that's easy to talk about.

mkborregaard avatar Oct 26 '18 06:10 mkborregaard

What about putting separators in the fixed point version:

I don't really have much of an opinion either way vis-a-vis actually changing the printing, but I do like this fixed point version quite a bit

jrevels avatar Oct 26 '18 20:10 jrevels

Seconds is too big of a unit to default to.

Just print nanoseconds by default and have an option to print in another unit.

KristofferC avatar Oct 30 '18 13:10 KristofferC

Ok, so the nanosecond version would be something this:

   26_323 nanoseconds (1 allocation: 16 bytes)
  116_147 nanoseconds (5 allocations: 16.64 KiB)
      179 nanoseconds (0 allocations: 0 bytes)

A good alignment rule is a bit hard to come up with for this.

StefanKarpinski avatar Oct 30 '18 14:10 StefanKarpinski

Looks good like that, though. Might still consider microseconds, as that would be easier to talk about ("half a microsecond" or "26 microseconds").

mkborregaard avatar Oct 30 '18 14:10 mkborregaard

Microseconds version:

   26.323 µs (1 allocation: 16 bytes)
  116.147 µs (5 allocations: 16.64 KiB)
    0.179 µs (0 allocations: 0 bytes)

StefanKarpinski avatar Oct 30 '18 14:10 StefanKarpinski

I think nanoseconds are better because an operation takes on the order of nanoseconds.

I think we can print ns instead of the full word, especially when it is the default. I'm not sure alignment is needed when just printing a single benchmark result. If one prints a table, I feel BenchmarkTools.jl should provide a way to run a benchmark over a collection, and then the results can be aligned.

KristofferC avatar Oct 30 '18 14:10 KristofferC

I think that printing a leading space and then aligning at the microsecond, millisecond, or second place would end up looking pretty good. This would end up looking like this then:

  26_323 ns (1 allocation: 16 bytes)
 116_147 ns (5 allocations: 16.64 KiB)
     179 ns (0 allocations: 0 bytes)
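
One possible way to get that grouping and right alignment, sketched with a hypothetical format_ns helper and a fixed column width:

# Hypothetical helper: group a nanosecond count with '_' every three digits
# and right-align it in a fixed-width column.
function format_ns(t_ns::Integer; width = 8)
    rev = reverse(string(t_ns))
    groups = [rev[i:min(i + 2, end)] for i in 1:3:length(rev)]
    lpad(reverse(join(groups, '_')), width)
end

println(format_ns(26_323),  " ns (1 allocation: 16 bytes)")
println(format_ns(116_147), " ns (5 allocations: 16.64 KiB)")
println(format_ns(179),     " ns (0 allocations: 0 bytes)")
# prints:
#   26_323 ns (1 allocation: 16 bytes)
#  116_147 ns (5 allocations: 16.64 KiB)
#      179 ns (0 allocations: 0 bytes)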

StefanKarpinski avatar Oct 30 '18 14:10 StefanKarpinski

Google benchmark prints ns https://github.com/google/benchmark#counter-reporting.

------------------------------------------------------------------------------
Benchmark                        Time           CPU Iterations UserCounters...
------------------------------------------------------------------------------
BM_UserCounter/threads:8      2248 ns      10277 ns      68808 Bar=16 Bat=40 Baz=24 Foo=8
BM_UserCounter/threads:1      9797 ns       9788 ns      71523 Bar=2 Bat=5 Baz=3 Foo=1024m

Cargo benchmark seems to print ns (not sure if it changes based on the absolute numbers)

test tests::bench_add_two ... bench:         1 ns/iter (+/- 0)

Another Rust benchmark framework (https://github.com/japaric/criterion.rs) prints something like

[26.029 us 26.251 us 26.505 us]

Go benchmark prints in ns

BenchmarkFib1   1000000000               2.84 ns/op
BenchmarkFib2   500000000                7.92 ns/op
BenchmarkFib3   100000000               13.0 ns/op
BenchmarkFib10   5000000               447 ns/op
BenchmarkFib20     50000             55668 ns/op

KristofferC avatar Oct 30 '18 14:10 KristofferC

Seconds are too coarse when measuring differences in nanoseconds; nanoseconds are too fine when measuring differences in seconds.

JeffreySarnoff avatar Oct 30 '18 17:10 JeffreySarnoff

We can't really measure anything finer grained than nanoseconds, so reporting finer than that is kind of silly. Even measuring things that are a few nanoseconds is pretty questionable.

StefanKarpinski avatar Oct 30 '18 17:10 StefanKarpinski

Now that I think about it, I agree with @KristofferC that we should pick nanoseconds as our "display unit", though I don't care so much about "order of magnitude" arguments.

Our display unit should be the nanosecond because that's the unit BenchmarkTools uses for all timing measurements/estimates, and IIRC is the unit of all such values returned via the BenchmarkTools API. Having a mismatch between the display and API units would be needlessly confusing, and not worth any other potential benefit I can think of.
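
For reference, the raw values BenchmarkTools stores and returns are indeed nanoseconds, which is easy to check (the benchmarked expression here is just an arbitrary example):

using BenchmarkTools

x = rand(1000)
t = @benchmark sum($x)
minimum(t.times)   # raw minimum time: a Float64 number of nanoseconds
time(minimum(t))   # the same quantity via the TrialEstimate accessor, also in nanoseconds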

I think that printing a leading space and then aligning at the microsecond, millisecond, or second place would end up looking pretty good. This would end up looking like this then:

  26_323 ns (1 allocation: 16 bytes)
 116_147 ns (5 allocations: 16.64 KiB)
     179 ns (0 allocations: 0 bytes)

I would rather print leading zeros up through the hundred seconds place, e.g.:

000_000_026_323 ns (1 allocation: 16 bytes)
000_000_116_147 ns (5 allocations: 16.64 KiB)
000_000_000_179 ns (0 allocations: 0 bytes)

That's a lot of zeros though 😛

jrevels avatar Oct 30 '18 19:10 jrevels

Also of note: on certain platforms (Windows) the timer resolution is about 100 ns (at least the last time I checked, that is what we were using: https://github.com/JuliaCI/BenchmarkTools.jl/pull/94#issuecomment-359639945).

vchuravy avatar Oct 30 '18 19:10 vchuravy

I would rather print leading zeros up through the hundred seconds place, e.g.:

Plz no. I would say 90-95% of the benchmarks I do are under 1000 ns. Having to look at that many zeros all the time would make me very sad.

KristofferC avatar Oct 30 '18 19:10 KristofferC

Plz no. I would say 90-95% of the benchmarks I do are between 0-999 ns. Having to look at that many zeros all the time would make me very sad.

Yeah, I feel ya. Many of the benchmarks I look at are in microsecond range (with the goal to get them to the nanosecond range 😛).

That leading zeros nanosecond format means most timings will fit into that space (idk if people are using BenchmarkTools for 999+ second benchmarks) so everything aligns nicely + order of magnitude comparisons are easy to get at a glance.

The same order-of-magnitude-at-a-glance thing is technically true for @StefanKarpinski's example above, but the alignment confuses my brain for some reason...

jrevels avatar Oct 30 '18 19:10 jrevels

(repeating earlier advocacy) BenchmarkTools.DEFAULT_PARAMETERS.time_unit

JeffreySarnoff avatar Oct 30 '18 21:10 JeffreySarnoff