
Remove dependency on Process.clock_getres

Open copiousfreetime opened this issue 3 years ago • 6 comments

copiousfreetime avatar Apr 18 '22 02:04 copiousfreetime

Closes #76

copiousfreetime avatar Apr 18 '22 02:04 copiousfreetime

Any opinions on the somewhat hacky mechanism to determine best resolution clock @eregon?

copiousfreetime avatar Apr 18 '22 02:04 copiousfreetime

That sounds a bit better, but still rather fragile. It also sounds like a non-trivial overhead on first use. I think being non-deterministic is a big issue here: for instance, it could lead to using one clock for one process run and another for the next run on the same machine. That's problematic for benchmarking, especially if one is CLOCK_MONOTONIC_RAW and the other CLOCK_MONOTONIC, because they don't have the same definition of a second.

IMHO hardcoding per platform would be the best here, the worst that could happen is to get a slightly less precise clock on an uncommon OS, IMHO not a big deal and easy enough to fix. So I think starting with just macos ? CLOCK_MONOTONIC_RAW : CLOCK_MONOTONIC would be the best.

Or simplify even further and just always use CLOCK_MONOTONIC, because I believe that's the clock everyone should use for benchmarking and accurate time measurements. CLOCK_MONOTONIC_RAW doesn't actually represent seconds and so could be quite off from a real second in some rare but possible cases (IIRC, because the Linux man page doesn't make this very explicit).
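To make the hardcode-per-platform option concrete, here is a minimal sketch, not hitimes' actual code. The helper names `preferred_clock_id` and `now_ns` are made up for illustration, and detecting macOS through `RbConfig::CONFIG["host_os"]` is an assumption about how one might do it.

```ruby
require "rbconfig"

# Hypothetical helper: choose the clock from the platform name alone, so the
# same machine always ends up with the same clock on every run.
def preferred_clock_id(host_os = RbConfig::CONFIG["host_os"])
  if host_os =~ /darwin/ && Process.const_defined?(:CLOCK_MONOTONIC_RAW)
    Process::CLOCK_MONOTONIC_RAW
  else
    # Fallback for every other platform.
    Process::CLOCK_MONOTONIC
  end
end

# Decided once at load time, without probing any clock resolutions.
CLOCK_ID = preferred_clock_id

# Hypothetical helper: current reading of the chosen clock in nanoseconds.
def now_ns
  Process.clock_gettime(CLOCK_ID, :nanosecond)
end
```

The "always use CLOCK_MONOTONIC" variant is the same thing with the `if` branch removed.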

eregon avatar Apr 18 '22 12:04 eregon

Thanks for the feedback, @eregon. I'm not feeling particularly good about the 'pick the clock on require' approach. I think I'm going to do some testing on as many platforms as I can find and see what comes out. I'll probably go with the hardcode-by-platform approach, with a fallback.

copiousfreetime avatar Apr 20 '22 16:04 copiousfreetime

Yes, I think that ultimately that is the best solution, even though it's clearly not the most convenient. Also the "worst case" is a slightly less precise clock, which seems not too bad, compared to other approaches.

I think CLOCK_MONOTONIC is even precise to microseconds on most platforms, and that's probably more than good enough. I would think extremely few use cases actually need nanosecond precision.

eregon avatar Apr 20 '22 16:04 eregon

FWIW for monotime I just went with CLOCK_UPTIME_RAW for macOS and CLOCK_MONOTONIC for everything else.

I only just justified the former on the basis of:

  1. It benchmarked faster on two different architectures, albeit only 10-25%
  2. It's higher resolution, which I don't think matters much for Ruby but, hey
  3. Rust did it too, and that's what I'm mostly emulating anyway :P
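For anyone wanting to repeat the speed comparison in point 1, a rough sketch of a per-call overhead measurement follows. It is not the benchmark used above; `available_clocks` and `seconds_per_call` are hypothetical helper names, and the call count is arbitrary.

```ruby
require "benchmark"

# Clock constants to compare. CLOCK_UPTIME_RAW is only defined on some
# platforms (e.g. macOS), so missing ones are filtered out below.
CANDIDATE_CLOCKS = %i[CLOCK_MONOTONIC CLOCK_UPTIME_RAW].freeze

# Hypothetical helper: map each clock name defined on this platform to its id.
def available_clocks(names = CANDIDATE_CLOCKS)
  names.select { |name| Process.const_defined?(name) }
       .to_h { |name| [name, Process.const_get(name)] }
end

# Hypothetical helper: average wall time of one clock_gettime call, in seconds.
def seconds_per_call(clock_id, calls = 100_000)
  elapsed = Benchmark.realtime do
    calls.times { Process.clock_gettime(clock_id, :nanosecond) }
  end
  elapsed / calls
end

available_clocks.each do |name, id|
  puts format("%-20s %.1f ns/call", name, seconds_per_call(id) * 1e9)
end
```

The numbers include the Ruby method-call overhead, so differences between clocks show up smaller than they would in C.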

As for your attempt to detect resolution by counting zero bits, I tried that myself out of interest -- here's CLOCK_MONOTONIC_FAST, 1,000 runs of majority-of 10,000 samples:

=> {"1ns"=>813, "2ns"=>143, "4ns"=>32, "8ns"=>11, "16ns"=>1}

And sure - it has 1ns resolution, but here's what you get if you bucket 10,000 back-to-back calls:

{562332641125037 => 83,
 562332642124615 => 704,
 562332643124423 => 704,
 562332644124251 => 735,
 562332645124119 => 743,
 562332646124088 => 824,
 562332647124126 => 749,
 562332648124025 => 736,
 562332649123923 => 734,
 562332650124002 => 772,
 562332651123950 => 789,
 562332652123989 => 741,
 562332653124108 => 746,
 562332654123996 => 821,
 562332655123944 => 119}

It only updates about once per millisecond. In contrast 10,000 back-to-back CLOCK_MONOTONIC calls gives me 10,000 unique times, an average of 390ns apart.
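A small sketch of that kind of measurement, for checking how often a clock actually ticks rather than what resolution it reports. `observe_clock` is a hypothetical helper, not something from hitimes or monotime, and it needs Ruby 2.7+ for `tally`.

```ruby
# Hypothetical helper: take back-to-back readings of one clock and bucket them
# by value, to see how many distinct timestamps the clock really produces.
def observe_clock(clock_id, samples = 10_000)
  readings = Array.new(samples) { Process.clock_gettime(clock_id, :nanosecond) }
  buckets  = readings.tally                  # reading => number of times seen
  span_ns  = readings.last - readings.first  # time covered by the whole run
  steps    = buckets.size - 1
  {
    unique: buckets.size,
    # Average distance between distinct readings; nil if the clock never moved.
    mean_step_ns: steps.positive? ? span_ns.fdiv(steps) : nil,
    buckets: buckets
  }
end

result = observe_clock(Process::CLOCK_MONOTONIC)
puts "#{result[:unique]} unique readings, mean step #{result[:mean_step_ns].inspect} ns"
```

A coarse clock shows a handful of buckets with hundreds of hits each, while a fine-grained one gives roughly one bucket per sample.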

Also, this:

      # Sort them by the resolution - we want the smallest one first
      ids_and_resolutions.sort_by! { |pair| pair[1] }
      best_clock_and_resolution = ids_and_resolutions[0]

To quote Ruby docs: "For duplicates returned by the block, the ordering is indeterminate, and may be unstable".

If you had multiple clocks with 1ns resolution you can end up selecting a different one each time. If you were to keep this approach it might be wise to include something like an array index (with the options sorted by general preference) to act as a tie-breaker. Like so.
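A minimal sketch of the tie-breaker idea, as an illustration and not necessarily the code referred to above. It assumes `ids_and_resolutions` is an array of `[clock_id, resolution]` pairs already listed in order of general preference; `pick_best_clock` is a hypothetical name.

```ruby
# Hypothetical helper: pick the finest-resolution pair, using the position in
# the (preference-ordered) list as a tie-breaker so the result is stable.
def pick_best_clock(ids_and_resolutions)
  best = ids_and_resolutions.each_with_index.min_by do |(_id, resolution), index|
    [resolution, index]
  end
  best&.first # drop the index; nil when the list is empty
end

# Illustrative data only: two clocks tie at 1ns, the earlier entry wins.
candidates = [[:monotonic, 1e-9], [:monotonic_raw, 1e-9], [:realtime, 1e-6]]
p pick_best_clock(candidates)
```

Comparing `[resolution, index]` arrays means ties on resolution fall back to the index, which is unique, so the choice no longer depends on sort stability.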

I decided against it.

Freaky avatar Sep 23 '23 04:09 Freaky