
Use hardware performance counters instead of cachegrind

Open asomers opened this issue 4 years ago • 3 comments

Iai is very exciting! I love the idea of benchmarks that are fast and deterministic. But relying on Cachegrind has some drawbacks:

  • Limited OS support
  • Requires the user to install valgrind
  • Executing binaries under Valgrind is slow
  • Valgrind alters the program's normal execution. This reduces the accuracy of the measurements, and leads to bugs like #8

Modern CPUs contain hardware performance counters that can be used for nearly zero-cost profiling. Using those instead of Cachegrind would have several benefits:

  • No dependency on Valgrind
  • Much faster to execute
  • The counters can be paused and restarted mid-process. This would allow Iai to skip setup and teardown sections as requested in #7.
  • Wider OS support
  • More accurate and detailed reports.

On FreeBSD, pmc(3) provides access to the counters, and there is already a nascent Rust crate for them: pmc-rs. On Linux, I think the perfcnt and perf crates provide the same functionality.
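To make the pause/restart point concrete, here is a minimal sketch of how a harness could count only the measured section. Everything in it is hypothetical: the `Counter` trait, `FakeCounter`, and `measure` are names made up for illustration and are not the API of Iai, pmc-rs, perfcnt, or perf. A real backend would implement the trait on top of one of those crates; the fake one simply counts events the code reports itself.

```rust
// Hypothetical sketch: these names are illustrative only and are not
// taken from Iai, pmc-rs, perfcnt, or perf.

/// The minimum a counter backend would need to offer.
pub trait Counter {
    /// Start (or resume) counting.
    fn enable(&mut self);
    /// Pause counting; the accumulated value is kept.
    fn disable(&mut self);
    /// Events accumulated while the counter was enabled.
    fn read(&self) -> u64;
}

/// Stand-in backend: "events" are reported manually through `tick`,
/// and only recorded while the counter is enabled.
#[derive(Default)]
pub struct FakeCounter {
    enabled: bool,
    count: u64,
}

impl FakeCounter {
    pub fn tick(&mut self, events: u64) {
        if self.enabled {
            self.count += events;
        }
    }
}

impl Counter for FakeCounter {
    fn enable(&mut self) {
        self.enabled = true;
    }
    fn disable(&mut self) {
        self.enabled = false;
    }
    fn read(&self) -> u64 {
        self.count
    }
}

/// Runs setup, the measured body, and teardown, and returns the number
/// of events counted during the body alone. The counter is passed to
/// each closure only so the fake backend can be ticked by hand.
pub fn measure<C, S>(
    counter: &mut C,
    setup: impl FnOnce(&mut C) -> S,
    body: impl FnOnce(&mut C, &mut S),
    teardown: impl FnOnce(&mut C, S),
) -> u64
where
    C: Counter,
{
    let mut state = setup(&mut *counter); // counter still paused
    let before = counter.read();
    counter.enable();
    body(&mut *counter, &mut state); // the only counted section
    counter.disable();
    let after = counter.read();
    teardown(&mut *counter, state); // counter paused again
    after - before
}
```

With the fake backend, ticks issued inside `setup` or `teardown` are ignored and only those issued inside `body` show up in the result, which is the behaviour #7 asks for.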

Feb 03 '21 05:02 asomers

I think that https://github.com/jbreitbart/criterion-perf-events is an attempt to do that.

Feb 14 '21 03:02 shepmaster

Cool! Thanks for the tip.

Feb 14 '21 03:02 asomers

Yes, if that's what you want I would recommend using the criterion-perf-events plugin.

Cachegrind is used specifically for its emulation of the memory hierarchy. Because we can control the parameters of that emulation, Iai can take measurements under cachegrind that should be far more repeatable and consistent between machines than are possible even with performance counters. Hardware performance counters will naturally be different between different hardware.
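As a rough illustration of what "controlling the parameters of that emulation" can look like, the sketch below builds a Valgrind command line that pins the simulated cache geometry. The `--I1`, `--D1`, `--LL`, `--cache-sim`, and `--cachegrind-out-file` options are standard Cachegrind flags; the struct and function names, and the example cache sizes used with them, are assumptions made for this example and should not be read as Iai's actual configuration.

```rust
use std::process::Command;

/// One level of Cachegrind's simulated cache:
/// total size in bytes, associativity, and line size in bytes.
#[derive(Clone, Copy)]
pub struct CacheLevel {
    pub size: u32,
    pub assoc: u32,
    pub line: u32,
}

impl CacheLevel {
    /// Formats the level as `--<name>=<size>,<assoc>,<line>`.
    fn flag(&self, name: &str) -> String {
        format!("--{}={},{},{}", name, self.size, self.assoc, self.line)
    }
}

/// Hypothetical description of the whole simulated hierarchy.
pub struct CacheModel {
    pub i1: CacheLevel,
    pub d1: CacheLevel,
    pub ll: CacheLevel,
}

/// Valgrind arguments that fix the simulated caches, so that results do
/// not depend on the caches of the machine running the benchmark.
pub fn cachegrind_args(model: &CacheModel, out_file: &str) -> Vec<String> {
    vec![
        "--tool=cachegrind".to_string(),
        // Request cache simulation explicitly rather than relying on
        // the installed Valgrind's default.
        "--cache-sim=yes".to_string(),
        model.i1.flag("I1"),
        model.d1.flag("D1"),
        model.ll.flag("LL"),
        format!("--cachegrind-out-file={}", out_file),
    ]
}

/// Builds (but does not run) the full command for one benchmark binary.
pub fn cachegrind_command(model: &CacheModel, out_file: &str, binary: &str) -> Command {
    let mut cmd = Command::new("valgrind");
    cmd.args(cachegrind_args(model, out_file)).arg(binary);
    cmd
}
```

For instance, a model with 32 KiB, 8-way, 64-byte-line L1 caches and an 8 MiB, 16-way last-level cache (illustrative values) yields `--I1=32768,8,64 --D1=32768,8,64 --LL=8388608,16,64`, and the same flags produce the same simulated hierarchy on any host where Valgrind runs.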

In addition, under virtualization it's common for access to the performance counters of the underlying hardware to be disabled, so it's not as if that approach is without drawback either. I know this is the case, because the VM I do my work in at my day job has its performance counters disabled for mysterious IT-department reasons.

Feb 25 '21 22:02 bheisler