Use hardware performance counter data for the detailed/self-profile data view
Now that rustc supports using HPC data in -Zself-profile (rust-lang/rust#78781), it would be great to use this support on perf.rlo as well. Many of our smaller benchmarks don't run long enough for the std::time::Instant-based profiling to work reliably, which makes the data hard to interpret when it doesn't match the results reported on the summary page for a particular benchmark. Using the HPC data should hopefully improve the accuracy of the detailed data view.
instructions:u is also the default for the total counts, so it seems natural to compare it instead of time. Often there is a meaningful change in the total instructions:u, but it's lost in time noise in the query view.
Also, assuming it (still) works, I recommend -Z self-profile-counter=instructions-minus-irqs:u (instead of the plain instructions:u), to avoid weird jitter in otherwise-deterministic queries.
Oh dear, it won't work yet: it's broken without adding features = ["nightly"] or updating the version of measureme that rustc uses (see https://github.com/rust-lang/rust/pull/78781#issuecomment-1165931887).
Opened https://github.com/rust-lang/rust/pull/98471 to update measureme in rustc to resolve that.
I have tried to implement this, but I don't know how to actually read the HW counter data from the output of -Zself-profile. I'm using -Zself-profile -Zself-profile-counter=instructions:u, but the output file, when processed with summarize summarize --json, only contains information about time (I don't see any counter values).
I think profiles either contain timestamps or instructions:u values, not both. Did you check that summarize summarize --json outputs values that make sense if the counters are time rather than instructions?
To be honest, I'm not really sure how to recognize that. It's true that when I turn on HW counters, the times seem to be different by orders of magnitude. So the counter values just get stored in the nanos attribute of time?
The PR introducing the feature in rustc did mention this as somewhat backwards compatible for tools, until they adapt to the new counters. Maybe summarize is in that category.
Yeah I saw that, but somehow I expected that this adaptation would have already happened in these 3 years 😅 Maybe not, I'll check how the tools work.
Yeah it seems to just output time as nanoseconds.
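For anyone else reading the JSON: if the counter values really do end up in the duration fields, as it looks like above, something like this rough Python sketch could turn them back into plain integer counts. The field names (query_data, label, self_time, secs, nanos) and the sample values are assumptions for illustration, not something confirmed in this thread, so check them against the actual output of summarize summarize --json first.

```python
import json

NANOS_PER_SEC = 1_000_000_000


def raw_counts(summary_json: str) -> dict[str, int]:
    """Map each query label to its self_time, re-read as one integer.

    Assumes (unverified) that each entry looks like
    {"label": ..., "self_time": {"secs": ..., "nanos": ...}}.
    With -Zself-profile-counter=instructions:u the integer would be an
    instruction count rather than a number of nanoseconds.
    """
    data = json.loads(summary_json)
    counts = {}
    for entry in data.get("query_data", []):
        t = entry["self_time"]
        counts[entry["label"]] = t["secs"] * NANOS_PER_SEC + t["nanos"]
    return counts


# Hypothetical sample input; these are made-up values, not real profile data.
sample = json.dumps({
    "query_data": [
        {"label": "typeck", "self_time": {"secs": 2, "nanos": 500}},
        {"label": "mir_borrowck", "self_time": {"secs": 0, "nanos": 123456}},
    ]
})
counts = raw_counts(sample)
```

The secs part is folded back in because a large count would presumably overflow the nanos field once it is represented as a duration.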
Reopening, because https://github.com/rust-lang/rustc-perf/pull/1647 had to be reverted because of some issue with measureme.