E cores?
At first glance, it looks like the E cores on newer Intel CPUs don't have perf counters. Can/should rr automatically pin itself to only run on cores that are going to work?
dave@intel:~/obj$ perf stat -ddd /bin/date
Mon Jul 25 01:11:14 AM CEST 2022
Performance counter stats for '/bin/date':
0.17 msec task-clock # 0.539 CPUs utilized
0 context-switches # 0.000 /sec
0 cpu-migrations # 0.000 /sec
68 page-faults # 396.197 K/sec
867,913 cpu_core/cycles/ # 5.057 G/sec
<not counted> cpu_atom/cycles/ (0.00%)
1,029,910 cpu_core/instructions/ # 6.001 G/sec
<not counted> cpu_atom/instructions/ (0.00%)
202,177 cpu_core/branches/ # 1.178 G/sec
<not counted> cpu_atom/branches/ (0.00%)
6,623 cpu_core/branch-misses/ # 38.588 M/sec
<not counted> cpu_atom/branch-misses/ (0.00%)
0.000318438 seconds time elapsed
0.000341000 seconds user
0.000000000 seconds sys
dave@intel:~/obj$ ./bin/rr record date
rr: Saving execution to trace directory `/home/dave/.local/share/rr/date-1'.
[FATAL src/PerfCounters.cc:378:check_working_counters() errno: EDOM]
Got 0 branch events, expected at least 500.
The hardware performance counter seems to not be working. Check
that hardware performance counters are working by running
perf stat -e r5111c4 true
and checking that it reports a nonzero number of events.
If performance counters seem to be working with 'perf', file an
rr issue, otherwise check your hardware/OS/VM configuration. Also
check that other software is not using performance counters on
this CPU.
=== Start rr backtrace:
./bin/rr(_ZN2rr13dump_rr_stackEv+0x5a)[0x55f65f89530a]
./bin/rr(_ZN2rr15notifying_abortEv+0x14)[0x55f65f8977e4]
./bin/rr(+0x1f4854)[0x55f65f8b3854]
./bin/rr(_ZN2rr12PerfCounters5resetEl+0xc1c)[0x55f65f7a11cc]
./bin/rr(_ZN2rr4Task16resume_executionENS_13ResumeRequestENS_11WaitRequestENS_12TicksRequestEi+0x7f2)[0x55f65f869592]
./bin/rr(_ZN2rr13RecordSession13task_continueERKNS0_9StepStateE+0x33e)[0x55f65f7a796e]
./bin/rr(_ZN2rr13RecordSession11record_stepEv+0x34c)[0x55f65f7b6bec]
./bin/rr(_ZN2rr13RecordCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0xc5d)[0x55f65f7a5e9d]
./bin/rr(main+0x1c8)[0x55f65f70bcc8]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fbe93dadd90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fbe93dade40]
./bin/rr(_start+0x25)[0x55f65f70be85]
=== End rr backtrace
Aborted (core dumped)
dave@intel:~/obj$ sudo perf stat -e r5111c4 true
WARNING: event 'N/A' not valid (bits 16,20,22 of config '5111c4' not supported by kernel)!
Performance counter stats for 'true':
93,138 cpu_core/r5111c4/
<not counted> cpu_atom/r5111c4/ (0.00%)
0.000496501 seconds time elapsed
0.000514000 seconds user
0.000000000 seconds sys
Workaround for the i9-12900K, where CPUs 0-15 are the P-core hardware threads and 16-23 are the E cores:
taskset -c 0-15 rr record [whatever]
Explicitly binding rr to a single P-core CPU seems to work too:
./bin/rr record --bind-to-cpu=0 date
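The 0-15 range in the taskset command is specific to the i9-12900K. A minimal sketch of a wrapper that looks the P-core list up instead, assuming the kernel exposes the hybrid PMUs under /sys/devices/cpu_core and /sys/devices/cpu_atom (the names match the perf output in this thread, but I have not checked every kernel version). The function names and the SYSFS_DEVICES override are made up for illustration:

```bash
#!/usr/bin/env bash
# Print the CPU list for a hybrid PMU ("cpu_core" or "cpu_atom"), e.g. "0-15".
# SYSFS_DEVICES is a hypothetical override, only here so the lookup can be
# pointed at a fake directory; the default path is an assumption about sysfs.
pmu_cpus() {
  local file="${SYSFS_DEVICES:-/sys/devices}/$1/cpus"
  [ -r "$file" ] && cat "$file"
}

# Run a command restricted to the P cores when a cpu_core PMU is reported;
# on non-hybrid machines (no such file) run the command unchanged.
run_on_p_cores() {
  local cpus
  if cpus="$(pmu_cpus cpu_core)" && [ -n "$cpus" ]; then
    taskset -c "$cpus" "$@"
  else
    "$@"
  fi
}

# Example: run_on_p_cores rr record date
```

On a machine without a cpu_core entry this degrades to running the command as-is, so it should be harmless on non-hybrid CPUs.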
Taking the E cores offline works too, though I'm not sure there's any performance benefit to doing so.
for i in {16..23}; do echo 0 | sudo tee /sys/devices/system/cpu/cpu${i}/online; done
Source: https://unix.stackexchange.com/questions/686459/disable-intel-alder-lake-efficiency-cores-on-linux
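The loop above hard-codes 16..23 and only takes the cores offline. A rough sketch that accepts a sysfs-style CPU list (such as "16-23" or "0-3,8") and can also bring the cores back online. The function names are hypothetical, and the online files are the same ones the loop above writes to:

```bash
#!/usr/bin/env bash
# Expand a sysfs-style CPU list such as "0-2,5" into one CPU number per line.
expand_cpu_list() {
  local list="$1" part
  local IFS=','
  for part in $list; do
    case "$part" in
      *-*) seq "${part%-*}" "${part#*-}" ;;  # range "lo-hi"
      *)   echo "$part" ;;                   # single CPU
    esac
  done
}

# Write 0 (offline) or 1 (online) for every CPU in the given list.
set_cpus_online() {
  local state="$1" cpu
  for cpu in $(expand_cpu_list "$2"); do
    echo "$state" | sudo tee "/sys/devices/system/cpu/cpu${cpu}/online" >/dev/null
  done
}

# Example for the i9-12900K layout from this thread:
#   set_cpus_online 0 16-23   # take the E cores offline
#   set_cpus_online 1 16-23   # bring them back
```

Passing the list explicitly when re-enabling is deliberate: I would not rely on any sysfs CPU list still naming cores that are currently offline.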
Related: #2997 #3032
Can you see if r517ec4 works for the E cores?
dave@intel:~$ taskset -c 16-23 perf stat -e r517ec4 true
WARNING: event 'N/A' not valid (bits 16,20,22 of config '517ec4' not supported by kernel)!
Performance counter stats for 'true':
<not counted> cpu_core/r517ec4/ (0.00%)
94,944 cpu_atom/r517ec4/
0.000518950 seconds time elapsed
0.000551000 seconds user
0.000000000 seconds sys
Alright, seems to be working. We just need to hook up the same core-specific perf-counter selection that we already do on ARM. Somewhat annoyingly, Intel shipped microcode updates that make the CPUID look the same on both core types, though there's a different CPUID leaf that can be used to tell them apart: https://www.intel.com/content/www/us/en/developer/articles/guide/12th-gen-intel-core-processor-gamedev-guide.html
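To make the idea of per-core-type selection concrete from the command line, here is a small sketch that picks the raw event by PMU name and repeats the perf check on that PMU's CPUs. The two event codes are simply the ones tried in this thread (r5111c4 counted on cpu_core, r517ec4 counted on cpu_atom); the function names and the /sys/devices/<pmu>/cpus path are assumptions, and this is not how rr itself does the selection:

```bash
#!/usr/bin/env bash
# Map a hybrid PMU name to the raw branch event that counted on it in this thread.
branch_event_for_pmu() {
  case "$1" in
    cpu_core) echo r5111c4 ;;
    cpu_atom) echo r517ec4 ;;
    *) return 1 ;;  # unknown core type
  esac
}

# Pin to the CPUs of the given PMU and run the same perf check as above.
# Assumes /sys/devices/<pmu>/cpus exists on hybrid kernels.
check_branch_counter() {
  local pmu="$1" event cpus
  event="$(branch_event_for_pmu "$pmu")" || return 1
  cpus="$(cat "/sys/devices/${pmu}/cpus")" || return 1
  taskset -c "$cpus" perf stat -e "$event" true
}

# Example: check_branch_counter cpu_atom
```

Each call should report a nonzero count on the matching PMU line and "<not counted>" on the other one, as in the outputs above.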
I have run into this same issue with an i7-1250U. At least I found this ticket before creating a duplicate. Pinning to core 0 seems to be a sufficient workaround.
With intel-microcode 3.20230512.1 and Linux 6.1.0-10 (Debian) the cpuid utility (version 20230120) reports this as:
$ cpuid | egrep -i 'hybrid|core type'
hybrid part = true
core type = Intel Core
hybrid part = true
core type = Intel Core
hybrid part = true
core type = Intel Core
hybrid part = true
core type = Intel Core
hybrid part = true
core type = Intel Atom
hybrid part = true
core type = Intel Atom
hybrid part = true
core type = Intel Atom
hybrid part = true
core type = Intel Atom
hybrid part = true
core type = Intel Atom
hybrid part = true
core type = Intel Atom
hybrid part = true
core type = Intel Atom
hybrid part = true
core type = Intel Atom
Note "Core" vs. "Atom". I have hyperthreading enabled, so the two P cores appear as four logical CPUs.
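For a quicker overview than the full listing, the same output can be tallied per core type. A small sketch, assuming the "core type = ..." line format shown above (count_core_types is a made-up name):

```bash
#!/usr/bin/env bash
# Read cpuid output on stdin and print how many logical CPUs report each
# core type, one "type: count" line per type, sorted by type name.
count_core_types() {
  awk -F' = ' '/core type/ { n[$2]++ } END { for (t in n) print t ": " n[t] }' | sort
}

# Example: cpuid | count_core_types
```

For the listing above this should give "Intel Atom: 8" and "Intel Core: 4".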
Same here for an i7-13700H (and the taskset -c 0-15 rr record [whatever] workaround works)
$ perf stat -e r5111c4 true
WARNING: event 'N/A' not valid (bits 16,20,22 of config '5111c4' not supported by kernel)!
Performance counter stats for 'true':
145 702 cpu_core/r5111c4/
<not counted> cpu_atom/r5111c4/ (0,00%)
0,001158560 seconds time elapsed
0,001197000 seconds user
0,000000000 seconds sys