E cores?

Open Manouchehri opened this issue 3 years ago • 9 comments

At first glance, it looks like the E cores on newer Intel CPUs don't have perf counters. Can/should rr automatically pin itself to only run on cores that are going to work?

dave@intel:~/obj$ perf stat -ddd /bin/date
Mon Jul 25 01:11:14 AM CEST 2022

 Performance counter stats for '/bin/date':

              0.17 msec task-clock                #    0.539 CPUs utilized
                 0      context-switches          #    0.000 /sec
                 0      cpu-migrations            #    0.000 /sec
                68      page-faults               #  396.197 K/sec
           867,913      cpu_core/cycles/          #    5.057 G/sec
     <not counted>      cpu_atom/cycles/                                              (0.00%)
         1,029,910      cpu_core/instructions/    #    6.001 G/sec
     <not counted>      cpu_atom/instructions/                                        (0.00%)
           202,177      cpu_core/branches/        #    1.178 G/sec
     <not counted>      cpu_atom/branches/                                            (0.00%)
             6,623      cpu_core/branch-misses/   #   38.588 M/sec
     <not counted>      cpu_atom/branch-misses/                                       (0.00%)

       0.000318438 seconds time elapsed

       0.000341000 seconds user
       0.000000000 seconds sys
dave@intel:~/obj$ ./bin/rr record date
rr: Saving execution to trace directory `/home/dave/.local/share/rr/date-1'.
[FATAL src/PerfCounters.cc:378:check_working_counters() errno: EDOM]
Got 0 branch events, expected at least 500.

The hardware performance counter seems to not be working. Check
that hardware performance counters are working by running
  perf stat -e r5111c4 true
and checking that it reports a nonzero number of events.
If performance counters seem to be working with 'perf', file an
rr issue, otherwise check your hardware/OS/VM configuration. Also
check that other software is not using performance counters on
this CPU.
=== Start rr backtrace:
./bin/rr(_ZN2rr13dump_rr_stackEv+0x5a)[0x55f65f89530a]
./bin/rr(_ZN2rr15notifying_abortEv+0x14)[0x55f65f8977e4]
./bin/rr(+0x1f4854)[0x55f65f8b3854]
./bin/rr(_ZN2rr12PerfCounters5resetEl+0xc1c)[0x55f65f7a11cc]
./bin/rr(_ZN2rr4Task16resume_executionENS_13ResumeRequestENS_11WaitRequestENS_12TicksRequestEi+0x7f2)[0x55f65f869592]
./bin/rr(_ZN2rr13RecordSession13task_continueERKNS0_9StepStateE+0x33e)[0x55f65f7a796e]
./bin/rr(_ZN2rr13RecordSession11record_stepEv+0x34c)[0x55f65f7b6bec]
./bin/rr(_ZN2rr13RecordCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0xc5d)[0x55f65f7a5e9d]
./bin/rr(main+0x1c8)[0x55f65f70bcc8]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fbe93dadd90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fbe93dade40]
./bin/rr(_start+0x25)[0x55f65f70be85]
=== End rr backtrace
Aborted (core dumped)
dave@intel:~/obj$ sudo perf stat -e r5111c4 true
WARNING: event 'N/A' not valid (bits 16,20,22 of config '5111c4' not supported by kernel)!

 Performance counter stats for 'true':

            93,138      cpu_core/r5111c4/
     <not counted>      cpu_atom/r5111c4/                                             (0.00%)

       0.000496501 seconds time elapsed

       0.000514000 seconds user
       0.000000000 seconds sys

Workaround for the i9-12900K, where CPUs 0-15 are the P-core threads:

taskset -c 0-15 rr record [whatever]

Binding rr to a single P-core CPU seems to work too.

./bin/rr record --bind-to-cpu=0 date

Disabling the E cores works too, though I'm not sure whether doing so has any performance benefit.

for i in {16..23}; do echo 0 | sudo tee /sys/devices/system/cpu/cpu${i}/online; done

Source: https://unix.stackexchange.com/questions/686459/disable-intel-alder-lake-efficiency-cores-on-linux
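
The `0-15` range in the `taskset` workaround is specific to the i9-12900K. Below is a minimal sketch of deriving the P-core list at run time instead. It assumes the kernel exposes the hybrid PMUs named in the `perf stat` output (`cpu_core`, `cpu_atom`) as `/sys/devices/<pmu>/cpus`; the function name `p_core_cpus` and the `SYSFS_ROOT` override are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Print the CPU list of the P cores (e.g. "0-15" on an i9-12900K).
# Assumption: hybrid-aware kernels expose /sys/devices/cpu_core/cpus.
# SYSFS_ROOT is a hypothetical override so the function can be tried on a fake tree.
p_core_cpus() {
  local root="${SYSFS_ROOT:-/sys/devices}"
  if [ -r "$root/cpu_core/cpus" ]; then
    # Hybrid machine: only the CPUs backed by the cpu_core PMU.
    cat "$root/cpu_core/cpus"
  elif [ -r "$root/system/cpu/online" ]; then
    # No cpu_core PMU: fall back to every online CPU.
    cat "$root/system/cpu/online"
  else
    return 1
  fi
}

# Usage (not executed here):
#   taskset -c "$(p_core_cpus)" rr record date
```

On a non-hybrid machine the fallback makes the `taskset` call a no-op restriction, so the same wrapper could be used everywhere.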

Related: #2997 #3032

Manouchehri, Jul 24 '22 23:07

Can you see if r517ec4 works for the E cores?

Keno, Jul 24 '22 23:07


dave@intel:~$ taskset -c 16-23 perf stat -e r517ec4 true
WARNING: event 'N/A' not valid (bits 16,20,22 of config '517ec4' not supported by kernel)!

 Performance counter stats for 'true':

     <not counted>      cpu_core/r517ec4/                                             (0.00%)
            94,944      cpu_atom/r517ec4/

       0.000518950 seconds time elapsed

       0.000551000 seconds user
       0.000000000 seconds sys

Manouchehri, Jul 24 '22 23:07

Alright, that seems to be working. We just need to hook up the core-specific perf-counter selection that we already do on ARM. Somewhat annoyingly, Intel's microcode updates make these chips report the same CPUID on both core types, though there is a different CPUID leaf that can be used to tell them apart: https://www.intel.com/content/www/us/en/developer/articles/guide/12th-gen-intel-core-processor-gamedev-guide.html
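
To illustrate what core-specific selection means here, a rough shell sketch follows. It is not rr's implementation, only a way to reproduce the two `perf stat` experiments in this thread for an arbitrary CPU. The event codes `r5111c4` and `r517ec4` are the ones tested above; the helper names `cpu_in_list` and `branch_event_for_cpu` are hypothetical, and the E-core CPU list is passed in by the caller (for example `16-23` on the i9-12900K).

```shell
#!/usr/bin/env bash
# Succeed if CPU number $1 is contained in a kernel-style CPU list $2,
# e.g. "16-23" or "0-3,8,10-11".
cpu_in_list() {
  local cpu="$1" list="$2" range lo hi
  local IFS=,
  for range in $list; do
    lo=${range%-*}   # "10-11" -> "10"; a single "8" stays "8"
    hi=${range#*-}   # "10-11" -> "11"; a single "8" stays "8"
    if [ "$cpu" -ge "$lo" ] && [ "$cpu" -le "$hi" ]; then
      return 0
    fi
  done
  return 1
}

# Print the raw branch event to use on CPU $1, given the E-core CPU list $2.
branch_event_for_cpu() {
  local cpu="$1" atom_cpus="$2"
  if [ -n "$atom_cpus" ] && cpu_in_list "$cpu" "$atom_cpus"; then
    echo r517ec4   # counted on cpu_atom in this thread
  else
    echo r5111c4   # counted on cpu_core in this thread
  fi
}

# Usage (not executed here):
#   taskset -c 16 perf stat -e "$(branch_event_for_cpu 16 16-23)" true
```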

Keno, Jul 24 '22 23:07

I have run into this same issue with an i7-1250U. At least I found this ticket before creating a duplicate. Pinning to core 0 seems a sufficient workaround.

With intel-microcode 3.20230512.1 and Linux 6.1.0-10 (Debian) the cpuid utility (version 20230120) reports this as:

$ cpuid  | egrep -i 'hybrid|core type'
      hybrid part                              = true
      core type               = Intel Core
      hybrid part                              = true
      core type               = Intel Core
      hybrid part                              = true
      core type               = Intel Core
      hybrid part                              = true
      core type               = Intel Core
      hybrid part                              = true
      core type               = Intel Atom
      hybrid part                              = true
      core type               = Intel Atom
      hybrid part                              = true
      core type               = Intel Atom
      hybrid part                              = true
      core type               = Intel Atom
      hybrid part                              = true
      core type               = Intel Atom
      hybrid part                              = true
      core type               = Intel Atom
      hybrid part                              = true
      core type               = Intel Atom
      hybrid part                              = true
      core type               = Intel Atom

Note "Core" vs. "Atom". I have hyperthreading enabled, so the two P cores appear as 4.
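
The filtered output above does not show which CPU number each line belongs to. A small sketch that pairs each `core type` line with its CPU is given below. It assumes the `cpuid` utility starts each per-CPU section with a header line of the form `CPU 0:`; that header format is an assumption, not something shown in this thread, and the function name `core_types` is hypothetical.

```shell
#!/usr/bin/env bash
# Read `cpuid` output on stdin and print "<cpu>\t<core type>" per CPU.
# Assumption: each per-CPU section begins with a line like "CPU 0:".
core_types() {
  awk '
    /^CPU [0-9]+:/ { cpu = $2; sub(/:$/, "", cpu) }            # remember current CPU number
    /core type/    { sub(/^.*= */, ""); print cpu "\t" $0 }    # keep only the value after "="
  '
}

# Usage (not executed here):
#   cpuid | core_types
```

CPUs reported as `Intel Core` are the ones where pinning rr (for example with `--bind-to-cpu`) is expected to work, going by the results in this thread.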

mdavidsaver, Aug 10 '23 20:08

Same here on an i7-13700H, and the `taskset -c 0-15 rr record [whatever]` workaround works.

$ perf stat -e r5111c4 true
WARNING: event 'N/A' not valid (bits 16,20,22 of config '5111c4' not supported by kernel)!

 Performance counter stats for 'true':

           145 702      cpu_core/r5111c4/                                                     
     <not counted>      cpu_atom/r5111c4/                                                       (0,00%)

       0,001158560 seconds time elapsed

       0,001197000 seconds user
       0,000000000 seconds sys

jtunhag, Nov 13 '23 19:11