PerfCounters: Add support for AMD Family 15h Model 2 (Piledriver)
Extends the existing Family 15h Model 30h (Steamroller) support to Piledriver. Piledriver supports PMCx0C4 (Retired Taken Branch Instructions) and PMCx0C6 (Retired Far Control Transfer), just like Model 30h.
Note that PMCx0C4 counts all control flow changes, including exceptions and interrupts. AFAICT on 15h, there is no PMC for just retired conditional branches.
Tested:
- counters-test:
  vsrinivas@ubuntu:~/tmp/rr/src/counters-test$ cc -O2 counters.c
  vsrinivas@ubuntu:~/tmp/rr/src/counters-test$ sudo ./a.out
  Ticks mismatch; got 1003, expected 1002
  Aborted
Varying the number of volatile matches, we always see one more tick than expected, which I think is a RET instruction.
- ctest: 97% tests passed, 45 tests failed out of 1425
  Total Test time (real) = 3092.96 sec
  The following tests FAILED:
    53 - x86/chew_cpu_cpuid-no-syscallbuf (Failed)
    110 - detach_state (Failed)
    162 - x86/fault_in_code_page (Failed)
    558 - x86/rdtsc_flags (Failed)
    724 - sioc (Failed)
    725 - sioc-no-syscallbuf (Failed)
    842 - vsyscall (Failed)
    843 - vsyscall-no-syscallbuf (Failed)
    844 - vsyscall_timeslice (Failed)
    845 - vsyscall_timeslice-no-syscallbuf (Failed)
    846 - x86/x87env (Failed)
    847 - x86/x87env-no-syscallbuf (Failed)
    888 - async_signal_syscalls (Failed)
    890 - async_signal_syscalls2 (Failed)
    910 - x86/blocked_sigsegv (Failed)
    916 - breakpoint_overlap (Failed)
    924 - checkpoint_dying_threads (Failed)
    932 - clone_interruption (Failed)
    938 - conditional_breakpoint_offload (Failed)
    939 - conditional_breakpoint_offload-no-syscallbuf (Failed)
    951 - daemon_read-no-syscallbuf (Failed)
    962 - dlopen (Failed)
    980 - exit_race (Failed)
    981 - exit_race-no-syscallbuf (Failed)
    984 - x86/explicit_checkpoints (Failed)
    1072 - x86/rdtsc_loop (Failed)
    1080 - reverse_continue_breakpoint (Failed)
    1081 - reverse_continue_breakpoint-no-syscallbuf (Failed)
    1089 - reverse_step_long-no-syscallbuf (Failed)
    1092 - reverse_step_threads_break (Failed)
    1100 - rseq_syscallbuf (Failed)
    1130 - strict_priorities (Failed)
    1131 - strict_priorities-no-syscallbuf (Failed)
    1150 - x86/syscallbuf_rdtsc_page (Failed)
    1174 - thread_open_race (Failed)
    1206 - watchpoint_at_sched (Failed)
    1208 - watchpoint_before_signal (Failed)
    1209 - watchpoint_before_signal-no-syscallbuf (Failed)
    1218 - async_signal_syscalls_100 (Failed)
    1219 - async_signal_syscalls_100-no-syscallbuf (Failed)
    1320 - record_replay (Failed)
    1321 - record_replay-no-syscallbuf (Failed)
    1354 - reverse_watchpoint_syscall (Failed)
    1418 - vsyscall_singlestep (Failed)
    1419 - vsyscall_singlestep-no-syscallbuf (Failed)
Note that PMCx0C4 counts all control flow changes, including exceptions and interrupts. AFAICT on 15h, there is no PMC for just retired conditional branches.
That's going to be a fatal problem if true because interrupts are not deterministic.
From the 15h Model 00-0Fh BKDG:
PMCx0C4 Retired Taken Branch Instructions
PERF_CTL[5:0]. The number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts.
What we do right now (for model 30h) is subtract out PMCx0C6 (minus_ticks_cntr_event), which counts "The number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts". This patch does the same for 15h model 2.
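The subtraction described above can be sketched in a few lines of C. This is only an illustration of the arithmetic, not rr's actual code: `struct pmc_sample`, `ticks_from_sample`, and `ticks_example` are hypothetical names, and clamping to zero when the far-transfer count exceeds the taken-branch count is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one reading of the two counters discussed above. */
struct pmc_sample {
    uint64_t taken_branches; /* PMCx0C4: retired taken branches, incl. exceptions and interrupts */
    uint64_t far_transfers;  /* PMCx0C6: retired far control transfers, incl. exceptions and interrupts */
};

/*
 * Tick count after subtracting far control transfers from taken branches.
 * Returning 0 for an inconsistent sample is a choice made for this sketch only.
 */
uint64_t ticks_from_sample(struct pmc_sample s) {
    if (s.far_transfers > s.taken_branches) {
        return 0;
    }
    return s.taken_branches - s.far_transfers;
}

/* Usage: 1005 taken branches and 3 far transfers yield 1002 ticks. */
void ticks_example(void) {
    struct pmc_sample s = { 1005, 3 };
    assert(ticks_from_sample(s) == 1002);
}
```

Whether the result is deterministic depends on every exception and interrupt counted by PMCx0C4 also being counted by PMCx0C6, which is the open question in this thread.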
@vsrinivas Do you still have access to that CPU, and could you rebase and retest to see which tests still fail? Was the resulting rr usable for you on that machine or not?