PerfCounters: Add support for AMD Family 15h Model 2 (Piledriver)

Open vsrinivas opened this issue 3 years ago • 3 comments

Extends the existing Family 15h Model 30h (Steamroller) support to Piledriver. Piledriver supports PMCx0C4 (Retired Taken Branch Instructions) and PMCx0C6 (Retired Far Control Transfer), just like Model 30h.

Note that PMCx0C4 counts all control flow changes, including exceptions and interrupts. AFAICT on 15h, there is no PMC for just retired conditional branches.
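For readers unfamiliar with the PMCx notation, here is a minimal sketch of how these two event selects map onto a raw perf event config value. It assumes the usual AMD PERF_CTL layout (EventSelect[7:0] in bits 7:0, UnitMask in bits 15:8, EventSelect[11:8] in bits 35:32). The helper and constant names are hypothetical and are not taken from rr's source.

```cpp
#include <cstdint>

// Hypothetical helper: pack an AMD event select and unit mask into a raw
// config value, assuming the PERF_CTL bit layout described in the lead-in.
constexpr uint64_t amd_raw_event(uint32_t event_select, uint32_t unit_mask = 0) {
  return (static_cast<uint64_t>(event_select) & 0xFF) |          // EventSelect[7:0]
         ((static_cast<uint64_t>(unit_mask) & 0xFF) << 8) |      // UnitMask
         ((static_cast<uint64_t>(event_select) & 0xF00) << 24);  // EventSelect[11:8] -> bits 35:32
}

// The two events discussed in this thread (no unit mask).
constexpr uint64_t kRetiredTakenBranches = amd_raw_event(0x0C4);
constexpr uint64_t kRetiredFarControlTransfers = amd_raw_event(0x0C6);
```

Both events have a zero high nibble and no unit mask, so the raw values are simply 0xC4 and 0xC6.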

Tested:

  1. counters-test:

         vsrinivas@ubuntu:~/tmp/rr/src/counters-test$ cc -O2 counters.c
         vsrinivas@ubuntu:~/tmp/rr/src/counters-test$ sudo ./a.out
         Ticks mismatch; got 1003, expected 1002
         Aborted

    Varying the number of volatile matches, we always see one more tick than expected, which I think is a RET instruction.

  2. ctest: 97% tests passed, 45 tests failed out of 1425

Total Test time (real) = 3092.96 sec

The following tests FAILED:
      53 - x86/chew_cpu_cpuid-no-syscallbuf (Failed)
     110 - detach_state (Failed)
     162 - x86/fault_in_code_page (Failed)
     558 - x86/rdtsc_flags (Failed)
     724 - sioc (Failed)
     725 - sioc-no-syscallbuf (Failed)
     842 - vsyscall (Failed)
     843 - vsyscall-no-syscallbuf (Failed)
     844 - vsyscall_timeslice (Failed)
     845 - vsyscall_timeslice-no-syscallbuf (Failed)
     846 - x86/x87env (Failed)
     847 - x86/x87env-no-syscallbuf (Failed)
     888 - async_signal_syscalls (Failed)
     890 - async_signal_syscalls2 (Failed)
     910 - x86/blocked_sigsegv (Failed)
     916 - breakpoint_overlap (Failed)
     924 - checkpoint_dying_threads (Failed)
     932 - clone_interruption (Failed)
     938 - conditional_breakpoint_offload (Failed)
     939 - conditional_breakpoint_offload-no-syscallbuf (Failed)
     951 - daemon_read-no-syscallbuf (Failed)
     962 - dlopen (Failed)
     980 - exit_race (Failed)
     981 - exit_race-no-syscallbuf (Failed)
     984 - x86/explicit_checkpoints (Failed)
    1072 - x86/rdtsc_loop (Failed)
    1080 - reverse_continue_breakpoint (Failed)
    1081 - reverse_continue_breakpoint-no-syscallbuf (Failed)
    1089 - reverse_step_long-no-syscallbuf (Failed)
    1092 - reverse_step_threads_break (Failed)
    1100 - rseq_syscallbuf (Failed)
    1130 - strict_priorities (Failed)
    1131 - strict_priorities-no-syscallbuf (Failed)
    1150 - x86/syscallbuf_rdtsc_page (Failed)
    1174 - thread_open_race (Failed)
    1206 - watchpoint_at_sched (Failed)
    1208 - watchpoint_before_signal (Failed)
    1209 - watchpoint_before_signal-no-syscallbuf (Failed)
    1218 - async_signal_syscalls_100 (Failed)
    1219 - async_signal_syscalls_100-no-syscallbuf (Failed)
    1320 - record_replay (Failed)
    1321 - record_replay-no-syscallbuf (Failed)
    1354 - reverse_watchpoint_syscall (Failed)
    1418 - vsyscall_singlestep (Failed)
    1419 - vsyscall_singlestep-no-syscallbuf (Failed)

vsrinivas, Nov 23 '22 01:11

> Note that PMCx0C4 counts all control flow changes, including exceptions and interrupts. AFAICT on 15h, there is no PMC for just retired conditional branches.

That's going to be a fatal problem if true because interrupts are not deterministic.

khuey, Nov 23 '22 01:11

From the 15h Model 00-0Fh BKDG:

PMCx0C4 Retired Taken Branch Instructions
PERF_CTL[5:0]. The number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts

What we do right now (for model 30h) is subtract out PMCx0C6 (minus_ticks_cntr_event), which counts "The number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts". This patch does the same for 15h model 2.
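The subtraction described above can be sketched as follows. This is only an illustration of the arithmetic, not rr's actual implementation: the struct and function names are hypothetical, and it assumes every far control transfer counted by PMCx0C6 is also counted by PMCx0C4, so the difference never underflows.

```cpp
#include <cstdint>

// Hypothetical snapshot of the two raw hardware counters.
struct RawCounts {
  uint64_t taken_branches;         // PMCx0C4: includes exceptions and interrupts
  uint64_t far_control_transfers;  // PMCx0C6: far call/jump/return, IRET,
                                   // SYSCALL, SYSRET, exceptions, interrupts
};

// Ticks with the far control transfers (and therefore interrupts and
// exceptions) subtracted out of the taken-branch count.
// Assumption: far_control_transfers <= taken_branches.
inline uint64_t adjusted_ticks(const RawCounts& c) {
  return c.taken_branches - c.far_control_transfers;
}
```

For example, a raw reading of 1010 taken branches with 8 far control transfers yields 1002 ticks.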

vsrinivas, Nov 23 '22 01:11

@vsrinivas I wonder whether you have access to that CPU and could rebase and retest to check for failing tests. Was the resulting rr usable for you on that machine or not?

GitMensch, Oct 25 '23 08:10