Perfevent configuration error for AMD chips
Hello, we're using Performance Co-Pilot (PCP) as a tool in one of our projects, but we're having trouble monitoring AMD PMU events. I'd be very grateful if you could spare some time to help me with this issue.
We list all the available PMUs on our machines and generally use "hardware-specific" PMUs to monitor some predefined events. We update the /var/lib/pcp/pmdas/perfevent/perfevent.conf file and reinstall the agent with our configuration. This works well for Intel CPUs but not for AMD CPUs. Here are the details.
This is the perfevent.conf file on our Intel machine; when we install it, it works.
This is one of our AMD machines; as you can see, it gives errors.
With PCP we list all available PMUs along with their available events.
This is for Intel:
And this is for AMD:
PS: kernel.perf_event_paranoid is -1 on all of my machines.
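For anyone wanting to confirm the same setting, here is a minimal sketch that reads the value from procfs. It assumes the setting meant here is the Linux `kernel.perf_event_paranoid` sysctl, exposed at `/proc/sys/kernel/perf_event_paranoid`; the helper name `read_paranoid` is made up for illustration.

```python
from pathlib import Path

# Assumed location: the procfs view of the kernel.perf_event_paranoid sysctl.
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def read_paranoid(path=PARANOID_PATH):
    """Return the perf_event_paranoid level as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file (non-Linux host), permission problem, or unexpected content.
        return None


if __name__ == "__main__":
    print("perf_event_paranoid =", read_paranoid())
```

A value of -1 is the least restrictive level. Note that this only reports what an interactive shell sees; it says nothing about what a daemon is allowed to do.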
I can monitor [perf] events successfully on both computers, but that is not what we're interested in. My guess is that the AMD PMU names are not recognized by PCP, but we haven't been able to fix it.
Thanks in advance. Osman
@jpwhite4 @hkshaw1990 any clues for our friend here?
Hello again. We have been trying on different AMD machines with different architectures, but still no luck. Do you have any updates on this issue?
@Osmanyasal seems like not - if you could provide a remote login to such a system, I could take a quick look for you.
I don't think I can, because they're our school's computers. If you can point us to a starting point, we can check it ourselves to understand what's wrong. @FatihTasyaran
@Osmanyasal I was able to get a reservation on an AMD machine today.
You'll find the list of supported names for your platform reported by the PCP perfevent agent in the file /var/log/pcp/pmcd/perfevent.log once it's been ./Install'd for the first time.
You should be able to find the events you're interested in there and add them to a new section of perfevent.conf for your processor family (in /var/lib/pcp/pmdas/perfevent). I had no problems doing so with latest PCP code, so hopefully this is enough to get you started too.
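As a rough aid for that lookup, the sketch below pulls every distinct token starting with a given prefix (for example `amd64`) out of the log text. The exact layout of perfevent.log is not shown in this thread, so the code makes no assumption about it beyond the names appearing as plain text; `find_pmu_tokens` is a hypothetical helper, not part of PCP.

```python
import re


def find_pmu_tokens(log_text, prefix):
    """Return sorted unique tokens in log_text that start with prefix.

    A token is the prefix followed by letters, digits, or underscores, so
    "amd64_fam17h::SOME_EVENT" contributes "amd64_fam17h".
    """
    pattern = re.compile(r"\b" + re.escape(prefix) + r"[A-Za-z0-9_]*")
    return sorted(set(pattern.findall(log_text)))
```

Typical use would be to read /var/log/pcp/pmcd/perfevent.log into a string and call `find_pmu_tokens(text, "amd64")`; whatever comes back is a candidate section name for perfevent.conf.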
cheers.
That's great, it works now. But there is one issue: we use showevtinfo (program in pcp) to report PMU names along with their events, and this tool reports the PMU name as "amd64_fam17h_zen2 (AMD64 Fam17h Zen2)", so we took the first part as our PMU name. The log file, however, lists only amd64_fam17h (without _zen2).
However, when I checked the log file at /var/log/pcp/pmcd/perfevent.log on our Zen 3 machine, it lists only perf:: events. There are no other PMUs such as amd64_fam19h. Yet showevtinfo says the machine supports amd64_fam19h, with many related events. I installed PCP version 6.0.4-1. What could be the issue here?
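The naming mismatch can be handled mechanically. The sketch below takes a name as printed by showevtinfo, drops the parenthesised description, and tries progressively shorter underscore-separated prefixes against the set of names found in the log. Both function names are hypothetical, and the set of supported names is assumed to have been collected from perfevent.log beforehand.

```python
def candidate_pmu_names(reported):
    """Return the reported PMU name followed by shorter '_'-separated prefixes."""
    tokens = reported.split()
    if not tokens:
        return []
    parts = tokens[0].split("_")  # tokens[0] drops e.g. "(AMD64 Fam17h Zen2)"
    return ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]


def match_supported(reported, supported):
    """Return the longest candidate present in supported, or None if none match."""
    for name in candidate_pmu_names(reported):
        if name in supported:
            return name
    return None


if __name__ == "__main__":
    print(match_supported("amd64_fam17h_zen2 (AMD64 Fam17h Zen2)",
                          {"perf", "amd64_fam17h"}))
```

If the result is None, as it would be when the log lists only perf, no section name in perfevent.conf will match and the problem lies elsewhere.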
| [...] showevtinfo (program in pcp)
This isn't a program from PCP, so I don't know what it is listing. The PMDA logfile is the one source of truth for PCP, those are all the hardware events that the kernel tells us about.
| [...] what could be the issue here?
The only other possible thing that might be involved would be a security system like SELinux - it might be preventing events being visible from a daemon (like pmdaperfevent) that are visible in a less restricted context like an interactive shell.
Either way, I don't think there's a PCP issue here (we regularly test with SELinux here at Red Hat and there are no known issues).
Sorry for my misleading previous entry. showevtinfo is a demo program provided by libpfm4 that lists all available PMUs and related events on the system.
Since PCP uses libpfm4 for PMU event monitoring (as far as I know), I expected anything reported by libpfm4 to be valid for PCP as well, which it is.
I set kernel.perf_event_paranoid to -1 to see and report PMU events. All of this works on Zen 2 but not on our Zen 3 machine: PCP doesn't display any PMUs other than perf there, and I can't see why.
Would you elaborate on this phrase: "The only other possible thing that might be involved would be a security system like SELinux - it might be preventing events being visible from a daemon (like pmdaperfevent) that are visible in a less restricted context like an interactive shell."
Any breadcrumbs would be appreciated. Thank you in advance, Osman.
If it were an SELinux issue (unlikely), you would see lots of AVC errors in your syslog file when pmdaperfevent attempts access via the kernel interface.
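A quick way to run that check is to filter the log for denial records that mention the agent. This is only a sketch: it assumes denials carry the usual `avc:` and `denied` markers together with the process name, and `avc_denials` is a hypothetical helper. Which file to feed it (the syslog file, or the audit log on systems running auditd) depends on the distribution.

```python
def avc_denials(lines, comm="pmdaperfevent"):
    """Return the lines that look like SELinux AVC denials mentioning comm."""
    return [
        line.rstrip("\n")
        for line in lines
        if "avc:" in line and "denied" in line and comm in line
    ]
```

Pass it an open file object or a list of lines. An empty result makes an SELinux cause unlikely.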