pcm
pcm fails intermittently when running the command "pcm 1 -i=1" on CentOS Stream 8.
Hello, I am running the pcm tool on an Intel CPU; the platform is Ice Lake. The tool fails at random. When it fails, the output always looks like this: "Try using Linux performance events..." "Debug: Catch signal breaks (segmentation fault)."
I built pcm on CentOS 8.5 and I run it on CentOS Stream 8. The attached files are the pcm log and my os-release information.
Please help me check whether this is a problem in the pcm tool. If there is anything I can do to help, please let me know and I will do it as soon as possible.
Best Regards
thebestsuper
os-release.txt Intel_UPE_SpecificationsChart_2022_08_04.csv PCM_QPI_dump_20220804_1212_error.txt
Thanks for reporting. We are looking into it. Does it still crash if you run it with the PCM_NO_PERF=1 environment variable set:
PCM_NO_PERF=1 pcm 1 -i=1
Sorry for replying so late. After adding "PCM_NO_PERF=1", the failure no longer occurs. From the parameter description I only know that it makes pcm program the core PMU without using the Linux perf events API (which is used by default). I wonder whether this change could make the results differ from my earlier tests run without "PCM_NO_PERF=1". If the results are the same as before, I have no further problems.
Best Regards thebestsuper
Sorry for the delayed reply. In this mode a different driver is used, but the data output should be the same. Nevertheless, we should understand the issue. Unfortunately, I was not able to reproduce it. It would be very much appreciated if you could help with one additional piece of information: the call stack of the crash.
Please install gdb on your system. Then compile pcm in debug mode:
mkdir Debug
cd Debug
cmake -DCMAKE_BUILD_TYPE=Debug ..
make
Then run pcm in gdb:
gdb bin/pcm
run 1 -i=1
When it crashes, type bt and share the crash call stack that the bt command prints.
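For a crash that only shows up occasionally, the gdb steps above can also be run without typing at the prompt. The following is a minimal sketch, not part of pcm: `collect_backtrace` is a hypothetical helper name, and the `GDB` variable is an assumed override hook. It relies only on standard gdb options (`-batch`, `-ex`, `--args`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a program under gdb non-interactively and print
# a backtrace if gdb stops on a crash such as SIGSEGV.
# Usage: collect_backtrace program [args...]
collect_backtrace() {
    # GDB is an assumed override in case gdb is installed under another name.
    # -batch  : execute the -ex commands, then exit without prompting
    # -ex run : start the program with the arguments given after --args
    # -ex bt  : print the call stack of the stopped program
    "${GDB:-gdb}" -batch -ex run -ex bt --args "$@" 2>&1
}
```

From the Debug directory this could be called as `collect_backtrace bin/pcm 1 -i=1 > gdb.txt`. If the program exits normally instead of crashing, gdb has no stack to print, so the output contains no backtrace for that run.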
Thanks a lot in advance
Hi developers,
Sorry for replying so late. I used gdb to collect pcm information on my machine, and the log is in the attachment.
Since it happens randomly, I tried several rounds with about 5 seconds between each round.
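Repeating rounds like this can be scripted. The sketch below is only an illustration: `run_until_failure` is a hypothetical helper name, and it assumes that a failed run exits with a nonzero status, which this thread does not confirm for pcm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rerun a command until one round fails.
# Usage: run_until_failure MAX_ROUNDS PAUSE_SECONDS command [args...]
run_until_failure() {
    local max_rounds=$1 pause=$2
    shift 2
    local round status
    for ((round = 1; round <= max_rounds; round++)); do
        "$@"
        status=$?
        # Assumption: a failed round is visible as a nonzero exit status.
        if ((status != 0)); then
            echo "round ${round}: exit status ${status}"
            return 0
        fi
        sleep "$pause"
    done
    echo "no failure in ${max_rounds} rounds"
    return 1
}
```

For example, `run_until_failure 20 5 pcm 1 -i=1` would try up to 20 rounds with 5 seconds between them and stop at the first round that fails.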
The figure below shows the failed state.
The figure below shows the PASS state.
Your "this" pointer is bogus in the failed state. Can you type in the command "bt" when it fails again?
Sorry, I forgot to use the "bt" command to get the information. The figure below shows the failed state, with the output of the 'bt' command. gdb.txt
Thanks a lot. The 'bt' output was very helpful. I was able to identify the issue and we will work on a fix.
Could you please try the latest version from the master branch? It should resolve the issue.
Hi developers, after using the new pcm tool on the machine, the problem no longer occurs. Thanks for solving this problem.