
A failure return when using the command "pcm 1 -i=1" on CentOS stream 8.

Open thebestsuper opened this issue 2 years ago • 2 comments

Hello, I am running the pcm tool on an Intel CPU (Ice Lake platform). The problem occurs randomly. When it occurs, the output always looks like this: "Try using Linux performance events..." "Debug: Catch signal breaks (segmentation fault)."

I built the pcm tool on CentOS 8.5 and run it on CentOS Stream 8. The attached files are the pcm log and my os-release information.

Please help me check whether this is a problem in the pcm tool. If there is anything I can do to help, please let me know and I will do it as soon as possible.

Best Regards,
thebestsuper

Attachments: os-release.txt, Intel_UPE_SpecificationsChart_2022_08_04.csv, PCM_QPI_dump_20220804_1212_error.txt

thebestsuper avatar Aug 04 '22 06:08 thebestsuper

Thanks for reporting. We are looking into it. Does it still crash if you run it with the PCM_NO_PERF=1 environment variable set, as follows?

PCM_NO_PERF=1 pcm 1 -i=1

opcm avatar Aug 04 '22 07:08 opcm

Sorry for the late reply. After adding "PCM_NO_PERF=1", the failure no longer occurs. From the parameter description, I only know that it makes pcm program the core PMU without the Linux perf events API (which is used by default). I wonder whether this change causes any difference in the results compared with the earlier runs without "PCM_NO_PERF=1". If the results are the same as before, I have no further concerns.

Best Regards thebestsuper

thebestsuper avatar Aug 09 '22 08:08 thebestsuper

Sorry for the delayed reply. In this mode a different driver is used but the data output should be the same. Nevertheless we should understand the issue. Unfortunately I was not able to reproduce it. It would be very much appreciated if you could help with additional information: the call stack of the crash.

Please install gdb on your system. Then compile pcm in debug mode:

mkdir Debug
cd Debug
cmake -DCMAKE_BUILD_TYPE=Debug ..
make

Then run pcm in gdb:

gdb bin/pcm
run 1 -i=1

When it crashes, type bt and share the call stack that the bt command prints.
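
Because the crash is intermittent, the gdb steps above may have to be repeated several times before a backtrace is captured. The following is only a minimal sketch of how that could be automated, assuming bash and gdb are available. The helper name run_until_match, its arguments, and the log file naming are hypothetical and not part of pcm. It repeats a command, saves the output of every run, and stops at the first run whose output contains a given pattern.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pcm): repeat a command until its output
# contains a pattern, keeping the output of every run in a log file.
# Arguments: pattern, maximum number of runs, log directory, then the command.
# Prints the 1-based number of the matching run, or 0 if no run matched.
run_until_match() {
    local pattern=$1 max_runs=$2 log_dir=$3
    shift 3
    local i
    for ((i = 1; i <= max_runs; i++)); do
        # Capture stdout and stderr of this run; the exit status is ignored
        # because only the captured output is inspected.
        "$@" > "$log_dir/run_$i.log" 2>&1 || true
        if grep -q -- "$pattern" "$log_dir/run_$i.log"; then
            echo "$i"
            return 0
        fi
    done
    echo 0
}
```

One possible use from the Debug build directory is: run_until_match SIGSEGV 20 /tmp gdb -batch -ex run -ex bt --args bin/pcm 1 -i=1. Here gdb runs non-interactively, executes run and then bt, and reports "Program received signal SIGSEGV" when the program segfaults, so the matching log file should contain the backtrace. The limit of 20 runs and the /tmp log directory are arbitrary choices.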

Thanks a lot in advance

opcm avatar Aug 15 '22 12:08 opcm

Hi developers, sorry for the late reply. I ran pcm under gdb on my machine, and the log is in the attachment. Since the problem happens randomly, I tried several rounds, with about 5 seconds between rounds.

The figure below shows the failed state.
[image: PCM_OnGdb_Fail]

The figure below shows the passing state.
[image: PCM_OnGdb_PASS]

thebestsuper avatar Aug 22 '22 08:08 thebestsuper

Your "this" pointer is bogus in the failed state. Can you type in the command "bt" when it fails again?

ogbrugge avatar Aug 22 '22 09:08 ogbrugge

Sorry, I forgot to use the "bt" command. Below is the failed state again, this time with the output of the "bt" command.

gdb.txt

thebestsuper avatar Aug 23 '22 05:08 thebestsuper

Thanks a lot. The 'bt' output was very helpful. I was able to identify the issue, and we will work on a fix.

opcm avatar Aug 23 '22 07:08 opcm

Could you please try the latest version from the master branch? It should resolve the issue.

opcm avatar Aug 24 '22 14:08 opcm

Hi developers, after switching to the new pcm tool on the machine, the problem no longer occurs. Thanks for solving this problem.

thebestsuper avatar Aug 29 '22 12:08 thebestsuper