Latest Toplev reports incorrect Time and Ip* metrics
A new bug was introduced: toplev now reports a zero Time metric and incorrect Ip* info metrics. Reproducer on ICX, with before/after output below.
**New toplev (TMA 4.3) shows zero Time and incorrect IpTB and IpMispredict**
```
$ ./pmu-tools/toplev.py --no-desc -vl1 --nodes '+CoreIPC,+Instructions,+CORE_CLKS,+CPU_Utilization,+Time,+MUX,+L2MPKI,+IpTB,+IpMispredict' -V CLTRAMP3D-t7.toplev-vl1-perf.csv --metric-group +Summary -- ./pmu-tools/workloads/CLTRAMP3D
# 4.3-full-perf on Genuine Intel(R) CPU $0000%@ [icx/icelake]
FE              Frontend_Bound     % Slots          47.7  <==
BAD             Bad_Speculation    % Slots          10.5  <
BE              Backend_Bound      % Slots          11.6  <
RET             Retiring           % Slots          30.3  <
Info.Core       CoreIPC            Core_Metric       1.58
Info.Inst_Mix   Instructions       Count    24,788,089,890
Info.Inst_Mix   IpTB               Inst_Metric       1.00
Info.Thread     IPC                Metric            1.58
Info.System     CPU_Utilization    Metric            1.00
Info.System     Time               Seconds           0.00
Info.Core       IpMispredict       Inst_Metric       0.00
Info.Core       CORE_CLKS          Count    15,708,357,407
Info.Memory     L2MPKI             Metric            0.81
MUX                                %               100.00
Run toplev --describe Frontend_Bound^ to get more information on bottleneck
Add --run-sample to find locations
Add --nodes '!+Frontend_Bound*/2' for breakdown.
```
**Previous toplev (TMA 4.2) shows valid Time and correct IpTB and IpMispredict**
```
admin1@icx-srv03:~/ayasin/perf-tools$ ./perf-tools/pmu-tools/toplev.py --no-desc -vl1 --nodes '+CoreIPC,+Instructions,+CORE_CLKS,+CPU_Utilization,+Time,+MUX,+L2MPKI,+IpTB,+IpMispredict' -V CLTRAMP3D-t7.toplev-vl1-perf.csv --metric-group +Summary -- ./pmu-tools/workloads/CLTRAMP3D
# 4.2-full on Genuine Intel(R) CPU $0000%@ [icx/icelake]
FE              Frontend_Bound     % Slots            47.7  <==
BAD             Bad_Speculation    % Slots            10.1  <
BE              Backend_Bound      % Slots            11.6  <
RET             Retiring           % Slots            30.6  <
Info.Core       CoreIPC            CoreMetric          1.58
Info.Inst_Mix   Instructions       Count    24,752,063,989.0
Info.Thread     IPC                Metric              1.58
Info.System     CPU_Utilization    Metric              1.00
Info.System     Time               Seconds             5.54
Info.Thread     IpTB               Metric              8.17
Info.Core       IpMispredict       Metric            301.46
Info.Core       CORE_CLKS          Count    15,681,011,129.0
Info.Memory     L2MPKI             Metric              0.81
MUX                                %                 100.00
Run toplev --describe Frontend_Bound^ to get more information on bottleneck
Add --run-sample to find locations
Add --nodes '!+Frontend_Bound*/2' for breakdown.
```
Cannot reproduce here. Do you still see it?
It might be related to an old perf version that doesn't support the `duration_time` event; check `perf list | grep duration_time`.
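The check above can be scripted; a minimal shell sketch, assuming a Linux machine with `perf` on the PATH (the echoed messages are illustrative, not toplev output):

```shell
# Diagnostic sketch: toplev derives the Time metric from perf's
# duration_time software event; older perf builds do not list it.
if perf list 2>/dev/null | grep -q duration_time; then
  echo "duration_time: supported"
else
  echo "duration_time: missing (could explain Time reported as 0.00)"
fi
```

If the event is missing, updating perf (or the kernel tools package) is the first thing to try before filing the bug against toplev itself.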