
autoperf in 2022 on cascade lake cpu

Open tom-wegener opened this issue 3 years ago • 7 comments

Describe the bug

I want to use autoperf for my master's thesis, in which I look into the performance of applications with a focus on specific hardware parts. I want to profile a small program on my DUT, but autoperf does not produce all of the CSV files. I think it has something to do with the Cascade Lake CPU and autoperf trying to add "fc_mask" to the perf command for "uncore_iio_free_running_5". These are the two error messages that matter most for this issue:

[2022-04-21T13:27:34Z ERROR autoperf::profile] perf command: perf stat -aA -I 250 -x \';\' -o out/1_stat.csv 
-e uncore_m2m_1/name=uncore_m2m_1.UNC_M2M_VERT_RING_BL_IN_USE.DN_EVEN,event=0xaa,umask=0x4/S 
-e uncore_m2m_0/name=uncore_m2m_0.UNC_M2M_VERT_RING_BL_IN_USE.DN_EVEN,event=0xaa,umask=0x4/S 
-e uncore_cha_1/name=uncore_cha_1.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_8/name=uncore_cha_8.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_6/name=uncore_cha_6.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_4/name=uncore_cha_4.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_2/name=uncore_cha_2.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_0/name=uncore_cha_0.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_9/name=uncore_cha_9.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_7/name=uncore_cha_7.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_5/name=uncore_cha_5.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_cha_3/name=uncore_cha_3.UNC_H_HITME_MISS.READ_OR_INV,event=0x60,umask=0x80/S 
-e uncore_m3upi_0/name=uncore_m3upi_0.UNC_M3UPI_TxC_BL_FLQ_INSERTS.VN0_RSP,event=0x2e,umask=0x8/S 
-e uncore_m3upi_1/name=uncore_m3upi_1.UNC_M3UPI_TxC_BL_FLQ_INSERTS.VN0_RSP,event=0x2e,umask=0x8/S 
-e uncore_iio_free_running_5/name=uncore_iio_free_running_5.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_free_running_3/name=uncore_iio_free_running_3.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_free_running_1/name=uncore_iio_free_running_1.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_4/name=uncore_iio_4.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_2/name=uncore_iio_2.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_0/name=uncore_iio_0.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_free_running_4/name=uncore_iio_free_running_4.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_free_running_2/name=uncore_iio_free_running_2.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_5/name=uncore_iio_5.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_free_running_0/name=uncore_iio_free_running_0.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_3/name=uncore_iio_3.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e uncore_iio_1/name=uncore_iio_1.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7,ch_mask=0x2/S 
-e cpu/name=OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.REMOTE_HIT_FORWARD,event=0xb7,umask=0x1,offcore_rsp=0x83fc00100/S 
-e uncore_imc_5/name=uncore_imc_5.UNC_M_WR_CAS_RANK1.BANK9,event=0xb9,umask=0x9/S 
-e uncore_imc_3/name=uncore_imc_3.UNC_M_WR_CAS_RANK1.BANK9,event=0xb9,umask=0x9/S 
-e uncore_imc_1/name=uncore_imc_1.UNC_M_WR_CAS_RANK1.BANK9,event=0xb9,umask=0x9/S 
-e uncore_imc_4/name=uncore_imc_4.UNC_M_WR_CAS_RANK1.BANK9,event=0xb9,umask=0x9/S 
-e uncore_imc_2/name=uncore_imc_2.UNC_M_WR_CAS_RANK1.BANK9,event=0xb9,umask=0x9/S 
-e uncore_imc_0/name=uncore_imc_0.UNC_M_WR_CAS_RANK1.BANK9,event=0xb9,umask=0x9/S 
-e cpu/name=OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD,event=0xbb,umask=0x1,offcore_rsp=0x83fc00080/S 
-e uncore_cha_1/name=uncore_cha_1.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_8/name=uncore_cha_8.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_6/name=uncore_cha_6.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_4/name=uncore_cha_4.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_2/name=uncore_cha_2.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_0/name=uncore_cha_0.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_9/name=uncore_cha_9.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_7/name=uncore_cha_7.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_5/name=uncore_cha_5.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_3/name=uncore_cha_3.UNC_CHA_TOR_INSERTS.ALL_MISS,event=0x35,umask=0x25/S 
-e uncore_cha_1/name=uncore_cha_1.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_8/name=uncore_cha_8.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_6/name=uncore_cha_6.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_4/name=uncore_cha_4.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_2/name=uncore_cha_2.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_0/name=uncore_cha_0.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_9/name=uncore_cha_9.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_7/name=uncore_cha_7.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_5/name=uncore_cha_5.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_cha_3/name=uncore_cha_3.UNC_CHA_LLC_LOOKUP.WRITE,event=0x34,umask=0x5/S 
-e uncore_upi_1/name=uncore_upi_1.UNC_UPI_TxL_HDR_MATCH.NCB,event=0x4,umask=0xe/S 
-e uncore_upi_0/name=uncore_upi_0.UNC_UPI_TxL_HDR_MATCH.NCB,event=0x4,umask=0xe/S 
-e uncore_irp_3/name=uncore_irp_3.UNC_I_SNOOP_RESP.SNPCODE,event=0x12,umask=0x10/S 
-e uncore_irp_1/name=uncore_irp_1.UNC_I_SNOOP_RESP.SNPCODE,event=0x12,umask=0x10/S 
-e uncore_irp_4/name=uncore_irp_4.UNC_I_SNOOP_RESP.SNPCODE,event=0x12,umask=0x10/S 
-e uncore_irp_2/name=uncore_irp_2.UNC_I_SNOOP_RESP.SNPCODE,event=0x12,umask=0x10/S 
-e uncore_irp_0/name=uncore_irp_0.UNC_I_SNOOP_RESP.SNPCODE,event=0x12,umask=0x10/S 
-e uncore_irp_5/name=uncore_irp_5.UNC_I_SNOOP_RESP.SNPCODE,event=0x12,umask=0x10/S 
-e uncore_ubox/name=uncore_ubox.UNC_U_EVENT_MSG.IPI_RCVD,event=0x42,umask=0x4/S 
-e uncore_cha_1/name=uncore_cha_1.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_8/name=uncore_cha_8.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_6/name=uncore_cha_6.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_4/name=uncore_cha_4.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_2/name=uncore_cha_2.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_0/name=uncore_cha_0.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_9/name=uncore_cha_9.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_7/name=uncore_cha_7.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_5/name=uncore_cha_5.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_cha_3/name=uncore_cha_3.UNC_CHA_UPI_CREDIT_OCCUPANCY.VN0_BL_NCB,event=0x3b,umask=0x40/S 
-e uncore_iio_free_running_5/name=uncore_iio_free_running_5.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_free_running_3/name=uncore_iio_free_running_3.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_free_running_1/name=uncore_iio_free_running_1.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_4/name=uncore_iio_4.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_2/name=uncore_iio_2.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_0/name=uncore_iio_0.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_free_running_4/name=uncore_iio_free_running_4.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_free_running_2/name=uncore_iio_free_running_2.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_5/name=uncore_iio_5.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_free_running_0/name=uncore_iio_free_running_0.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_3/name=uncore_iio_3.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_iio_1/name=uncore_iio_1.UNC_IIO_VTD_ACCESS.TLB1_MISS,event=0x41,umask=0x80/S 
-e uncore_pcu/name=uncore_pcu.UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES,event=0x4/S 
-e uncore_ubox/name=uncore_ubox.UNC_U_PHOLD_CYCLES.ASSERT_TO_ACK,event=0x45,umask=0x1/S 
echo 2 got unknown exit status was: exit status: 129

This happens when running the command from the "To Reproduce" step below. (FYI: I added line breaks before each -e to make it more readable.)

And when I run the command directly on my machine:

     0.250336972'CPU0'253.38'msec'cpu-clock'253383915'100.00'1.014'CPUs utilized
     0.250336972'CPU1'253.41'msec'cpu-clock'253406764'100.00'1.014'CPUs utilized
     0.250336972'CPU2'253.43'msec'cpu-clock'253433508'100.00'1.014'CPUs utilized
     0.250336972'CPU3'253.47'msec'cpu-clock'253465354'100.00'1.014'CPUs utilized
     0.250336972'CPU4'253.50'msec'cpu-clock'253504899'100.00'1.014'CPUs utilized
     0.250336972'CPU5'253.55'msec'cpu-clock'253550015'100.00'1.014'CPUs utilized
     0.250336972'CPU6'253.57'msec'cpu-clock'253573284'100.00'1.014'CPUs utilized
     0.250336972'CPU7'253.61'msec'cpu-clock'253614609'100.00'1.014'CPUs utilized
     0.250336972'CPU8'253.65'msec'cpu-clock'253646531'100.00'1.015'CPUs utilized
     0.250336972'CPU9'253.69'msec'cpu-clock'253692290'100.00'1.015'CPUs utilized
     0.250336972'CPU10'253.73'msec'cpu-clock'253728630'100.00'1.015'CPUs utilized
     0.250336972'CPU11'253.77'msec'cpu-clock'253771684'100.00'1.015'CPUs utilized
     0.250336972'CPU12'253.83'msec'cpu-clock'253828433'100.00'1.015'CPUs utilized
     0.250336972'CPU13'253.86'msec'cpu-clock'253858548'100.00'1.015'CPUs utilized
     0.250336972'CPU14'253.89'msec'cpu-clock'253894896'100.00'1.016'CPUs utilized
     0.250336972'CPU15'253.93'msec'cpu-clock'253925943'100.00'1.016'CPUs utilized
     0.250336972'CPU16'253.96'msec'cpu-clock'253955767'100.00'1.016'CPUs utilized
     0.250336972'CPU17'253.99'msec'cpu-clock'253993320'100.00'1.016'CPUs utilized
     0.250336972'CPU18'254.02'msec'cpu-clock'254023594'100.00'1.016'CPUs utilized
     0.250336972'CPU19'254.04'msec'cpu-clock'254042196'100.00'1.016'CPUs utilized
     0.250336972'CPU20'254.07'msec'cpu-clock'254068500'100.00'1.016'CPUs utilized
     0.250336972'CPU21'254.11'msec'cpu-clock'254112012'100.00'1.016'CPUs utilized
     0.250336972'CPU22'254.15'msec'cpu-clock'254151445'100.00'1.017'CPUs utilized
     0.250336972'CPU23'254.19'msec'cpu-clock'254192292'100.00'1.017'CPUs utilized
     0.250336972'CPU24'254.23'msec'cpu-clock'254233082'100.00'1.017'CPUs utilized
     0.250336972'CPU25'254.28'msec'cpu-clock'254284899'100.00'1.017'CPUs utilized
     0.250336972'CPU26'254.32'msec'cpu-clock'254318610'100.00'1.017'CPUs utilized
     0.250336972'CPU27'254.38'msec'cpu-clock'254375920'100.00'1.018'CPUs utilized
     0.250336972'CPU28'254.42'msec'cpu-clock'254417661'100.00'1.018'CPUs utilized
     0.250336972'CPU29'254.46'msec'cpu-clock'254460412'100.00'1.018'CPUs utilized
     0.250336972'CPU30'254.51'msec'cpu-clock'254505361'100.00'1.018'CPUs utilized
     0.250336972'CPU31'254.55'msec'cpu-clock'254545200'100.00'1.018'CPUs utilized
     0.250336972'CPU32'254.59'msec'cpu-clock'254588953'100.00'1.018'CPUs utilized
     0.250336972'CPU33'254.63'msec'cpu-clock'254627548'100.00'1.019'CPUs utilized
     0.250336972'CPU34'254.61'msec'cpu-clock'254611511'100.00'1.018'CPUs utilized
     0.250336972'CPU35'254.58'msec'cpu-clock'254580472'100.00'1.018'CPUs utilized
     0.250336972'CPU36'254.55'msec'cpu-clock'254550147'100.00'1.018'CPUs utilized
     0.250336972'CPU37'254.51'msec'cpu-clock'254507023'100.00'1.018'CPUs utilized
     0.250336972'CPU38'254.45'msec'cpu-clock'254454388'100.00'1.018'CPUs utilized
     0.250336972'CPU39'254.41'msec'cpu-clock'254413817'100.00'1.018'CPUs utilized
     0.250336972'CPU0'5''context-switches'253382550'100.00'19.733'/sec
     0.250336972'CPU1'5''context-switches'253405650'100.00'19.731'/sec
     0.250336972'CPU2'3''context-switches'253433022'100.00'11.837'/sec
     0.250336972'CPU3'3''context-switches'253464804'100.00'11.836'/sec
     0.250336972'CPU4'3''context-switches'253504897'100.00'11.834'/sec
     0.250336972'CPU5'5''context-switches'253549701'100.00'19.720'/sec
     0.250336972'CPU6'5''context-switches'253573316'100.00'19.718'/sec
     0.250336972'CPU7'3''context-switches'253614289'100.00'11.829'/sec
     0.250336972'CPU8'3''context-switches'253645915'100.00'11.827'/sec
     0.250336972'CPU9'3''context-switches'253691972'100.00'11.825'/sec
     0.250336972'CPU10'3''context-switches'253728482'100.00'11.824'/sec
     0.250336972'CPU11'3''context-switches'253773216'100.00'11.822'/sec
     0.250336972'CPU12'21''context-switches'253827935'100.00'82.733'/sec
     0.250336972'CPU13'3''context-switches'253857756'100.00'11.818'/sec
     0.250336972'CPU14'3''context-switches'253894388'100.00'11.816'/sec
     0.250336972'CPU15'3''context-switches'253925218'100.00'11.814'/sec
     0.250336972'CPU16'3''context-switches'253955436'100.00'11.813'/sec
     0.250336972'CPU17'3''context-switches'253993041'100.00'11.811'/sec
     0.250336972'CPU18'3''context-switches'254022485'100.00'11.810'/sec
     0.250336972'CPU19'3''context-switches'254040402'100.00'11.809'/sec
     0.250336972'CPU20'5''context-switches'254069457'100.00'19.680'/sec
     0.250336972'CPU21'6''context-switches'254112060'100.00'23.612'/sec
     0.250336972'CPU22'3''context-switches'254151635'100.00'11.804'/sec
     0.250336972'CPU23'3''context-switches'254192801'100.00'11.802'/sec
     0.250336972'CPU24'5''context-switches'254233395'100.00'19.667'/sec
     0.250336972'CPU25'3''context-switches'254285361'100.00'11.798'/sec
     0.250336972'CPU26'3''context-switches'254318919'100.00'11.796'/sec
     0.250336972'CPU27'3''context-switches'254376153'100.00'11.794'/sec
     0.250336972'CPU28'3''context-switches'254418061'100.00'11.792'/sec
     0.250336972'CPU29'5''context-switches'254460933'100.00'19.649'/sec
     0.250336972'CPU30'3''context-switches'254505827'100.00'11.788'/sec
     0.250336972'CPU31'3''context-switches'254545575'100.00'11.786'/sec
     0.250336972'CPU32'3''context-switches'254589255'100.00'11.784'/sec
     0.250336972'CPU33'3''context-switches'254627901'100.00'11.782'/sec
     0.250336972'CPU34'3''context-switches'254610516'100.00'11.783'/sec
     0.250336972'CPU35'3''context-switches'254579341'100.00'11.784'/sec
     0.250336972'CPU36'3''context-switches'254548912'100.00'11.786'/sec
     0.250336972'CPU37'3''context-switches'254505671'100.00'11.788'/sec
     0.250336972'CPU38'5''context-switches'254453648'100.00'19.650'/sec
     0.250336972'CPU39'5''context-switches'254411504'100.00'19.653'/sec
     0.250336972'CPU0'1''cpu-migrations'253381598'100.00'3.947'/sec
     0.250336972'CPU1'1''cpu-migrations'253405027'100.00'3.946'/sec
     0.250336972'CPU2'1''cpu-migrations'253432499'100.00'3.946'/sec
     0.250336972'CPU3'1''cpu-migrations'253464138'100.00'3.945'/sec
     0.250336972'CPU4'1''cpu-migrations'253504696'100.00'3.945'/sec
     0.250336972'CPU5'1''cpu-migrations'253548775'100.00'3.944'/sec
     0.250336972'CPU6'1''cpu-migrations'253572827'100.00'3.944'/sec
     0.250336972'CPU7'1''cpu-migrations'253613564'100.00'3.943'/sec
     0.250336972'CPU8'1''cpu-migrations'253645312'100.00'3.942'/sec
     0.250336972'CPU9'1''cpu-migrations'253691353'100.00'3.942'/sec
     0.250336972'CPU10'1''cpu-migrations'253728320'100.00'3.941'/sec
     0.250336972'CPU11'1''cpu-migrations'253773435'100.00'3.941'/sec
     0.250336972'CPU12'1''cpu-migrations'253826875'100.00'3.940'/sec
     0.250336972'CPU13'1''cpu-migrations'253857244'100.00'3.939'/sec
     0.250336972'CPU14'1''cpu-migrations'253893497'100.00'3.939'/sec
     0.250336972'CPU15'1''cpu-migrations'253924520'100.00'3.938'/sec
     0.250336972'CPU16'1''cpu-migrations'253954425'100.00'3.938'/sec
     0.250336972'CPU17'1''cpu-migrations'253992327'100.00'3.937'/sec
     0.250336972'CPU18'1''cpu-migrations'254021586'100.00'3.937'/sec
     0.250336972'CPU19'1''cpu-migrations'254039223'100.00'3.936'/sec
     0.250336972'CPU20'1''cpu-migrations'254069317'100.00'3.936'/sec
     0.250336972'CPU21'1''cpu-migrations'254111520'100.00'3.935'/sec
     0.250336972'CPU22'1''cpu-migrations'254151421'100.00'3.935'/sec
     0.250336972'CPU23'1''cpu-migrations'254192412'100.00'3.934'/sec
     0.250336972'CPU24'1''cpu-migrations'254233340'100.00'3.933'/sec
     0.250336972'CPU25'1''cpu-migrations'254284769'100.00'3.933'/sec
     0.250336972'CPU26'1''cpu-migrations'254318707'100.00'3.932'/sec
     0.250336972'CPU27'1''cpu-migrations'254375347'100.00'3.931'/sec
     0.250336972'CPU28'1''cpu-migrations'254417587'100.00'3.931'/sec
     0.250336972'CPU29'1''cpu-migrations'254460418'100.00'3.930'/sec
     0.250336972'CPU30'1''cpu-migrations'254505333'100.00'3.929'/sec
     0.250336972'CPU31'1''cpu-migrations'254545365'100.00'3.929'/sec
     0.250336972'CPU32'1''cpu-migrations'254588844'100.00'3.928'/sec
     0.250336972'CPU33'1''cpu-migrations'254627424'100.00'3.927'/sec
     0.250336972'CPU34'1''cpu-migrations'254609951'100.00'3.928'/sec
     0.250336972'CPU35'1''cpu-migrations'254578898'100.00'3.928'/sec
     0.250336972'CPU36'1''cpu-migrations'254548252'100.00'3.929'/sec
     0.250336972'CPU37'1''cpu-migrations'254505084'100.00'3.929'/sec
     0.250336972'CPU38'1''cpu-migrations'254453223'100.00'3.930'/sec
     0.250336972'CPU39'1''cpu-migrations'254410081'100.00'3.931'/sec
     0.250336972'CPU0'0''page-faults'253380625'100.00'0.000'/sec
     0.250336972'CPU1'0''page-faults'253404508'100.00'0.000'/sec
     0.250336972'CPU2'0''page-faults'253431891'100.00'0.000'/sec
     0.250336972'CPU3'0''page-faults'253463551'100.00'0.000'/sec
     0.250336972'CPU4'0''page-faults'253504682'100.00'0.000'/sec
     0.250336972'CPU5'0''page-faults'253547845'100.00'0.000'/sec
     0.250336972'CPU6'0''page-faults'253572565'100.00'0.000'/sec
     0.250336972'CPU7'0''page-faults'253613055'100.00'0.000'/sec
     0.250336972'CPU8'0''page-faults'253644856'100.00'0.000'/sec
     0.250336972'CPU9'0''page-faults'253690850'100.00'0.000'/sec
     0.250336972'CPU10'0''page-faults'253727835'100.00'0.000'/sec
     0.250336972'CPU11'0''page-faults'253773613'100.00'0.000'/sec
     0.250336972'CPU12'0''page-faults'253825949'100.00'0.000'/sec
     0.250336972'CPU13'0''page-faults'253856146'100.00'0.000'/sec
     0.250336972'CPU14'0''page-faults'253893073'100.00'0.000'/sec
     0.250336972'CPU15'0''page-faults'253924071'100.00'0.000'/sec
     0.250336972'CPU16'0''page-faults'253953728'100.00'0.000'/sec
     0.250336972'CPU17'0''page-faults'253991608'100.00'0.000'/sec
     0.250336972'CPU18'0''page-faults'254020992'100.00'0.000'/sec
     0.250336972'CPU19'0''page-faults'254037872'100.00'0.000'/sec
     0.250336972'CPU20'0''page-faults'254069309'100.00'0.000'/sec
     0.250336972'CPU21'0''page-faults'254111180'100.00'0.000'/sec
     0.250336972'CPU22'0''page-faults'254150946'100.00'0.000'/sec
     0.250336972'CPU23'0''page-faults'254192175'100.00'0.000'/sec
     0.250336972'CPU24'0''page-faults'254233219'100.00'0.000'/sec
     0.250336972'CPU25'0''page-faults'254284327'100.00'0.000'/sec
     0.250336972'CPU26'0''page-faults'254318643'100.00'0.000'/sec
     0.250336972'CPU27'0''page-faults'254374950'100.00'0.000'/sec
     0.250336972'CPU28'0''page-faults'254416897'100.00'0.000'/sec
     0.250336972'CPU29'0''page-faults'254459858'100.00'0.000'/sec
     0.250336972'CPU30'0''page-faults'254504863'100.00'0.000'/sec
     0.250336972'CPU31'0''page-faults'254545125'100.00'0.000'/sec
     0.250336972'CPU32'0''page-faults'254588396'100.00'0.000'/sec
     0.250336972'CPU33'0''page-faults'254626763'100.00'0.000'/sec
     0.250336972'CPU34'0''page-faults'254609295'100.00'0.000'/sec
     0.250336972'CPU35'0''page-faults'254578272'100.00'0.000'/sec
     0.250336972'CPU36'0''page-faults'254547890'100.00'0.000'/sec
     0.250336972'CPU37'0''page-faults'254504449'100.00'0.000'/sec
     0.250336972'CPU38'0''page-faults'254452645'100.00'0.000'/sec
     0.250336972'CPU39'0''page-faults'254408704'100.00'0.000'/sec
     0.250336972'CPU0'322896''cycles'253373543'100.00'0.001'GHz
     0.250336972'CPU1'552890''cycles'253397356'100.00'0.002'GHz
     0.250336972'CPU2'157846''cycles'253424729'100.00'0.001'GHz
     0.250336972'CPU3'135269''cycles'253456979'100.00'0.001'GHz
     0.250336972'CPU4'265940''cycles'253501712'100.00'0.001'GHz
     0.250336972'CPU5'133968''cycles'253540769'100.00'0.001'GHz
     0.250336972'CPU6'196795''cycles'253569355'100.00'0.001'GHz
     0.250336972'CPU7'88479''cycles'253606044'100.00'0.000'GHz
     0.250336972'CPU8'90582''cycles'253637839'100.00'0.000'GHz
     0.250336972'CPU9'126860''cycles'253683537'100.00'0.001'GHz
     0.250336972'CPU10'102766''cycles'253721075'100.00'0.000'GHz
     0.250336972'CPU11'360853''cycles'253770386'100.00'0.001'GHz
     0.250336972'CPU12'708656''cycles'253818979'100.00'0.003'GHz
     0.250336972'CPU13'89732''cycles'253848558'100.00'0.000'GHz
     0.250336972'CPU14'103195''cycles'253886182'100.00'0.000'GHz
     0.250336972'CPU15'102310''cycles'253917140'100.00'0.000'GHz
     0.250336972'CPU16'100509''cycles'253946409'100.00'0.000'GHz
     0.250336972'CPU17'101410''cycles'253984249'100.00'0.000'GHz
     0.250336972'CPU18'124465''cycles'254013776'100.00'0.000'GHz
     0.250336972'CPU19'150702''cycles'254030304'100.00'0.001'GHz
     0.250336972'CPU20'177209''cycles'254063043'100.00'0.001'GHz
     0.250336972'CPU21'206835''cycles'254104920'100.00'0.001'GHz
     0.250336972'CPU22'277082''cycles'254144553'100.00'0.001'GHz
     0.250336972'CPU23'102421''cycles'254185694'100.00'0.000'GHz
     0.250336972'CPU24'550177''cycles'254230657'100.00'0.002'GHz
     0.250336972'CPU25'94379''cycles'254278016'100.00'0.000'GHz
     0.250336972'CPU26'193472''cycles'254315642'100.00'0.001'GHz
     0.250336972'CPU27'105738''cycles'254368667'100.00'0.000'GHz
     0.250336972'CPU28'100077''cycles'254410528'100.00'0.000'GHz
     0.250336972'CPU29'274433''cycles'254453379'100.00'0.001'GHz
     0.250336972'CPU30'95374''cycles'254498076'100.00'0.000'GHz
     0.250336972'CPU31'721243''cycles'254538713'100.00'0.003'GHz
     0.250336972'CPU32'169377''cycles'254579319'100.00'0.001'GHz
     0.250336972'CPU33'93736''cycles'254620589'100.00'0.000'GHz
     0.250336972'CPU34'111501''cycles'254602156'100.00'0.000'GHz
     0.250336972'CPU35'110685''cycles'254570739'100.00'0.000'GHz
     0.250336972'CPU36'108921''cycles'254540372'100.00'0.000'GHz
     0.250336972'CPU37'108387''cycles'254496890'100.00'0.000'GHz
     0.250336972'CPU38'181390''cycles'254445145'100.00'0.001'GHz
     0.250336972'CPU39'267655''cycles'254400748'100.00'0.001'GHz

And when I only run a part of the command:

perf stat -aA -I 250 -x ';' -o out/1_stat.csv \
-e uncore_iio_free_running_5/name=uncore_iio_free_running_5.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7
echo 2
event syntax error: '..x40,fc_mask=0x7'
                                  \___ parser error
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events
2

To Reproduce

Steps to reproduce the behavior:

  1. Run autoperf with ./target/release/autoperf profile echo 2 on a Cascade Lake CPU

Machine (please complete the following information):

  • Linux version: Linux 5.15.0-25-generic #25-Ubuntu SMP Wed Mar 30 15:54:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Machine: cpuidout.txt
  • perf version: perf version 5.15.30
  • autoperf version: 0.2, but from my fork, where I updated two or three dependencies to make it compile and fixed one or two things.
  • output of:
breakpoint   intel_bts	power	      uncore_cha_1  uncore_cha_5  uncore_cha_9	uncore_iio_3		   uncore_iio_free_running_1  uncore_iio_free_running_5  uncore_imc_3  uncore_irp_1  uncore_irp_5     uncore_m2pcie_1  uncore_ubox
cpu	     intel_pt	software      uncore_cha_2  uncore_cha_6  uncore_iio_0	uncore_iio_4		   uncore_iio_free_running_2  uncore_imc_0		 uncore_imc_4  uncore_irp_2  uncore_m2m_0     uncore_m3upi_0   uncore_upi_0
cstate_core  kprobe	tracepoint    uncore_cha_3  uncore_cha_7  uncore_iio_1	uncore_iio_5		   uncore_iio_free_running_3  uncore_imc_1		 uncore_imc_5  uncore_irp_3  uncore_m2m_1     uncore_m3upi_1   uncore_upi_1
cstate_pkg   msr	uncore_cha_0  uncore_cha_4  uncore_cha_8  uncore_iio_2	uncore_iio_free_running_0  uncore_iio_free_running_4  uncore_imc_2		 uncore_irp_0  uncore_irp_4  uncore_m2pcie_0  uncore_pcu       uprobe

Additional context

I forked autoperf to try to make it compilable (and partly usable) again. That worked; I can make a PR if you want. I would be more than happy to fix these things myself (and make a PR) if you could point me in the right direction.

Also I would be happy to learn how to mention your work/autoperf if I end up using it.

tom-wegener, Apr 21 '22 13:04

Hey, thanks for using the tool! Yes, a PR to make it compile again would be most welcome!

Regarding the fc_mask parser error: it seems like you're missing a / at the end (when you run the command manually)? E.g., would this work:

perf stat -aA -I 250 -x ';' -o out/1_stat.csv \
-e uncore_iio_free_running_5/name=uncore_iio_free_running_5.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7/
echo 2
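
The shape perf expects here is pmu/term=value,.../modifier, with both an opening and a closing slash. A minimal Python sketch of a pre-flight check for that shape follows; parse_event_spec is a hypothetical helper for illustration, not part of perf or autoperf, and it assumes that neither the event name nor the values contain a slash.

```python
def parse_event_spec(spec):
    """Split 'pmu/term=val,.../modifier' into (pmu, terms, modifier).

    Hypothetical helper: it only checks the overall shape of the string,
    not whether perf would accept the individual terms.
    """
    first = spec.find("/")
    last = spec.rfind("/")
    if first == -1 or first == last:
        # Either no slash at all, or only the opening one.
        raise ValueError(f"event spec needs an opening and a closing '/': {spec!r}")
    pmu = spec[:first]
    body = spec[first + 1:last]
    modifier = spec[last + 1:]  # e.g. "S" in ".../S"; empty if nothing follows
    terms = {}
    for item in body.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        terms[key] = value if sep else None
    return pmu, terms, modifier
```

Called on the spec above without the trailing slash, this raises ValueError; with the slash it returns the PMU name, the term dictionary (including fc_mask) and an empty modifier.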

> Also I would be happy to learn how to mention your work/autoperf if I end up using it.

Feel free to cite this work: https://dl.acm.org/doi/10.1145/2967360.2967375

gz, Apr 21 '22 17:04

Hey, the PR is #5. I also changed two other things besides making it compile again, but they're not that important, only nice to have.

Sadly the / didn't help, I still have the same error:

perf stat -aA -I 250 -x ';' -o out/1_stat.csv \
-e uncore_iio_free_running_5/name=uncore_iio_free_running_5.UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1,event=0xc1,umask=0x40,fc_mask=0x7/
echo 2
event syntax error: '..umask=0x40,fc_mask=0x7/'
                                  \___ unknown term 'fc_mask' for pmu 'uncore_iio_free_running_5'

valid terms: event,umask,config,config1,config2,name,period,percore

Initial error:
event syntax error: '..umask=0x40,fc_mask=0x7/'
                                  \___ unknown term 'fc_mask' for pmu 'uncore_iio_free_running_5'

valid terms: event,umask,config,config1,config2,name,period,percore
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events
2

Could it have something to do with the counters.toml? And how could I add my architecture to it?

tom-wegener, Apr 22 '22 07:04

Hi, so it looks like perf does not support the fc_mask attribute for this perf event.
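
One way to see this before running perf is to compare an event's terms with what the PMU advertises. The Python sketch below assumes that the PMU-specific terms are the file names under /sys/bus/event_source/devices/<pmu>/format/, and that the remaining names in the "valid terms" hint (config, config1, config2, name, period, percore) are generic. Both helper functions are hypothetical, not autoperf code.

```python
from pathlib import Path

# Generic terms, copied from the "valid terms" hint in the perf error output.
# Assumption: perf accepts these for any PMU.
GENERIC_TERMS = {"config", "config1", "config2", "name", "period", "percore"}


def supported_terms(pmu, sysfs_root="/sys/bus/event_source/devices"):
    """Return the term names that the given PMU is assumed to accept."""
    fmt_dir = Path(sysfs_root) / pmu / "format"
    if not fmt_dir.is_dir():
        # No format directory: fall back to the generic terms only.
        return set(GENERIC_TERMS)
    return {entry.name for entry in fmt_dir.iterdir()} | GENERIC_TERMS


def drop_unsupported(terms, allowed):
    """Split a term dict into (kept terms, sorted names of dropped terms)."""
    kept = {key: value for key, value in terms.items() if key in allowed}
    dropped = sorted(set(terms) - set(kept))
    return kept, dropped
```

This is meant as a diagnostic only. Dropping a term such as fc_mask changes what the counter measures, so reporting the dropped names is probably safer than silently rewriting the event.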

fc_mask is added here: https://github.com/gz/autoperf/blob/5675fe88ba9621e22ac8b2c3decb042de5c91952/src/profile.rs#L561

and it's ultimately coming from the x86 crate which parses the official Intel CSVs to get this information: e.g., the fc attribute here https://github.com/gz/rust-x86/blob/523296c4ecfad57ef26aabdde1446c788dbc668e/x86data/perfmon_data/CLX/cascadelakex_uncore_v1.04.json#L2679

I didn't actually end up finding the UNC_IIO_TXN_REQ_BY_CPU.CFG_READ.PART1 event in the cascade lake file, so not sure how this event is generated by the tool? Did you end up updating the performance counters info of the x86 crate?

gz, Apr 25 '22 16:04

Hey, I tried updating the files and still have some smaller problems with that, mostly with the parsing, because some values changed a lot or in a weird way (counter over 8, filter values missing, etc.). I'm not that sure about safe defaults and safe value ranges, but I'll try to make it work; I may have to change the parsing part a bit.

tom-wegener, Apr 25 '22 18:04

I see.. About the counters.toml, yes you'd probably need to add an entry for cascade lake, you can follow the same style as skylake: https://github.com/gz/autoperf/blob/master/src/counters.toml#L43

It should look something like this:

[cascadelake]
family = X # Family number taken from CPUID
models = Y # Model number(s) taken from CPUID
# Finally these values can be found by counting the number of units in
#  /sys/bus/event_source/devices/
# e.g., based on your issue it looks like you'll have 6 IMC units (uncore_imc_0 to uncore_imc_5) so IMC = 6
fixed_counters = { CPU = A, UBO = B, IMC = C } # 
programmable_counters = { CPU = 4, UPI = 4, CHA = 4, IIO = 4, IMC = 4, IRP = 4, M2M = 4, M3UPI = 4, PCU = 4, UBO = 2, CBO = 4 }

It might be that there are some new units in cascade lake that aren't supported yet. Then it would print a warning (https://github.com/gz/autoperf/blob/5675fe88ba9621e22ac8b2c3decb042de5c91952/src/profile.rs#L77), let me know if this is the case then we can work towards supporting them.
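
Counting the units by hand is error-prone, so here is a minimal Python sketch that does it from a list of device names, for example the entries of /sys/bus/event_source/devices/. count_units is a hypothetical helper; it assumes that instances of one unit differ only in a trailing _<number>.

```python
import re
from collections import Counter


def count_units(device_names):
    """Count uncore units per type, e.g. uncore_imc_0..5 -> {'uncore_imc': 6}."""
    counts = Counter()
    for name in device_names:
        if not name.startswith("uncore_"):
            continue  # skip cpu, msr, software, tracepoint, ...
        # Strip the trailing instance index; names without one stay as they are.
        unit = re.sub(r"_\d+$", "", name)
        counts[unit] += 1
    return dict(counts)


# Usage on a real machine (not executed here):
#   import os
#   print(count_units(os.listdir("/sys/bus/event_source/devices")))
```

For the device listing earlier in this issue, uncore_imc_0 through uncore_imc_5 count as six units. Mapping the sysfs names to the keys used in counters.toml (IMC, CHA, ...) is left out on purpose, since that mapping is autoperf-specific.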

gz, Apr 25 '22 18:04

btw. one simple fix to make progress for now is to just add this event to the ignore list here: https://github.com/gz/autoperf/blob/master/src/profile.rs#L113

gz, Apr 25 '22 23:04

Hey, thanks a lot. Sadly, it does not work. I still get some errors because of some buggy parsing (I think parts of that are on my side). I decided to use pcm for the moment because I don't necessarily need all the information at that density. From my point of view, I would have to put a lot more work into it to make it work on at least one of the machines I have to test.

For the cascade-lake integration: In the newest mapfile.csv from Intel they use GenuineIntel-6-55-[01234] for skylakex and GenuineIntel-6-55-[56789ABCDEF] for cascade-lake instead of the old GenuineIntel-6-55 only for skylakex which makes this a bit more complicated than I thought.
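
To illustrate why the stepping now matters, here is a small Python sketch that matches a CPU against the two patterns quoted above. It assumes the key is built as vendor-family-model-stepping with uppercase hex digits, and that each mapfile entry can be treated as a regular expression that must match the whole key. The labels and both function names are made up for the example.

```python
import re

# Patterns as quoted above; the labels on the right are illustrative only.
MAPFILE = [
    ("GenuineIntel-6-55-[01234]", "skylakex"),
    ("GenuineIntel-6-55-[56789ABCDEF]", "cascadelake"),
]


def cpu_key(vendor, family, model, stepping):
    """Build a key such as 'GenuineIntel-6-55-7' (assumed uppercase hex)."""
    return f"{vendor}-{family:X}-{model:02X}-{stepping:X}"


def lookup(key, table=MAPFILE):
    """Return the label of the first pattern matching the whole key, else None."""
    for pattern, label in table:
        if re.fullmatch(pattern, key):
            return label
    return None
```

With this, a family 6, model 0x55 part with stepping 7 resolves to the cascadelake entry and stepping 4 to skylakex. An old-style entry without a stepping, such as GenuineIntel-6-55, would not match a key that carries one, so a lookup that only knows family and model needs extending.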

But since I started updating parts of this (rust-x86, rust-perfcnt etc), I'll try to maybe finish some work in my free time and I hope maybe you'll see one or two PRs. (especially the db-update for rust-x86 is a thing I would like to finish)

tom-wegener, Apr 27 '22 15:04