
EEH error reported during network card iperf test

Open lili-lilili opened this issue 1 year ago • 1 comments

When we run an iperf test on the network card (on P10), an EEH error is always reported.

Here is the register dump from OPAL. It looks like the PCIe link has a problem: the "phbRegbErrorStatus" register records a link down error.

I also have trouble parsing the registers. I only have a PCIe spec for P9, but it has no explanation for "phbRegbErrorStatus" bit 31.
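To make the bit references easier to check, here is a minimal sketch that lists the set bits of a 64-bit register value. It assumes IBM (MSB-0) bit numbering, where bit 0 is the most significant bit; that assumption is consistent with the "bit 31" mentioned above, but it is not confirmed by a P10 spec. The helper name `ibm_bits_set` is hypothetical, and the meaning of each bit still has to come from the PHB documentation.

```python
def ibm_bits_set(value, width=64):
    """Return the positions of set bits, assuming IBM (MSB-0) numbering.

    Bit 0 is the most significant bit of a `width`-bit register.
    This numbering is an assumption, not taken from a P10 spec.
    """
    return [i for i in range(width) if (value >> (width - 1 - i)) & 1]


# Values copied from the dump in this issue.
print(ibm_bits_set(0x0090001100000000))  # phbRegbErrorStatus      -> [8, 11, 27, 31]
print(ibm_bits_set(0x0000001000000000))  # phbRegbFirstErrorStatus -> [27]
```

Under that numbering, "phbRegbErrorStatus" has bits 8, 11, 27 and 31 set, while "phbRegbFirstErrorStatus" has only bit 27 set.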

Any suggestions? @fbarrat

[ 1005.489389250,3] PHB#006f[6:3]: PHB Freeze/Fence detected !
[ 1005.489438608,3] PHB#006f[6:3]: PCI FIR=0000000000000000
[ 1005.489474501,3] PHB#006f[6:3]: PCI FIR WOF=0000000000000000
[ 1005.489510147,3] PHB#006f[6:3]: NEST FIR=0800000000000000
[ 1005.489542000,3] PHB#006f[6:3]: NEST FIR WOF=0800000000000000
[ 1005.489577729,3] PHB#006f[6:3]: ERR RPT0=0010000000000000
[ 1005.489607921,3] PHB#006f[6:3]: ERR RPT1=0000000000000000
[ 1005.489642726,3] PHB#006f[6:3]: AIB ERR=00cc100000000000
[ 1005.490223446,3] PHB#006f[6:3]: brdgCtl = 00000002
[ 1005.490262691,3] PHB#006f[6:3]: deviceStatus = 00000140
[ 1005.490295542,3] PHB#006f[6:3]: slotStatus = 00402000
[ 1005.490335762,3] PHB#006f[6:3]: linkStatus = c1010008
[ 1005.490366070,3] PHB#006f[6:3]: devCmdStatus = 00100107
[ 1005.490397781,3] PHB#006f[6:3]: devSecStatus = 00000000
[ 1005.490428166,3] PHB#006f[6:3]: rootErrorStatus = 00000000
[ 1005.490463922,3] PHB#006f[6:3]: corrErrorStatus = 00000000
[ 1005.490497253,3] PHB#006f[6:3]: uncorrErrorStatus = 00000000
[ 1005.490530530,3] PHB#006f[6:3]: devctl = 00000140
[ 1005.490562582,3] PHB#006f[6:3]: devStat = 00000000
[ 1005.490593271,3] PHB#006f[6:3]: tlpHdr1 = 00000000
[ 1005.490623159,3] PHB#006f[6:3]: tlpHdr2 = 00000000
[ 1005.490654178,3] PHB#006f[6:3]: tlpHdr3 = 00000000
[ 1005.490687329,3] PHB#006f[6:3]: tlpHdr4 = 00000000
[ 1005.490721063,3] PHB#006f[6:3]: sourceId = 00000000
[ 1005.490751859,3] PHB#006f[6:3]: nFir = 0800000000000000
[ 1005.490786389,3] PHB#006f[6:3]: nFirMask = 003001d000000000
[ 1005.490820066,3] PHB#006f[6:3]: nFirWOF = 0800000000000000
[ 1005.490853079,3] PHB#006f[6:3]: phbPlssr = 0000001c00000000
[ 1005.490888389,3] PHB#006f[6:3]: phbCsr = 0000001c00000000
[ 1005.490926839,3] PHB#006f[6:3]: lemFir = 0000000100000100
[ 1005.490964803,3] PHB#006f[6:3]: lemErrorMask = 0000000000000000
[ 1005.491002854,3] PHB#006f[6:3]: lemWOF = 0000000000000100
[ 1005.491036256,3] PHB#006f[6:3]: phbErrorStatus = 000004e000000000
[ 1005.491069266,3] PHB#006f[6:3]: phbFirstErrorStatus = 0000040000000000
[ 1005.491102274,3] PHB#006f[6:3]: phbErrorLog0 = 0000000000000000
[ 1005.491140034,3] PHB#006f[6:3]: phbErrorLog1 = 0000000000000000
[ 1005.491177123,3] PHB#006f[6:3]: phbTxeErrorStatus = 0000000000000000
[ 1005.491213447,3] PHB#006f[6:3]: phbTxeFirstErrorStatus = 0000000000000000
[ 1005.491248051,3] PHB#006f[6:3]: phbTxeErrorLog0 = 0000000000000000
[ 1005.491281223,3] PHB#006f[6:3]: phbTxeErrorLog1 = 0000000000000000
[ 1005.491314903,3] PHB#006f[6:3]: phbRxeArbErrorStatus = 0000000000000000
[ 1005.491349975,3] PHB#006f[6:3]: phbRxeArbFrstErrorStatus = 0000000000000000
[ 1005.491388215,3] PHB#006f[6:3]: phbRxeArbErrorLog0 = 0000000000000000
[ 1005.491425458,3] PHB#006f[6:3]: phbRxeArbErrorLog1 = 0000000000000000
[ 1005.491459174,3] PHB#006f[6:3]: phbRxeMrgErrorStatus = 0000000000000001
[ 1005.491492464,3] PHB#006f[6:3]: phbRxeMrgFrstErrorStatus = 0000000000000001
[ 1005.491524353,3] PHB#006f[6:3]: phbRxeMrgErrorLog0 = 0000000000000000
[ 1005.491557762,3] PHB#006f[6:3]: phbRxeMrgErrorLog1 = 0000000000000000
[ 1005.491594088,3] PHB#006f[6:3]: phbRxeTceErrorStatus = 0000000000000000
[ 1005.491632865,3] PHB#006f[6:3]: phbRxeTceFrstErrorStatus = 0000000000000000
[ 1005.491671010,3] PHB#006f[6:3]: phbRxeTceErrorLog0 = 0000000000000000
[ 1005.491705600,3] PHB#006f[6:3]: phbRxeTceErrorLog1 = 0000000000000000
[ 1005.491738612,3] PHB#006f[6:3]: phbPblErrorStatus = 0100000000000000
[ 1005.491772417,3] PHB#006f[6:3]: phbPblFirstErrorStatus = 0100000000000000
[ 1005.491810821,3] PHB#006f[6:3]: phbPblErrorLog0 = 0000000000000000
[ 1005.491846866,3] PHB#006f[6:3]: phbPblErrorLog1 = 0000000000000000
[ 1005.491884228,3] PHB#006f[6:3]: phbPcieDlpErrorLog1 = 0000000000000000
[ 1005.491921107,3] PHB#006f[6:3]: phbPcieDlpErrorLog2 = 0000000000000000
[ 1005.491954581,3] PHB#006f[6:3]: phbPcieDlpErrorStatus = 0000000000000000
[ 1005.491988376,3] PHB#006f[6:3]: phbRegbErrorStatus = 0090001100000000
[ 1005.492022294,3] PHB#006f[6:3]: phbRegbFirstErrorStatus = 0000001000000000
[ 1005.492058404,3] PHB#006f[6:3]: phbRegbErrorLog0 = 2480005800000000
[ 1005.492094757,3] PHB#006f[6:3]: phbRegbErrorLog1 = 0000000000000000
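
Since most registers in the dump are zero, a small script can pull out the non-zero ones. The following is only a sketch: the regular expression is inferred from the log format shown above rather than from any OPAL documentation, and the names `ENTRY`, `parse_phb_dump` and `nonzero_registers` are hypothetical.

```python
import re

# Matches "PHB#006f[6:3]: <name> = <hex>" (spaces around "=" are optional).
# The pattern is an assumption based on the dump in this issue.
ENTRY = re.compile(
    r"PHB#[0-9a-fA-F]+\[\d+:\d+\]:\s*([A-Za-z0-9 ]+?)\s*=\s*([0-9a-fA-F]+)"
)


def parse_phb_dump(text):
    """Return {register name: integer value} for every 'name = hex' entry."""
    return {name: int(value, 16) for name, value in ENTRY.findall(text)}


def nonzero_registers(regs):
    """Keep only the registers whose value is not zero."""
    return {name: value for name, value in regs.items() if value != 0}


# A short excerpt copied from the dump; paste the full log to scan everything.
sample = (
    "[ 1005.489389250,3] PHB#006f[6:3]: PHB Freeze/Fence detected ! "
    "[ 1005.489438608,3] PHB#006f[6:3]: PCI FIR=0000000000000000 "
    "[ 1005.489510147,3] PHB#006f[6:3]: NEST FIR=0800000000000000 "
    "[ 1005.491988376,3] PHB#006f[6:3]: phbRegbErrorStatus = 0090001100000000"
)

for name, value in nonzero_registers(parse_phb_dump(sample)).items():
    print(f"{name} = {value:#018x}")
```

Lines without a `name = value` pair, such as the "PHB Freeze/Fence detected" message, are skipped.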

lili-lilili, Aug 11 '23 09:08