
IPC4 failures with cavs reference firmware on WHL_UPEXT

Open plbossart opened this issue 3 years ago • 14 comments

Intel internal tests show failures for ?model=WHL_UPEXT_HDA_IPC4&testcase=multiple-pause-resume-50

FW reported error: 6 - Unknown error while processing the request of ipc4 set pipeline 10 state 2

The last working test was 12028 (2022-04-24). Tests 12111, 12176, 12202 and 12215 all fail in the same way.

EDIT by @XiaoyunWu6666: in test results 12202 and 12215 the error happened during check-pause-resume-playback-100, before multiple-pause-resume.

[ 2039.198040] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc4 set pipeline 10 state 2
[ 2039.198051] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx      : 0x130a0002|0x0: GLB_SET_PIPELINE_STATE
[ 2039.198209] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx reply: 0x33000006|0xa: GLB_SET_PIPELINE_STATE
[ 2039.198223] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: FW reported error: 6 - Unknown error while processing the request
[ 2039.198286] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc error for msg 0x130a0002|0x0
[ 2039.198305] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ASoC: error at soc_dai_trigger on iDisp3 Pin: -22
[ 2039.198323] kernel:  HDMI3: ASoC: dpcm_be_dai_trigger() failed at iDisp3 (-22)
[ 2039.198338] kernel:  HDMI3: ASoC: trigger FE cmd: 0 failed: -22
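The two header words in the log above can be decoded by hand to confirm they match the human-readable messages. The following is a minimal sketch, assuming the IPC4 primary header layout described in the comments; that layout is inferred from the values in this log and is not taken from an authoritative specification. The function name `decode_primary` is hypothetical.

```python
# Assumed layout of the IPC4 primary header word (inferred from this log):
#   bits  0-15  pipeline state      (set-pipeline-state request only)
#   bits 16-23  pipeline id         (set-pipeline-state request only)
#   bits  0-23  status code         (reply only)
#   bits 24-28  message type
#   bit  29     direction: 0 = request, 1 = reply
#   bit  30     target: 0 = global firmware message, 1 = module message

GLB_SET_PIPELINE_STATE = 0x13  # type value seen in the log, assumed constant


def decode_primary(word):
    """Split a primary header word into the fields assumed above."""
    info = {
        "type": (word >> 24) & 0x1F,
        "is_reply": bool((word >> 29) & 0x1),
        "is_module_msg": bool((word >> 30) & 0x1),
    }
    if info["is_reply"]:
        info["status"] = word & 0xFFFFFF
    else:
        # Only meaningful when the type is GLB_SET_PIPELINE_STATE.
        info["pipeline_id"] = (word >> 16) & 0xFF
        info["state"] = word & 0xFFFF
    return info


request = decode_primary(0x130A0002)  # "ipc tx" line
reply = decode_primary(0x33000006)    # "ipc tx reply" line
print(request)
print(reply)
```

Under these assumptions the request decodes to pipeline 10, state 2, and the reply carries status 6, which agrees with the "ipc4 set pipeline 10 state 2" and "FW reported error: 6" lines.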

Reproduction rate: not 100%.

How to reproduce:

TPLG=/lib/firmware/intel/avs-tplg/cavs-mixin-mixout-hda.tplg MODEL=WHL_UPEXT_HDA_IPC4 ~/sof-test/test-case/check-pause-resume.sh -c 100 -m playback
OR
TPLG=/lib/firmware/intel/avs-tplg/cavs-mixin-mixout-hda.tplg MODEL=WHL_UPEXT_HDA_IPC4 ~/sof-test/test-case/multiple-pause-resume.sh -r 50
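Because the failure is intermittent, a single run of either command says little. The sketch below repeats a command and counts non-zero exit codes. It assumes the sof-test scripts exit with a non-zero status on failure; `count_failures` is a hypothetical helper, not part of sof-test.

```python
import os
import subprocess
import sys


def count_failures(cmd, iterations, extra_env=None):
    """Run cmd `iterations` times and return how many runs exited non-zero."""
    env = dict(os.environ)
    env.update(extra_env or {})
    failures = 0
    for _ in range(iterations):
        result = subprocess.run(cmd, env=env)
        if result.returncode != 0:
            failures += 1
    return failures


# Quick self-check with a stand-in command that always exits with status 1.
always_fails = [sys.executable, "-c", "raise SystemExit(1)"]
print(count_failures(always_fails, 3))

# Hypothetical usage with the first command from this report (not run here):
#   script = os.path.expanduser("~/sof-test/test-case/check-pause-resume.sh")
#   env = {"TPLG": "/lib/firmware/intel/avs-tplg/cavs-mixin-mixout-hda.tplg",
#          "MODEL": "WHL_UPEXT_HDA_IPC4"}
#   failed = count_failures([script, "-c", "100", "-m", "playback"], 10, env)
```

Passing the command as a list avoids shell quoting issues, and the environment variables are merged on top of the current environment rather than replacing it.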

Recipe
Kernel Branch: topic/sof-dev
Kernel Commit: 23306d2cb554

TPLG: cavs-mixin-mixout-hda.tplg

plbossart avatar May 02 '22 16:05 plbossart

@XiaoyunWu6666 @keqiaozhang @ranj063 @bardliao FYI

plbossart avatar May 02 '22 16:05 plbossart

@plbossart It looks like the issue has been fixed. The logs of 12111 and 12176 are the same as in this issue, but the log of 12202 and 12215 is:

[ 1704.375888] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_pm_runtime_get on 0000:00:1f.3: -22
[ 1704.375900] kernel:  iDisp1: __soc_pcm_open() failed (-22)
[ 1704.375904] kernel:  HDMI1: ASoC: dpcm_be_dai_startup() failed at iDisp1 (-22)
[ 1704.375908] kernel:  HDMI1: dpcm_fe_dai_startup() failed (-22)
[ 1704.380030] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_pm_runtime_get on 0000:00:1f.3: -22
[ 1704.380037] kernel:  iDisp2: __soc_pcm_open() failed (-22)
[ 1704.380040] kernel:  HDMI2: ASoC: dpcm_be_dai_startup() failed at iDisp2 (-22)
[ 1704.380043] kernel:  HDMI2: dpcm_fe_dai_startup() failed (-22)

And 12222 is PASS. 12111 and 12176 use the same kernel commit, 95ec8d32e6b6, and the newer tests use 23306d2cb554. I can reproduce the issue with commit 95ec8d32e6b6 on my TGL Up board, but when I tested with 23306d2cb554, which is the latest commit, the test result was PASS. I tried 3 iterations, and all of them passed.

bardliao avatar May 03 '22 09:05 bardliao

@bardliao I agree that the test passed, but I don't see any significant differences between kernel 23306d2cb5546598c59688a7cb14eef3df832786 and 95ec8d32e6b6355514bbe267f5bef4abb6c8b4b3, and clearly nothing that could explain why things are broken on pause_push/pause_release tests.

Very odd.

plbossart avatar May 03 '22 13:05 plbossart

I will check whether it still exists and what its reproduction rate is.

EDIT: This issue still exists in result 12257; the reproduction rate is not that high.

XiaoyunWu6666 avatar May 04 '22 03:05 XiaoyunWu6666

@XiaoyunWu6666 you started at least 10 tests yesterday with the same name but different devices and tests.

Can you please provide a clearer explanation of the number of tests run on WHL_UPEXT and the statistics? "Not that high" isn't a very useful assessment, thank you.

plbossart avatar May 04 '22 14:05 plbossart

model WHL_UPEXT_HDA_IPC4

testcase with results

  • multiple-pause-resume-25 : 1 FAIL among 4
  • check-pause-resume-playback : 1 FAIL among 4

The number of tests I ran yesterday was not large enough to give a statistically meaningful rate. I think the only thing they proved was that the issue still exists and can be reproduced by both test cases.

I will try manual tests on the DUT to get more reliable statistics.

XiaoyunWu6666 avatar May 05 '22 06:05 XiaoyunWu6666

Happened again in internal daily test result 12356 on model WHL_UPEXT_HDA_IPC4, during multiple-pause-resume-50.

XiaoyunWu6666 avatar May 06 '22 10:05 XiaoyunWu6666

Reproduction rate: 3/8 in the daily IPC4 tests in May.

XiaoyunWu6666 avatar May 09 '22 09:05 XiaoyunWu6666

  • FW reported error: 6 - Unknown error while processing the request of free pipeline widget pipeline.4 in 12444 (May 10)
  • FW reported error: 6 - Unknown error while processing the request of ipc4 set pipeline 8 state 2 in 12563 and 12593 (May 13 and 14)
  • FW reported error: 6 - Unknown error while processing the request of ipc4 set pipeline 7 state 3 on WHL UPEXT in internal daily IPC4 12637 (May 17)
  • FW reported error: 6 - Unknown error while processing the request of freeing pipeline widget pipeline.4 on WHL UPEXT in internal daily IPC4 12668 (May 18)

Statistics Linux#3626 IPC4 failures with closed-source firmware on WHL_UPEXT

7 hits within 17 days in total, 3 hits within the past 7 days. So the reproduction rate is approximately 50 percent. @plbossart
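The counts quoted in this thread can be turned into point estimates with an uncertainty range. The sketch below computes a Wilson score interval for each sample; it assumes one test run per day for the "hits within N days" figures, which the thread does not state. As point estimates, 7/17 is about 41% and 3/7 is about 43%.

```python
import math


def wilson_interval(failures, runs, z=1.96):
    """Return the (low, high) Wilson score interval for a failure rate.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = failures / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Counts quoted in this thread; day-based figures assume one run per day.
samples = {
    "1 FAIL among 4": (1, 4),
    "3/8 in daily IPC4 tests": (3, 8),
    "7 hits within 17 days": (7, 17),
    "3 hits within past 7 days": (3, 7),
}

for label, (failures, runs) in samples.items():
    low, high = wilson_interval(failures, runs)
    rate = failures / runs
    print(f"{label}: {rate:.0%} (95% interval {low:.0%} to {high:.0%})")
```

With so few runs the intervals are wide, which is why a larger number of manual iterations is needed before comparing kernels or platforms.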

XiaoyunWu6666 avatar May 17 '22 01:05 XiaoyunWu6666

@RanderWang @lgirdwood, just FYI, I'm not sure whether the multiple IPC timeouts when setting pipeline state on SOF Zephyr IPC4 platforms are related to this or not.

But IPC timeouts when setting pipeline state do happen on IPC4 with cavs FW. However, the scenario that causes setting the pipeline state to fail is different. For SOF IPC4, running playback 100 times can reproduce this IPC issue, while on WHL and TGL, which run cavs FW, the IPC issues happen in pause/release cases.

XiaoyunWu6666 avatar May 26 '22 04:05 XiaoyunWu6666

Thanks

RanderWang avatar May 26 '22 07:05 RanderWang

Since this issue only happens on WHL and is not reproducible on TGL or ADL, changing the priority to P2.

keqiaozhang avatar Jun 06 '22 02:06 keqiaozhang

Since this issue only happens on WHL, not reproducible on TGL or ADL, so change the priority to P2.

Not a SOF FW issue.

lgirdwood avatar Aug 10 '22 08:08 lgirdwood

reopened to see if this still happens with cSOF

plbossart avatar Aug 26 '22 09:08 plbossart

Is this still relevant?

marc-hb avatar May 20 '24 18:05 marc-hb