IPC4 failures with cavs reference firmware on WHL_UPEXT
Intel internal tests show failures for ?model=WHL_UPEXT_HDA_IPC4&testcase=multiple-pause-resume-50:
FW reported error: 6 - Unknown error while processing the request of ipc4 set pipeline 10 state 2
The last working test was 12028 (2022-04-24). Tests 12111, 12176, 12202 and 12215 all fail in the same way.
EDIT by @XiaoyunWu6666: in test results 12202 and 12215 the error happened during check-pause-resume-playback-100, before multiple-pause-resume.
[ 2039.198040] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc4 set pipeline 10 state 2
[ 2039.198051] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx : 0x130a0002|0x0: GLB_SET_PIPELINE_STATE
[ 2039.198209] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx reply: 0x33000006|0xa: GLB_SET_PIPELINE_STATE
[ 2039.198223] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: FW reported error: 6 - Unknown error while processing the request
[ 2039.198286] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc error for msg 0x130a0002|0x0
[ 2039.198305] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ASoC: error at soc_dai_trigger on iDisp3 Pin: -22
[ 2039.198323] kernel: HDMI3: ASoC: dpcm_be_dai_trigger() failed at iDisp3 (-22)
[ 2039.198338] kernel: HDMI3: ASoC: trigger FE cmd: 0 failed: -22
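The header words in the log above can be unpacked to cross-check the driver's own messages. The following is a minimal sketch, assuming the IPC4 global-message bit layout as I understand it from the Linux SOF driver headers (message type in bits 24-28, reply flag in bit 29, pipeline ID in bits 16-23 and state in bits 0-15 of a request, status in bits 0-23 of a reply). The function name is hypothetical and the layout should be verified against the kernel sources before relying on it.

```python
# Bit layout is an assumption taken from the Linux SOF IPC4 header
# definitions; double-check it against the kernel tree in use.

def decode_ipc4_primary(primary: int) -> dict:
    """Split an IPC4 global-message primary word into its assumed fields."""
    is_reply = bool((primary >> 29) & 0x1)
    fields = {
        "msg_type": (primary >> 24) & 0x1F,  # 0x13 is GLB_SET_PIPELINE_STATE in the log
        "is_reply": is_reply,
    }
    if is_reply:
        # Replies carry the firmware status code in the low 24 bits.
        fields["status"] = primary & 0xFFFFFF
    else:
        # SET_PIPELINE_STATE requests carry the pipeline ID and target state.
        fields["pipeline_id"] = (primary >> 16) & 0xFF
        fields["state"] = primary & 0xFFFF
    return fields


request = decode_ipc4_primary(0x130A0002)  # "ipc tx" word from the log
reply = decode_ipc4_primary(0x33000006)    # "ipc tx reply" word from the log
print(request)
print(reply)
```

Under that assumed layout the request decodes to pipeline 10, state 2 and the reply to status 6, which matches the "ipc4 set pipeline 10 state 2" and "FW reported error: 6" lines.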
Reproduction rate: not 100%.
How to reproduce:
TPLG=/lib/firmware/intel/avs-tplg/cavs-mixin-mixout-hda.tplg MODEL=WHL_UPEXT_HDA_IPC4 ~/sof-test/test-case/check-pause-resume.sh -c 100 -m playback
OR
TPLG=/lib/firmware/intel/avs-tplg/cavs-mixin-mixout-hda.tplg MODEL=WHL_UPEXT_HDA_IPC4 ~/sof-test/test-case/multiple-pause-resume.sh -r 50
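Because the failure is intermittent, a single run of either command says little. The following is a rough sketch of a wrapper that repeats one of the commands above and counts failing runs. It assumes that the sof-test scripts exit non-zero on failure; the helper name `count_failures` and its `runner` parameter are hypothetical and exist only so the loop can be exercised without hardware.

```python
import os
import subprocess


def count_failures(script, args, extra_env, runs, runner=subprocess.run):
    """Run `script` with `args` `runs` times and return how many runs failed.

    Assumption: a non-zero exit status means the test case failed.
    """
    env = {**os.environ, **extra_env}
    failures = 0
    for _ in range(runs):
        result = runner([os.path.expanduser(script), *args], env=env)
        if result.returncode != 0:
            failures += 1
    return failures
```

On the DUT this could be called as `count_failures("~/sof-test/test-case/multiple-pause-resume.sh", ["-r", "50"], {"TPLG": "/lib/firmware/intel/avs-tplg/cavs-mixin-mixout-hda.tplg", "MODEL": "WHL_UPEXT_HDA_IPC4"}, runs=20)`, where the run count of 20 is an arbitrary illustration.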
Recipe
Kernel Branch: topic/sof-dev
Kernel Commit: 23306d2cb554
TPLG: cavs-mixin-mixout-hda.tplg

@XiaoyunWu6666 @keqiaozhang @ranj063 @bardliao FYI
@plbossart It looks like the issue has been fixed. The logs of 12111 and 12176 are the same as in this issue, but the logs of 12202 and 12215 show:
[ 1704.375888] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_pm_runtime_get on 0000:00:1f.3: -22
[ 1704.375900] kernel: iDisp1: __soc_pcm_open() failed (-22)
[ 1704.375904] kernel: HDMI1: ASoC: dpcm_be_dai_startup() failed at iDisp1 (-22)
[ 1704.375908] kernel: HDMI1: dpcm_fe_dai_startup() failed (-22)
[ 1704.380030] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_pm_runtime_get on 0000:00:1f.3: -22
[ 1704.380037] kernel: iDisp2: __soc_pcm_open() failed (-22)
[ 1704.380040] kernel: HDMI2: ASoC: dpcm_be_dai_startup() failed at iDisp2 (-22)
[ 1704.380043] kernel: HDMI2: dpcm_fe_dai_startup() failed (-22)
And 12222 is PASS. 12111 and 12176 use the same kernel commit 95ec8d32e6b6, and the newer tests use 23306d2cb554. I can reproduce the issue with commit 95ec8d32e6b6 on my TGL Up board, but with 23306d2cb554, which is the latest commit, the test passes. I tried 3 iterations and all of them PASS.
@bardliao I agree that the test passed, but I don't see any significant differences between kernel 23306d2cb5546598c59688a7cb14eef3df832786 and 95ec8d32e6b6355514bbe267f5bef4abb6c8b4b3, and clearly nothing that could explain why things are broken on pause_push/pause_release tests.
Very odd.
I will check whether it still exists and what its reproduction rate is.
EDIT: This issue still exists in result 12257; the reproduction rate is not that high.
@XiaoyunWu6666 you've started at least 10 tests yesterday with the same name but different devices and tests.
Can you please provide a clearer explanation of the number of tests run on WHL_UPEXT and the statistics. "Not that high" isn't a very useful assessment, thank you.
model: WHL_UPEXT_HDA_IPC4
testcases with results:
- multiple-pause-resume-25: 1 FAIL among 4
- check-pause-resume-playback: 1 FAIL among 4

The number of tests I ran yesterday was not large enough to give a statistical probability. I think the only thing they proved is that the issue still exists and can be reproduced by both testcases.
I will try manual tests on the DUT to get more reliable statistics.
Happened again in internal daily test result 12356 on model WHL_UPEXT_HDA_IPC4 during multiple-pause-resume-50.
Reproduction rate: 3/8 in the daily IPC4 tests in May.
- FW reported error: 6 - Unknown error while processing the request of free pipeline widget pipeline.4 in 12444 (May 10)
- FW reported error: 6 - Unknown error while processing the request of ipc4 set pipeline 8 state 2 in 12563 and 12593 (May 13 and 14)
- FW reported error: 6 - Unknown error while processing the request of ipc4 set pipeline 7 state 3 on WHL UPEXT in internal daily IPC4 12637 (May 17)
- FW reported error: 6 - Unknown error while processing the request of freeing pipeline widget pipeline.4 on WHL UPEXT in internal daily IPC4 12668 (May 18)
Statistics for Linux#3626 (IPC4 failures with closed-source firmware on WHL_UPEXT):
7 hits within 17 days in total, 3 hits within the past 7 days. So the reproduction rate is approximately 50 percent. @plbossart
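To put the hit counts in context, the sketch below computes the raw failure fraction and a 95% Wilson score interval for it. It assumes one daily run per day, so that "7 hits within 17 days" is treated as 7 failures in 17 runs. That mapping is an assumption, not something the test reports state.

```python
import math


def wilson_interval(hits, total, z=1.96):
    """Return (low, high) of the Wilson score interval for hits/total."""
    p = hits / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Assumed mapping: one test run per day.
for label, hits, total in [("17 days", 7, 17), ("past 7 days", 3, 7)]:
    low, high = wilson_interval(hits, total)
    print(f"{label}: {hits}/{total} = {hits / total:.0%}, 95% interval {low:.0%} to {high:.0%}")
```

With samples this small the interval is wide, so the estimate of roughly one failure in two runs is compatible with the data but not tightly pinned down by it.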
@RanderWang @lgirdwood, just FYI: I'm not sure whether the multiple IPC timeouts when setting pipeline state on SOF Zephyr IPC4 platforms are related to this or not.
IPC timeouts when setting pipeline state do happen on IPC4 with cavs fw as well. However, the scenario that makes setting the pipeline state fail is different: for SOF IPC4, running playback 100 times can reproduce the IPC issue, while on WHL and TGL, which run cavs fw, the IPC issues happen in pause/release cases.
Thanks
Since this issue only happens on WHL and is not reproducible on TGL or ADL, change the priority to P2.
Not a SOF FW issue.
reopened to see if this still happens with cSOF
Is this still relevant?