linux icon indicating copy to clipboard operation
linux copied to clipboard

[BUG] FW reported error: 104 - Other failure of module instance initialization request

Open fredoh9 opened this issue 2 years ago • 44 comments

Describe the bug During check-suspend-resume-with-playback-5.sh test, FW reported error: 104 is occurred.

[ 4410.777465] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x60000068|0x10036: MOD_INIT_INSTANCE
[ 4410.777468] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: FW reported error: 104 - Other failure of module instance initialization request
[ 4410.777519] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc error for msg 0x40010004|0x10036
[ 4410.777548] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: failed to create module alh-copier.SDW0-Playback.0

Intel internal daily test link:

  • planresultdetail/30296?model=MTLP_RVP_SDW&testcase=check-suspend-resume-with-playback-5
  • planresultdetail/30296?model=MTLP_SDW_AIOC&testcase=check-suspend-resume-with-playback-5

To Reproduce TPLG=/lib/firmware/intel/sof-ace-tplg/sof-mtl-rt711-4ch.tplg MODEL=MTLP_RVP_SDW SOF_TEST_INTERVAL=5 ~/sof-test/test-case/check-suspend-resume-with-audio.sh -l 5 -m playback

Reproduction Rate Starting from today. It happened in MTLP_RVP_SDW and MTLP_SDW_AIOC.

Environment

Screenshots or console output

fredoh9 avatar Aug 17 '23 17:08 fredoh9

Found in ADLP_SKU0B00_SDCA also from today's daily test.

Intel internal daily test link: planresultdetail/30853?model=ADLP_SKU0B00_SDCA-cavs2_5-ipc4&testcase=check-suspend-resume-with-playback-5

fredoh9 avatar Aug 29 '23 17:08 fredoh9

Not reproduced for 3 days

fredoh9 avatar Sep 01 '23 18:09 fredoh9

This issue happened again: planresultdetail/31237?model=MTLP_SDW_AIOC-ace1_0-ipc4&testcase=check-suspend-resume-with-playback-5 planresultdetail/31270?model=MTLP_RVP_SDW-ace1_0-ipc4&testcase=check-signal-stop-start-playback-50

keqiaozhang avatar Sep 05 '23 09:09 keqiaozhang

happen again

[ 1445.894941] kernel: snd_sof:sof_ipc4_widget_setup: sof-audio-pci-intel-mtl 0000:00:1f.3: Create widget alh-copier.SDW0-Playback.0 instance 1 - pipe 1 - core 0
[ 1445.894944] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x40010004|0x10036: MOD_INIT_INSTANCE [data size: 216]
[ 1445.896264] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x60000068|0x10036: MOD_INIT_INSTANCE
[ 1445.896267] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: FW reported error: 104 - Other failure of module instance initialization request
[ 1445.896280] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc error for msg 0x40010004|0x10036
[ 1445.896287] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: failed to create module alh-copier.SDW0-Playback.0
[ 1445.896293] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget gain.1.1 freed
[ 1445.896297] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x12010000|0x0: GLB_DELETE_PIPELINE
[ 1445.896673] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x32000000|0x0: GLB_DELETE_PIPELINE
[ 1445.896683] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx done : 0x12010000|0x0: GLB_DELETE_PIPELINE
[ 1445.896686] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget pipeline.1 freed
[ 1445.896688] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget mixout.1.1 freed
[ 1445.896691] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget mixin.15.1 freed
[ 1445.896693] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget gain.15.1 freed
[ 1445.896696] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x12000000|0x0: GLB_DELETE_PIPELINE
[ 1445.897189] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x32000000|0x0: GLB_DELETE_PIPELINE
[ 1445.897198] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx done : 0x12000000|0x0: GLB_DELETE_PIPELINE
[ 1445.897201] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget pipeline.15 freed
[ 1445.897203] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget host-copier.31.playback freed
[ 1445.897205] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
[ 1445.897210] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 31 dir 0
[ 1445.897214] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -22

planresultdetail/31496?model=MTLP_RVP_SDW-ace1_0-ipc4&testcase=check-signal-stop-start-playback-50

fredoh9 avatar Sep 08 '23 15:09 fredoh9

We haven't observed this issue in v2.7 stress test.

mengdonglin avatar Sep 14 '23 01:09 mengdonglin

Tried a few days test but can't reproduce it

RanderWang avatar Sep 25 '23 01:09 RanderWang

@fredoh9 any recent sightings at your end ?

lgirdwood avatar Sep 29 '23 14:09 lgirdwood

I haven't seen the same issue for a week+.

fredoh9 avatar Sep 29 '23 16:09 fredoh9

I haven't seen the same issue for a week+.

ok, pls close if not seen again by end of week.

lgirdwood avatar Oct 02 '23 15:10 lgirdwood

Found today again with MTLP_SDW_AIOC

Intel Internal result: planresultdetail/32587?model=MTLP_SDW_AIOC-ace1_0-ipc4&testcase=check-suspend-resume-with-playback-5

fredoh9 avatar Oct 02 '23 16:10 fredoh9

Also reproduced today Oct 3rd in https://sof-ci.01.org/sofpr/PR8273/build13627/devicetest/index.html?model=MTLP_RVP_SDW&testcase=check-signal-stop-start-playback-10

marc-hb avatar Oct 03 '23 17:10 marc-hb

Reproduced today.

Intel internal daily test result: planresultdetail/34066?model=MTLP_SDW_AIOC&testcase=check-suspend-resume-with-playback-5

fredoh9 avatar Nov 01 '23 18:11 fredoh9

@fredoh9 there's nothing in the mtrace that can be traced to the 104 error code so it's really hard to provide feedback.

plbossart avatar Nov 01 '23 18:11 plbossart

No reproductions in CI. Closing this bug.

keqiaozhang avatar Nov 16 '23 05:11 keqiaozhang

Observed this issue again on MTLP_SDW_AIOC platform. Test ID:34965.

dmesg

[ 6026.369056] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x40010004|0x10036: MOD_INIT_INSTANCE [data size: 216]
[ 6026.371202] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x60000068|0x10036: MOD_INIT_INSTANCE
[ 6026.371206] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: FW reported error: 104 - Other failure of module instance initialization request
[ 6026.371223] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc error for msg 0x40010004|0x10036
[ 6026.371226] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: failed to create module alh-copier.SDW0-Playback.0
[ 6026.371231] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget gain.1.1 freed
[ 6026.371235] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x12010000|0x0: GLB_DELETE_PIPELINE
[ 6026.371619] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x32000000|0x0: GLB_DELETE_PIPELINE
[ 6026.371640] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx done : 0x12010000|0x0: GLB_DELETE_PIPELINE
[ 6026.371646] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget pipeline.1 freed
[ 6026.371650] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget mixout.1.1 freed
[ 6026.371655] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget mixin.15.1 freed
[ 6026.371658] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget gain.15.1 freed
[ 6026.371661] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x12000000|0x0: GLB_DELETE_PIPELINE
[ 6026.372035] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x32000000|0x0: GLB_DELETE_PIPELINE
[ 6026.372049] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx done : 0x12000000|0x0: GLB_DELETE_PIPELINE
[ 6026.372051] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget pipeline.15 freed
[ 6026.372053] kernel: snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-mtl 0000:00:1f.3: widget host-copier.31.playback freed
[ 6026.372055] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
[ 6026.372059] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 31 dir 0
[ 6026.372062] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -22

keqiaozhang avatar Nov 22 '23 01:11 keqiaozhang

P2 so not gating SOF2.8, pushing out to v2.9.

kv2019i avatar Dec 07 '23 15:12 kv2019i

all the error happened with dai copier

RanderWang avatar Dec 08 '23 07:12 RanderWang

It happened again today.

Intel internal test link: planresultdetail/35987?model=MTLP_SDW_AIOC&testcase=check-suspend-resume-with-playback-5

fredoh9 avatar Dec 15 '23 18:12 fredoh9

Reproduced on ww51.3 on MTLP_SDW_AIOC, report ID 36158. Failed case: check-signal-stop-start-playback-50

mengdonglin avatar Dec 21 '23 05:12 mengdonglin

Again here: https://sof-ci.01.org/sofpr/PR8180/build1804/devicetest/index.html?model=TGLU_RVP_SDW-ipc4&testcase=check-signal-stop-start-playback-10

marc-hb avatar Jan 12 '24 01:01 marc-hb

It is always sdw dai

RanderWang avatar Jan 12 '24 05:01 RanderWang

Again

  • https://sof-ci.01.org/sofpr/PR8755/build1983/devicetest/index.html?model=TGLU_RVP_SDW-ipc4&testcase=check-signal-stop-start-playback-10
  • https://sof-ci.01.org/sofpr/PR8571/build2022/devicetest/index.html?model=MTLP_SDW_AIOC&testcase=check-suspend-resume-with-playback-5 thesofproject/sof#8571
  • https://sof-ci.01.org/sofpr/PR8750/build2052/devicetest/index.html?model=MTLP_SDW_AIOC&testcase=check-suspend-resume-with-playback-5

marc-hb avatar Jan 18 '24 02:01 marc-hb

Thanks, captured

[  768.707985] <err> dai: dai_get: dai_get: failed to get dai with index -7 type 4
[  768.707996] <err> dai_comp: dai_common_new: comp:1 0x10004 dai_new(): dai_get() failed to create DAI.
[  768.708016] <err> copier: copier_dai_create: comp:1 0x10004 failed to create dai
[  768.708048] <err> copier: copier_init: comp:1 0x10004 unable to create dai
[  768.708081] <err> module_adapter: module_init: comp:1 0x10004 module_init() error -19: module specific init failed, comp id 65540
[  768.708095] <err> module_adapter: module_adapter_new: comp:1 0x10004 module_adapter_new() -19: module initialization failed

RanderWang avatar Jan 18 '24 03:01 RanderWang

@bardliao the alh dai index sent by host kernel is 0, and it is decreased with IPC4_ALH_DAI_INDEX_OFFSET in FW, so got -7, can you help to check it ? or it is for LNL so we don't need to -7 ? Thanks!

#ifdef CONFIG_DMA_INTEL_ADSP_GPDMA
#define IPC4_ALH_DAI_INDEX_OFFSET 7
#else
#define IPC4_ALH_DAI_INDEX_OFFSET 0
#endif

/* copier id = (group id << 4) + codec id + IPC4_ALH_DAI_INDEX_OFFSET
 * dai_index = (group id << 8) + codec id;
 */
#define IPC4_ALH_DAI_INDEX(x) ((((x) & 0xF0) << DAI_NUM_ALH_BI_DIR_LINKS_GROUP) + \
				(((x) & 0xF) - IPC4_ALH_DAI_INDEX_OFFSET))

RanderWang avatar Jan 18 '24 03:01 RanderWang

@RanderWang For LNL, we don't set dai_index. Is it required for FW? @plbossart Any thought?

bardliao avatar Jan 18 '24 05:01 bardliao

@RanderWang For LNL, we don't set dai_index. Is it required for FW? @plbossart Any thought?

I just checked with @RanderWang. The issue is seen on TGL, not LNL. So, need to look deeper into the issue. @plbossart

bardliao avatar Jan 18 '24 05:01 bardliao

@bardliao @RanderWang I have no idea what this index of 7 refers to,

I also don't know why this would be specific to TGL, if there was really an issue it would be seen for MTL as well?

plbossart avatar Jan 18 '24 09:01 plbossart

@bardliao @RanderWang I have no idea what this index of 7 refers to,

I also don't know why this would be specific to TGL, if there was really an issue it would be seen for MTL as well?

It was also reproduced on MTL on above comments like https://github.com/thesofproject/linux/issues/4832

RanderWang avatar Jan 19 '24 01:01 RanderWang

The reason is a little tricky. Headphone and Deepbuffer share the same BE (dai-copier), when Headphone stream is being closed and Deepbuffer is being set up, the error will happen.

The node_id is set up by

[  595.659311] kernel:  sof_ipc4_dai_config+0x4b/0x170 [snd_sof]
[  595.659348] kernel:  hda_dai_config+0x62/0xa0 [snd_sof_intel_hda_common]
[  595.659381] kernel:  sdw_params_stream+0x51/0x70 [snd_sof_intel_hda_generic]
[  595.659394] kernel:  intel_hw_params+0x128/0x250 [soundwire_intel]
[  595.659414] kernel:  snd_soc_dai_hw_params+0x39/0xa0 [snd_soc_core]
[  595.659461] kernel:  __soc_pcm_hw_params+0x553/0x720 [snd_soc_core]
[  595.659549] kernel:  dpcm_be_dai_hw_params+0x270/0x3f0 [snd_soc_core]

		if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_OPEN) &&
		    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_HW_PARAMS) &&
		    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_HW_FREE))
			continue;

[  595.659638] kernel:  dpcm_fe_dai_hw_params+0xd0/0x220 [snd_soc_core]
[  595.659675] kernel:  snd_pcm_hw_params+0x3ba/0x6a0 [snd_pcm]

(1) HPD (headphone stream) is being closed and Deepbuffer is being setup (2) HPD pipeline is free by SOF and also its dai copier (node_id is clear to 0) (3) when checking BE for Deepbuffer in dpcm_be_dai_hw_params, find the BE is also used by HPD and the state is SND_SOC_DPCM_STATE_STOP, so skip above intel_hw_params-> sdw_params_stream ... > sof_ipc4_dai_config (4) Deepbuffer stream sends dai-copier to fw (node_id is not set for (3)) (5) HPD BE is set to SND_SOC_DPCM_STATE_HW_FREE

RanderWang avatar Jan 19 '24 07:01 RanderWang

Why we can't rely on use_count to share the sof widget between HPD & Deepbuffer stream ? The problem is that: we set up widget list in hw_params function but Deepbuffer stream is just open and before hw_params, so the use_count can't be increase at this time

Solution: set up widget list in pcm_open function and add use_count here and initialize widget in hw_param function

RanderWang avatar Jan 19 '24 08:01 RanderWang