sof icon indicating copy to clipboard operation
sof copied to clipboard

[BUG[IPC4] ipc error 9 for msg 0x45010002|0x10003 MOD_BIND on TGLU_UP_HDA_IPC4ZPH

Open keqiaozhang opened this issue 3 years ago • 4 comments

Describe the bug We observed this issue in our CI test, this issue happens on TGLU_UP_HDA_IPC4ZPH when testing 'simultaneous-playback-capture-25', It fails after several iterations. The reproduction is 100%.

[   76.013310] kernel: snd_sof:sof_ipc4_route_setup: sof-audio-pci-intel-tgl 0000:00:1f.3: bind mixin.4.1 -> mixout.3.1
[   76.013312] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx      : 0x45010002|0x10003: MOD_BIND
[   76.013445] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx reply: 0x65000009|0x10003: MOD_BIND
[   76.013447] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW reported error: 9 - Specified resource not found
[   76.013457] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc error for msg 0x45010002|0x10003
[   76.013459] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: sof_ipc4_route_setup: failed to bind modules mixin.4.1 -> mixout.3.1
[   76.013463] kernel: snd_sof:sof_widget_free: sof-audio-pci-intel-tgl 0000:00:1f.3: widget copier.HDA.4.1 freed
[   76.013465] kernel: snd_sof:sof_widget_free: sof-audio-pci-intel-tgl 0000:00:1f.3: widget gain.4.1 freed

To Reproduce TPLG=/lib/firmware/intel/avs-tplg/cavs-mixin-mixout-hda.tplg MODEL=TGLU_UP_HDA_IPC4ZPH ~/sof-test/test-case/simultaneous-playback-capture.sh -l 25

Reproduction Rate 100%

dmesg image

Environment

  1. Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver). Kernel Branch: topic/sof-dev Kernel Commit: 4e6ca75f7357 SOF Branch: main SOF Commit: 1044e004e4ec Zephyr Commit: zephyr-v3.1.0-3042-g8e55e59c5917
  2. Name of the topology file
    • Topology: cavs-mixin-mixout-hda.tplg
  3. Name of the platform(s) on which the bug is observed.
    • Platform: TGLU_UP_HDA_IPC4ZPH

console output

2022-08-29 21:52:31 UTC [REMOTE_INFO] ===== Testing: (Loop: 10/25) =====
2022-08-29 21:52:31 UTC [REMOTE_COMMAND] aplay -D hw:0,0 -c 2 -r 48000 -f S16_LE /dev/zero -q &
2022-08-29 21:52:31 UTC [REMOTE_COMMAND] arecord -D hw:0,0 -c 2 -r 48000 -f S16_LE /dev/null -q &
2022-08-29 21:52:31 UTC [REMOTE_INFO] Preparing to sleep for 5
arecord: set_params:1432: Unable to install hw params:
ACCESS:  RW_INTERLEAVED
FORMAT:  S16_LE
SUBFORMAT:  STD
SAMPLE_BITS: 16
FRAME_BITS: 32
CHANNELS: 2
RATE: 48000
PERIOD_TIME: (85333 85334)
PERIOD_SIZE: 4096
PERIOD_BYTES: 16384
PERIODS: 4
BUFFER_TIME: (341333 341334)
BUFFER_SIZE: 16384
BUFFER_BYTES: 65536
TICK_TIME: 0
2022-08-29 21:52:36 UTC [REMOTE_INFO] check pipeline after 5s
/home/ubuntu/sof-test/test-case/simultaneous-playback-capture.sh: line 107: kill: (2078) - No such process
2022-08-29 21:52:36 UTC [REMOTE_ERROR] Error in arecord process after sleep.
/home/ubuntu/sof-test/test-case/simultaneous-playback-capture.sh: line 67: kill: (2078) - No such process

dmesg.txt

keqiaozhang avatar Aug 30 '22 02:08 keqiaozhang

Observed another type of IPC error 104 for msg 0x40030004|0x30015: MOD_INIT_INSTANCE on TGLU_UP_HDA_IPC4ZPH.

[  464.272868] kernel: snd_sof:sof_widget_setup: sof-audio-pci-intel-tgl 0000:00:1f.3: widget gain.3.1 setup complete
[  464.272869] kernel: snd_sof:sof_ipc4_widget_setup: sof-audio-pci-intel-tgl 0000:00:1f.3: Create widget copier.host.3.1 instance 3 - pipe 3 - core 0
[  464.272870] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx      : 0x40030004|0x30015: MOD_INIT_INSTANCE [data size: 84]
[  464.273158] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx reply: 0x60000068|0x30015: MOD_INIT_INSTANCE
[  464.273160] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW reported error: 104 - Other failure of module instance initialization request
[  464.273168] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc error for msg 0x40030004|0x30015
[  464.273170] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: failed to create module copier.host.3.1
[  464.273172] kernel: snd_sof:sof_widget_free: sof-audio-pci-intel-tgl 0000:00:1f.3: widget gain.3.1 freed

keqiaozhang avatar Sep 16 '22 01:09 keqiaozhang

With https://github.com/thesofproject/sof/pull/6318 merged, I now hit this exact case on TGL nocodec topology configuration (with pending fixes to #6220 applied locally).

kv2019i avatar Oct 04 '22 17:10 kv2019i

Kernel logs shows:

[  250.273037] sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx reply: 0x60000068|0x70015: MOD_INIT_INSTANCE
[  250.273039] sof-audio-pci-intel-tgl 0000:00:1f.3: FW reported error: 104 - Other failure of module instance initialization request
[  250.273048] sof-audio-pci-intel-tgl 0000:00:1f.3: ipc error for msg 0x40010004|0x70015
[  250.273052] sof-audio-pci-intel-tgl 0000:00:1f.3: failed to create module copier.host.7.1
[  250.273055] sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx      : 0x12070000|0x0: GLB_DELETE_PIPELINE
[  250.273213] sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx reply: 0x32000000|0x0: GLB_DELETE_PIPELINE
[  250.273225] sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx done : 0x12070000|0x0: GLB_DELETE_PIPELINE
[  250.273227] sof-audio-pci-intel-tgl 0000:00:1f.3: widget pipeline.7 freed
[  250.273228] sof-audio-pci-intel-tgl 0000:00:1f.3: widget gain.7.1 freed

FW log

[    9.624000] <inf> pipe: pipeline new pipe_id 7 priority 0
[    9.624000] <inf> module_adapter: module_init() start
[    9.624000] <inf> module_adapter: module_init() done
[    9.624000] <inf> ipc: buffer new size 0x0 id 7.0 flags 0x0
[    9.624000] <err> buffer: buffer_alloc(): new size = 0 is invalid
[    9.624000] <err> copier: unenable to create host
[    9.624000] <err> ipc: error: failed to init module 4 : 1
[    9.624000] <err> ipc: ipc4: 1 failed err 104

So it seems create_endpoint_buffer() is passed invalid configuration. "copier_cfg->base.obs" must be zero in this case. Assigning to myself for further debug.

kv2019i avatar Oct 04 '22 17:10 kv2019i

Curious finding today. I can hit this error with "sof-test/test-case/check-capture.sh -d 1 -l 100 -r 1". UPDATE: this was done with https://github.com/thesofproject/sof/pull/6318 included (has been since been reverted in mainline). so mixers and gains were used in the topology for nocodec.

When built with XCC, I observe 1/10 failures (1/1000 reproduction rate, 10100 runs) When built with gcc (Zephyr SDK 0.15), passrate is 100%. I've let the gcc test run for 20100 iterations and still no errors.

This may be related to timing as XCC creates faster code (e.g. hifi enabled), but curious finding nevertheless.

kv2019i avatar Oct 07 '22 14:10 kv2019i

This issue is not reproducible in CI for days. Closing this one.

keqiaozhang avatar Oct 24 '22 06:10 keqiaozhang