sof icon indicating copy to clipboard operation
sof copied to clipboard

[BUG] IPC timeout in sof-logger-check

Open keqiaozhang opened this issue 2 years ago • 7 comments

Describe the bug CI detected some different failures when checking sof-logger:

  1. on ADLP_GMB_I2S_ZEPHYR:http://sof-ci.sh.intel.com/#/result/planresultdetail/14129?model=ADLP_GMB_I2S_ZEPHYR&testcase=check-sof-logger logger.error
error: in logger_read(), fread(..., /sys/kernel/debug/sof/trace) failed: Operation not permitted(-1)
  1. On APL_UP2_NOCODEC_ZEPHYR:http://sof-ci.sh.intel.com/#/result/planresultdetail/14124?model=APL_UP2_NOCODEC_ZEPHYR&testcase=check-sof-logger ** console output**
2022-07-20 16:19:15 UTC [REMOTE_INFO] Trying to get the DMA .ldc trace log with background sof-logger ...
2022-07-20 16:19:15 UTC [REMOTE_COMMAND] sudo /usr/local/bin/sof-logger  -t -f 3 -l  /etc/sof/sof-apl.ldc   >  /home/ubuntu/sof-test/logs/check-sof-logger/2022-07-20-16:19:11-29866/logger.data.txt  2>  /home/ubuntu/sof-test/logs/check-sof-logger/2022-07-20-16:19:11-29866/logger.error.txt  &
2022-07-20 16:19:17 UTC [REMOTE_ERROR] timeout /usr/local/bin/cavstool.py returned unexpected: 0
2022-07-20 16:19:17 UTC [REMOTE_ERROR] Starting func_exit_handler(), exit status=1, FUNCNAME stack:
2022-07-20 16:19:17 UTC [REMOTE_ERROR]  die()  @  /home/ubuntu/sof-test/case-lib/lib.sh
2022-07-20 16:19:17 UTC [REMOTE_ERROR]  run_loggers()  @  /home/ubuntu/sof-test/test-case/check-sof-logger.sh:114
2022-07-20 16:19:17 UTC [REMOTE_ERROR]  main()  @  /home/ubuntu/sof-test/test-case/check-sof-logger.sh:196
2022-07-20 16:19:17 UTC [REMOTE_ERROR]  main()  @  /home/ubuntu/sof-test/test-case/check-sof-logger.sh:263
  1. sof-logger check caused IPC timed out for 0x30020000 (msg/reply size: 12/12): https://sof-ci.sh.intel.com/#/result/planresultdetail/14104?model=ADLP_GMB_I2S_ZEPHYR&testcase=check-sof-logger ** dmesg**
[   94.492313] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30020000: GLB_TPLG_MSG: COMP_FREE
[   95.001159] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx timed out for 0x30020000 (msg/reply size: 12/12)
[   95.001164] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: preventing DSP entering D3 state to preserve context
[   95.001166] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump start ]------------
[   95.001180] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda irq intsts 0x00000000 intlctl 0xc0000001 rirb 00
[   95.001181] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: dsp irq ppsts 0x00000000 adspis 0x00000000
[   95.001187] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: error: host status 0x00000000 dsp status 0x00000000 mask 0x00000003
[   95.001188] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump end ]------------
[   95.001189] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[   95.001190] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: IPC timeout
[   95.001191] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (6)
[   95.001207] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: 0x00000005: module: ROM, state: FW_ENTERED, running
[   95.001460] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: unexpected fault 0x00000000 trace 0x00004000
[   95.001461] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------
[   95.001464] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: failed to free widget SSP2.IN

keqiaozhang avatar Jul 21 '22 01:07 keqiaozhang

Earlier there was this, I don't remember seeing it before but maybe I missed it. Hopefully unrelated.

[   90.723153] kernel: snd_hda_codec_hdmi ehdaudio0D2: ASoC: sink widget AIF1TX overwritten
[   90.723158] kernel: snd_hda_codec_hdmi ehdaudio0D2: ASoC: source widget AIF1RX overwritten
[   90.749485] kernel: max98390 i2c-MX98390:00: Direct firmware load for dsm_param_R_Google_Gimble4ES.bin failed with error -2
[   90.749516] kernel: max98390 i2c-MX98390:00: Direct firmware load for dsm_param.bin failed with error -2
[   90.749543] kernel: max98390 i2c-MX98390:00: Direct firmware load for dsmparam.bin failed with error -2
[   90.776064] kernel: max98390 i2c-MX98390:01: Direct firmware load for dsm_param_L_Google_Gimble4ES.bin failed with error -2
[   90.776094] kernel: max98390 i2c-MX98390:01: Direct firmware load for dsm_param.bin failed with error -2
[   90.776120] kernel: max98390 i2c-MX98390:01: Direct firmware load for dsmparam.bin failed with error -2
[   90.780714] kernel: snd_sof:sof_pcm_new: sof-audio-pci-intel-tgl 0000:00:1f.3: creating new PCM DMIC16kHz

marc-hb avatar Jul 21 '22 05:07 marc-hb

I'm afraid cavstool logging recently added by https://github.com/thesofproject/sof-test/pull/929 is triggering these issues , seems to affect thesofproject/sof-test#897 too.

marc-hb avatar Jul 21 '22 05:07 marc-hb

IPC timeout again in daily run 14165?model=ADLP_GMB_I2S_ZEPHYR&testcase=check-sof-logger

This time APL (and everything else) was fine

marc-hb avatar Jul 21 '22 22:07 marc-hb

No check-sof-logger failure in recent daily runs and the zephyr logs are there... maybe someone fixed this while fixing something else?

  • No check-sof-logger failure in daily 14218 Start Time: 2022-07-25 21:29:47 UTC

  • No check-sof-logger failure in daily 14198 Start Time: 2022-07-24 21:29:42 UTC Kernel Branch: topic/sof-dev Kernel Commit: 70f9e7b34a9d SOF Branch: main SOF Commit: 2845fc8cdec2 Zephyr Commit: e81c0b07e248

  • Last failure so far was in daily 14165?model=ADLP_GMB_I2S_ZEPHYR&testcase=check-sof-logger and only on one device Start Time: 2022-07-21 21:29:41 UTC Kernel Branch: topic/sof-dev Kernel Commit: 0d1355b1845e SOF Branch: main SOF Commit: 104c63547f05 Zephyr Commit: 31b9a6acc60a

marc-hb avatar Jul 26 '22 00:07 marc-hb

Failure is back today in daily 14276?model=ADLP_GMB_I2S_ZEPHYR&testcase=check-sof-logger on just one system :-(

Start Time: 2022-07-27 21:29:44 UTC Kernel Branch: topic/sof-dev Kernel Commit: bf69e513b171 SOF Branch: main SOF Commit: e4c2872b60f6 Zephyr Commit: d5222f318d29

marc-hb avatar Jul 27 '22 23:07 marc-hb

I got distracted by the migration to west but I have some good leads, stay tuned.

marc-hb avatar Jul 29 '22 21:07 marc-hb

Earlier there was this, I don't remember seeing it before but maybe I missed it. Hopefully unrelated.

[   90.723153] kernel: snd_hda_codec_hdmi ehdaudio0D2: ASoC: sink widget AIF1TX overwritten
[   90.723158] kernel: snd_hda_codec_hdmi ehdaudio0D2: ASoC: source widget AIF1RX overwritten
[   90.749485] kernel: max98390 i2c-MX98390:00: Direct firmware load for dsm_param_R_Google_Gimble4ES.bin failed with error -2
[   90.749516] kernel: max98390 i2c-MX98390:00: Direct firmware load for dsm_param.bin failed with error -2
[   90.749543] kernel: max98390 i2c-MX98390:00: Direct firmware load for dsmparam.bin failed with error -2
[   90.776064] kernel: max98390 i2c-MX98390:01: Direct firmware load for dsm_param_L_Google_Gimble4ES.bin failed with error -2
[   90.776094] kernel: max98390 i2c-MX98390:01: Direct firmware load for dsm_param.bin failed with error -2
[   90.776120] kernel: max98390 i2c-MX98390:01: Direct firmware load for dsmparam.bin failed with error -2
[   90.780714] kernel: snd_sof:sof_pcm_new: sof-audio-pci-intel-tgl 0000:00:1f.3: creating new PCM DMIC16kHz

Above errors are unrelated to this bug. it's just looking for the firmware dsm_param.bin of the codec in /opt/google/dsm/dsmparam.bin. We just lack DSM algorithm, which will not affect the audio function.

keqiaozhang avatar Aug 02 '22 01:08 keqiaozhang

Won't fix cAVS tool issue. Will switch to new logging tool.

mengdonglin avatar Sep 06 '22 01:09 mengdonglin