[BUG] DAI xrun happened when testing suspend/resume with playback on CML_RVP_NOCODEC_ZEPHYR
Describe the bug CI detected the dai_xrun on CML_RVP_NOCODEC_ZEPHYR when testing the suspend/resume with playback. Test ID:13474
No errors in dmesg, just observed the xrun errors in slogger.
[ 7747735.577549] ( 42430.156250) c0 zll-schedule src/schedule/zephyr_ll.c:144 INFO ll task 0xbe086100 pipe-task avg 564, max 936
[ 7747813.910880] ( 78.333328) c0 zll-schedule src/schedule/zephyr_ll.c:144 INFO ll task 0xbe086240 pipe-task avg 1175, max 1190
[ 8290213.160160] ( 542399.250000) c0 host 8.41 src/audio/host-legacy.c:474 INFO no bytes to copy, 768 free in buffer, 0 available in DMA
[ 8291462.847610] ( 1249.687500) c0 host 8.41 src/audio/host-legacy.c:474 INFO no bytes to copy, 768 free in buffer, 0 available in DMA
[ 8291502.743442] ( 39.895832) c0 dai 3.14 src/audio/dai-legacy.c:1002 WARN dai_copy(): nothing to copy
[ 8292712.639227] ( 1209.895752) c0 host 8.41 src/audio/host-legacy.c:474 INFO no bytes to copy, 768 free in buffer, 0 available in DMA
[ 8292752.430892] ( 39.791664) c0 dai 3.14 src/audio/dai-legacy.c:1002 WARN dai_copy(): nothing to copy
[ 8293962.743344] ( 1210.312500) c0 host 8.41 src/audio/host-legacy.c:474 INFO no bytes to copy, 768 free in buffer, 0 available in DMA
[ 8294002.535009] ( 39.791664) c0 dai 3.14 src/audio/dai-legacy.c:994 WARN dai_copy(): Copy_bytes 0 + avail bytes 8 < period bytes 384, possible glitch
[ 8294014.514175] ( 11.979166) c0 dai 3.14 src/audio/dai-legacy.c:1002 WARN dai_copy(): nothing to copy
[ 8295213.107878] ( 1198.593750) c0 host 8.41 src/audio/host-legacy.c:474 INFO no bytes to copy, 768 free in buffer, 0 available in DMA
[ 8295250.868293] ( 37.760414) c0 dw-dma src/drivers/dw/dma.c:1128 ERROR dw_dma_get_data_size(): xrun detected
[ 8295282.847458] ( 31.979166) c0 dai 3.14 src/audio/dai-legacy.c:923 ERROR dai_report_xrun(): underrun due to no data available
[ 8295308.003707] ( 25.156248) c0 dai 3.14 src/audio/dai-legacy.c:777 ERROR comp_underrun(): dev->comp.id = 14, source->avail = 0, copy_bytes = 0
[ 8295335.868290] ( 27.864582) c0 pipe 3.15 ....../pipeline-stream.c:270 INFO pipe trigger cmd 6
[ 8295361.232872] ( 25.364582) c0 pipe 3.15 ....../pipeline-stream.c:143 ERROR pipeline_copy(): ret = -61, start->comp.id = 14, dir = 1
[ 8295388.107871] ( 26.874998) c0 pipe 3.15 ..../pipeline-schedule.c:187 ERROR pipeline_task(): xrun recovery failed! pipeline is stopped.
[ 8295414.566203] ( 26.458332) c0 zll-schedule src/schedule/zephyr_ll.c:69 INFO task complete 0xbe086240 pipe-task
[ 8295426.128703] ( 11.562500) c0 zll-schedule src/schedule/zephyr_ll.c:71 INFO num_tasks 3 total_num_tasks 3
[ 8296456.649495] ( 1030.520752) c0 zll-schedule src/schedule/zephyr_ll.c:69 INFO task complete 0xbe086100 pipe-task
To Reproduce
TPLG=/lib/firmware/intel/sof-tplg/sof-cnl-nocodec.tplg MODEL=CML_RVP_NOCODEC_ZEPHYR ~/sof-test/test-case/check-suspend-resume-with-audio.sh -l 5 -m playback
Reproduction Rate
under check, no reproductions so far.
Environment 1) * Kernel: sof-dev/076a5996b923 * SOF: main/fe60b033932e * Zephyr: 7f6b762a734b 2) Name of the topology file * Topology: sof-cnl-nocodec.tplg 3) Name of the platform(s) on which the bug is observed. * Platform: CML_RVP_NOCODEC_ZEPHYR
Screenshots or console output
2022-06-21 23:43:54 UTC Sub-Test: [REMOTE_INFO] ===== Round(5/5) =====
2022-06-21 23:43:54 UTC Sub-Test: [REMOTE_COMMAND] Run the command: rtcwake -m mem -s 5
rtcwake: assuming RTC uses UTC ...
rtcwake: wakeup from "mem" using /dev/rtc0 at Tue Jun 21 23:44:00 2022
2022-06-21 23:44:02 UTC Sub-Test: [REMOTE_COMMAND] sleep for 5
2022-06-21 23:44:07 UTC Sub-Test: [REMOTE_INFO] Check for the kernel log status
declare -- cmd="journalctl_cmd --since=@1655855033"
2022-06-21 23:44:07 UTC Sub-Test: [REMOTE_INFO] Check for wakeup_count and suspend_stats
/sys/power/suspend_stats/fail:0
/sys/power/suspend_stats/failed_freeze:0
/sys/power/suspend_stats/failed_prepare:0
/sys/power/suspend_stats/failed_resume:0
/sys/power/suspend_stats/failed_resume_early:0
/sys/power/suspend_stats/failed_resume_noirq:0
/sys/power/suspend_stats/failed_suspend:0
/sys/power/suspend_stats/failed_suspend_late:0
/sys/power/suspend_stats/failed_suspend_noirq:0
/sys/power/suspend_stats/last_failed_dev:
/sys/power/suspend_stats/last_failed_errno:0
/sys/power/suspend_stats/last_failed_step:
/sys/power/suspend_stats/success:60
name active_count event_count wakeup_count expire_count active_since total_time max_time last_change prevent_suspend_time
--- /home/ubuntu/sof-test/logs/check-suspend-resume-with-audio/2022-06-21-23:41:49-4664/initial_wakeup_sources.txt 2022-06-21 23:43:02.333663932 +0000
+++ /sys/kernel/debug/wakeup_sources 2022-06-21 21:32:47.041000000 +0000
@@ -9,13 +9,13 @@
0000:00:14.0 0 0 0 0 0 0 0 0 0
mmc0 0 0 0 0 0 0 0 0 0
alarmtimer.0.auto 0 0 0 0 0 0 0 0 0
-rtc_cmos 55 55 0 0 0 20 0 7485724 0
+rtc_cmos 60 60 0 0 0 22 0 7526685 0
0000:00:1f.6 0 0 0 0 0 0 0 0 0
PNP0C0A:02 0 0 0 0 0 0 0 0 0
PNP0C0A:01 0 0 0 0 0 0 0 0 0
PNP0C0A:00 0 0 0 0 0 0 0 0 0
PNP0C0E:00 0 0 0 0 0 0 0 0 0
-PNP0C0D:00 55 55 0 0 0 0 0 7485596 0
+PNP0C0D:00 60 60 0 0 0 0 0 7526559 0
ADP1 1 1 0 0 0 0 0 1721 0
INT34BB:00 0 0 0 0 0 0 0 0 0
PNP0C0C:00 0 0 0 0 0 0 0 0 0
2022-06-21 23:44:07 UTC Sub-Test: [REMOTE_WARNING] /sys/power/wakeup_count is 121, expected 120
declare -- cmd="journalctl_cmd --since=@1655855033"
+ timeout -s CONT 7 sudo sync
real 0m0.017s
user 0m0.001s
sys 0m0.008s
+ timeout -s CONT 5 dd if=/dev/zero of=/home/ubuntu/HD_TEST_DELETE_ME bs=1M count=100 conv=fsync
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.248591 s, 422 MB/s
+ timeout -s CONT 7 sudo sync
real 0m1.543s
user 0m0.005s
sys 0m0.012s
aplay: pcm_write:2086: write error: Input/output error
2022-06-21 23:44:09 UTC Sub-Test: [REMOTE_INFO] Test Result: PASS!
declare -- cmd="journalctl_cmd --since=@1655854979"
process: 128858, status: not found
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/121/gvfs
Output information may be incomplete.
2022-06-21 23:44:10 UTC [REMOTE_ERROR] process status is abnormal
2022-06-21 23:44:10 UTC [REMOTE_INFO] dump ps for aplay & arecord
ubuntu 129694 127924 0 23:44 ? 00:00:00 grep -E aplay|arecord
2022-06-21 23:44:10 UTC [REMOTE_INFO] dump ps for child process
UID PID PPID C STIME TTY TIME CMD
ubuntu 127991 127924 0 23:41 ? 00:00:00 /bin/bash /home/ubuntu/sof-test/test-case/check-suspend-resume-with-audio.sh -l 5 -m playback
ubuntu 129696 127924 0 23:44 ? 00:00:00 ps --ppid 127924 -f
2022-06-21 23:44:10 UTC [REMOTE_ERROR] Starting func_exit_handler(), exit status=1, FUNCNAME stack:
2022-06-21 23:44:10 UTC [REMOTE_ERROR] main() @ /home/ubuntu/sof-test/test-case/check-suspend-resume-with-audio.sh
2022-06-21 23:44:10 UTC [REMOTE_INFO] Starting /usr/local/bin/sof-logger -l /etc/sof/sof-cml.ldc -o /home/ubuntu/sof-test/logs/check-suspend-resume-with-audio/2022-06-21-23:41:49-4664/etrace.txt
2022-06-21 23:44:12 UTC [REMOTE_INFO] pkill -TERM sof-logger
Terminated
2022-06-21 23:44:13 UTC [REMOTE_INFO] nlines=2 /home/ubuntu/sof-test/logs/check-suspend-resume-with-audio/2022-06-21-23:41:49-4664/etrace.txt
2022-06-21 23:44:13 UTC [REMOTE_INFO] ktime=7539 sof-test PID=127924: ending
+ timeout -s CONT 7 sudo sync
real 0m0.066s
user 0m0.004s
sys 0m0.013s
+ timeout -s CONT 5 dd if=/dev/zero of=/home/ubuntu/HD_TEST_DELETE_ME bs=1M count=100 conv=fsync
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.243662 s, 430 MB/s
+ timeout -s CONT 7 sudo sync
real 0m0.048s
user 0m0.001s
sys 0m0.011s
2022-06-21 23:44:14 UTC [REMOTE_INFO] Test Result: FAIL!
Observed this issue again in daily test:http://sof-ci.sh.intel.com/#/result/planresultdetail/13859?model=CML_RVP_NOCODEC_ZEPHYR&testcase=check-suspend-resume-with-playback-5
This is a very old bug and only happened twice in CI. This model is not in our testing scope now. Closing this bug.