LNL HDA pause-release issue
Started seeing this sort of issue today:
https://sof-ci.01.org/linuxpr/PR5044/build3423/devicetest/index.html?model=LNLM_RVP_HDA&testcase=check-pause-resume-capture-100
(100/100) Wait for 172 ms before resume
declare -- cmd="journalctl_cmd --since=@1717691003"
2024-06-06 16:23:56 UTC [REMOTE_INFO] Entering expect script with:
arecord -D hw:0,6 -r 48000 -c 4 -f S32_LE -vv -i /dev/null -q
spawn arecord -D hw:0,6 -r 48000 -c 4 -f S32_LE -vv -i /dev/null -q
Hardware PCM card 0 'sof-hda-dsp' device 6 subdevice 0
Its setup is:
stream : CAPTURE
access : RW_INTERLEAVED
format : S32_LE
subformat : STD
channels : 4
rate : 48000
exact rate : 48000 (48000/1)
msbits : 32
buffer_size : 24000
period_size : 6000
period_time : 125000
tstamp_mode : NONE
tstamp_type : MONOTONIC
period_step : 1
avail_min : 6000
period_event : 0
start_threshold : 1
stop_threshold : 24000
silence_threshold: 0
silence_size : 0
boundary : 6755399441055744000
appl_ptr : 0
hw_ptr : 0
##################################################+| MAX
##################################################+| MAX
##################################################+| MAX
...
##################################################+| MAX
##################################################+| MAX
##################################################+| MAX
##################################################+| MAX
##################################################+| MAX
##################################################+| MAX
##################################################+| MAX
##################################################+| MAX
2024-06-06 16:24:06 UTC [REMOTE_INFO] Starting func_exit_handler(1)
2024-06-06 16:24:06 UTC [REMOTE_ERROR] Starting func_exit_handler(), exit status=1, FUNCNAME stack:
2024-06-06 16:24:06 UTC [REMOTE_ERROR] main() @ /home/ubuntu/sof-test/test-case/check-pause-resume.sh
2024-06-06 16:24:06 UTC [REMOTE_INFO] pkill -TERM -f mtrace-reader.py
2024-06-06 16:24:06 UTC [REMOTE_INFO] nlines=1926 /home/ubuntu/sof-test/logs/check-pause-resume/2024-06-06-16:23:23-6831/mtrace.txt
+ grep -B 2 -A 1 -i --word-regexp -e ERR -e ERROR -e '' -e OSError /home/ubuntu/sof-test/logs/check-pause-resume/2024-06-06-16:23:23-6831/mtrace.txt
2024-06-06 16:24:06 UTC [REMOTE_INFO] ktime=583 sof-test PID=8280: ending
2024-06-06 16:24:06 UTC [REMOTE_INFO] Test Result: FAIL!
Nothing blatantly wrong in the dmesg log or mtrace.
@ujfalusi @fredoh9 @ssavati @marc-hb Is this a regression?
To be honest, I cannot tell what the reason for the failure is.
Also spotted earlier in https://github.com/intel-innersource/drivers.audio.ci.sof-framework/issues/566#issuecomment-2146091310
I don't know what's going on.
I know that this test should first be fixed. I approved this fix from @fredoh9 a long time ago but @plbossart you still had reservations:
- https://github.com/thesofproject/sof-test/pull/931
(I forgot everything about 931)
cc:
- https://github.com/thesofproject/linux/issues/3766
- internal issue # 302
Is this a duplicate?
- https://github.com/thesofproject/sof/issues/9191
seen again in https://sof-ci.01.org/linuxpr/PR5064/build3677/devicetest/index.html?model=LNLM_RVP_HDA&testcase=check-pause-resume-capture-100
@kv2019i another problem to track for 2.10...
@plbossart Ack. Liam moved #9191 to 2.11; pause-resume issues are not blocking the 2.10 release.
More recent reproduction today: https://sof-ci.01.org/sofpr/PR9235/build5580/devicetest/index.html?model=LNLM_RVP_HDA&testcase=multiple-pause-resume-50
Again in June 17th daily 42633?model=LNLM_RVP_HDA&testcase=multiple-pause-resume-50
So this test used to time out because "MAX" didn't match anything expected by the Expect script. I rewrote that script and named it case-lib/apause.exp in https://github.com/thesofproject/sof-test/pull/1218 which was merged today. For now, the rewrite will neither time out nor fail on "MAX" because I wanted the first script version to be "generous" and to assume it could be just a problem with ALSA settings. I needed this so the rewrite could be tested and merged without polluting results too much. But obviously, "MAX" is not always just a problem with ALSA settings. For instance, the initial "MAX pop" TGL issue #3766 does not look like a problem with ALSA settings (please prove me wrong).
That's why "MAX" can be turned into an error with just a one-line change in the new case-lib/apause.exp script. I intend to submit that change after a couple daily test runs.
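To illustrate the warning-versus-error choice described above, here is a minimal Python sketch. It is not the actual case-lib/apause.exp logic (that script is written in Expect); the names `classify_vu_line`, `scan_output` and `max_is_error` are hypothetical, and the regular expression only assumes the VU meter line shape visible in the log at the top of this issue.

```python
import re

# Assumed shape of an `arecord -vv` VU meter line, based on the log in this
# issue: a bar of '#', an optional '+' peak marker, padding, '|', then either
# "MAX" or a percentage such as "07%".
VU_LINE = re.compile(r"^[#+ ]*\|\s*(MAX|\d{1,3}%)\s*$")


def classify_vu_line(line):
    """Return "max", "level", or None when the line is not a VU meter line."""
    match = VU_LINE.match(line.strip())
    if match is None:
        return None
    return "max" if match.group(1) == "MAX" else "level"


def scan_output(lines, max_is_error=False):
    """Count "MAX" lines and report them as a warning or as an error.

    max_is_error is a hypothetical switch standing in for the one-line
    change mentioned above; the default mirrors the "generous" behaviour.
    """
    max_count = sum(1 for line in lines if classify_vu_line(line) == "max")
    if max_count == 0:
        return ("ok", 0)
    return ("error" if max_is_error else "warning", max_count)
```

With the default setting, a capture full of "MAX" lines only produces a warning; passing `max_is_error=True` turns the same output into a failure.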
We won't catch "MAX" volume after all because of the wontfix TGL bug:
- https://github.com/thesofproject/linux/issues/3766
MAX will stay just a WARNING.
Closing, please reply and re-open if you disagree.