sof icon indicating copy to clipboard operation
sof copied to clipboard

[BUG] Failed capture tests on DELL SKU 0A32 with SOF 2.2

Open plbossart opened this issue 2 years ago • 9 comments

Describe the bug

Running sof-test with SOF 2.2 production-signed firmware results in failure for all capture tests

FAIL: capture-d100l1r1; capture_d1l100r1; capture_d1l1r50;

This is on a device with Fedora 36 installed (pipewire disabled).

To Reproduce TPLG=/lib/firmware/intel/sof-tplg/sof-tgl-rt711-rt1316-rt714.tplg MODEL=TGLU_SKU0A32_SDCA ~/sof-test/test-case/check-capture.sh -d 10 -l 1 -r 1

Reproduction Rate 100%

Expected behavior This device seems to be perfectly supported by SOF in Intel daily test, something is off in either firmware or environment.

Impact What impact does this issue have on your progress (e.g., annoyance, showstopper)

Environment

  1. Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).
    • Kernel: {55946d870a22796486c0bb97463945ba3c82c99a (HEAD -> merge/sound-upstream-20220801
    • SOF: 2.2 production signed
    • sof-test 6544bc19b5537e61febc6fd164f08de908ccbc49
  2. Name of the topology file
    • Topology: sof-tgl-rt711-rt1316-rt714.tplg
  3. Name of the platform(s) on which the bug is observed.
    • Platform: DELL SKU 0A32

Screenshots or console output

[root@fedora ~]# TPLG=/lib/firmware/intel/sof-tplg/sof-tgl-rt711-rt1316-rt714.tplg MODEL=TGLU_SKU0A32_SDCA ~/sof-test/test-case/check-capture.sh -d 10 -l 1 -r 1
2022-08-01 20:40:30 UTC [INFO] ktime=680 sof-test PID=1712: starting
2022-08-01 20:40:30 UTC [INFO] DUT is running IPC3 mode
2022-08-01 20:40:30 UTC [INFO] Starting /usr/local/bin/sof-logger -t -l /lib/firmware/intel/sof/sof-tgl.ldc
2022-08-01 20:40:30 UTC [INFO] /root/sof-test/test-case/check-capture.sh will use topology /usr/lib/firmware/intel/sof-tplg-v2.2/sof-tgl-rt711-rt1316-rt714.tplg to run the test case
2022-08-01 20:40:30 UTC [INFO] Pipeline list to ignore is specified, will ignore 'pcm=HDA Digital' in test case
2022-08-01 20:40:30 UTC [INFO] Run command to get pipeline parameters
2022-08-01 20:40:30 UTC [COMMAND] sof-tplgreader.py /usr/lib/firmware/intel/sof-tplg-v2.2/sof-tgl-rt711-rt1316-rt714.tplg -f 'type:capture & id:any' -b ' pcm:HDA Digital' -s 0 -e
2022-08-01 20:40:30 UTC [INFO] ===== Testing: (Round: 1/1) (PCM: Jack In [hw:0,1]<capture>) (Loop: 1/1) =====
2022-08-01 20:40:30 UTC [INFO] no file prefix, use /dev/null as dummy capture output
2022-08-01 20:40:30 UTC [COMMAND] arecord   -Dhw:0,1 -r 48000 -c 2 -f S16_LE -d 10 /dev/null -v -q
Hardware PCM card 0 'sof-soundwire' device 1 subdevice 0
Its setup is:
  stream       : CAPTURE
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 16
  buffer_size  : 16384
  period_size  : 4096
  period_time  : 85333
  tstamp_mode  : NONE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 4096
  period_event : 0
  start_threshold  : 1
  stop_threshold   : 16384
  silence_threshold: 0
  silence_size : 0
  boundary     : 4611686018427387904
  appl_ptr     : 0
  hw_ptr       : 0
2022-08-01 20:40:41 UTC [ERROR] Starting func_exit_handler(), exit status=1, FUNCNAME stack:
2022-08-01 20:40:41 UTC [ERROR]  arecord_opts()  @  /root/sof-test/test-case/../case-lib/lib.sh
2022-08-01 20:40:41 UTC [ERROR]  main()  @  /root/sof-test/test-case/check-capture.sh:97
2022-08-01 20:40:41 UTC [INFO] Starting /usr/local/bin/sof-logger -l /lib/firmware/intel/sof/sof-tgl.ldc
2022-08-01 20:40:43 UTC [INFO] pkill -TERM sof-logger
Terminated
2022-08-01 20:40:44 UTC [INFO] nlines=3 /root/sof-test/logs/check-capture/2022-08-01-16:40:30-31205/etrace.txt
2022-08-01 20:40:44 UTC [INFO] ktime=694 sof-test PID=1712: ending
+ timeout -s CONT 7 sudo sync

real	0m0.032s
user	0m0.012s
sys	0m0.009s
+ timeout -s CONT 5 dd if=/dev/zero of=/root/HD_TEST_DELETE_ME bs=1M count=100 conv=fsync
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0646952 s, 1.6 GB/s
+ timeout -s CONT 7 sudo sync

real	0m0.038s
user	0m0.003s
sys	0m0.026s
2022-08-01 20:40:45 UTC [INFO] Test Result: FAIL!

it's not exactly clear what happens from the log.

plbossart avatar Aug 01 '22 20:08 plbossart

The same test seems to pass in Intel internal tests https://sof-ci.sh.intel.com/#/result/planresultdetail/14339?model=TGLU_SKU0A32_SDCA&testcase=check-capture-10sec

@keqiaozhang @marc-hb any ideas what could have gone sideways or be dependent on Ubuntu v. Fedora setups?

plbossart avatar Aug 01 '22 20:08 plbossart

2022-08-01 20:40:41 UTC [ERROR]  arecord_opts()  @  /root/sof-test/test-case/../case-lib/lib.sh
2022-08-01 20:40:41 UTC [ERROR]  main()  @  /root/sof-test/test-case/check-capture.sh:97

It looks like arecord is simply crashing without any error message.

There's no sof-test error message either because a long time I made the mistake of adding set -e before fixing shellcheck warnings:

In test-case/check-capture.sh line 98:
                if [[ $? -ne 0 ]]; then
                      ^-- SC2181: Check exit code directly with e.g. 'if mycmd;', not indirectly with $?.

can you please try this fix?

--- a/test-case/check-capture.sh
+++ b/test-case/check-capture.sh
@@ -94,10 +94,10 @@ do
                     dlogi "using $file as capture output"
                 fi
 
-                arecord_opts -D$dev -r $rate -c $channel -f $fmt_elem -d $duration $file -v -q
-                if [[ $? -ne 0 ]]; then
+                arecord_opts -D$dev -r $rate -c $channel -f $fmt_elem -d $duration $file -v -q || {
                     func_lib_lsof_error_dump $snd
                     die "arecord on PCM $dev failed at $i/$loop_cnt."
-                fi
+                }
             done
         done

marc-hb avatar Aug 01 '22 21:08 marc-hb

@marc-hb thanks, I removed the spurious leftover 'fi' and looks indeed like an arecord status issue

arecord   -Dhw:0,1 -r 48000 -c 2 -f S16_LE -d 10 /dev/null -v -q || echo "this arecord failed"
Hardware PCM card 0 'sof-soundwire' device 1 subdevice 0
Its setup is:
  stream       : CAPTURE
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 16
  buffer_size  : 16384
  period_size  : 4096
  period_time  : 85333
  tstamp_mode  : NONE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 4096
  period_event : 0
  start_threshold  : 1
  stop_threshold   : 16384
  silence_threshold: 0
  silence_size : 0
  boundary     : 4611686018427387904
  appl_ptr     : 0
  hw_ptr       : 0
this arecord failed

plbossart avatar Aug 01 '22 22:08 plbossart

@plbossart I assume if we arecord to a file rather than /dev/null the file is empty ? i.e. we dont capture any data ?

lgirdwood avatar Aug 02 '22 09:08 lgirdwood

@marc-hb thanks, I removed the spurious leftover 'fi'

Thanks, I edited my comment above. As you have tested it, could you please submit this as a quick sof-test fix? Thanks in advance.

marc-hb avatar Aug 02 '22 10:08 marc-hb

@plbossart I assume if we arecord to a file rather than /dev/null the file is empty ? i.e. we dont capture any data ?

indeed checking the file size would be a good way to check that something was captured...

plbossart avatar Aug 02 '22 14:08 plbossart

@plbossart @marc-hb So is there any bug aside arecord crashing on Fedora?

kv2019i avatar Aug 16 '22 07:08 kv2019i

@plbossart @marc-hb So is there any bug aside arecord crashing on Fedora?

it's not really crashing @kv2019i, only returning a status that is understood as a failure. That must be an upstream change IMHO, but I don't really have time to bisect it this week.

edit: I meant alsa-utils upstream change.

plbossart avatar Aug 16 '22 11:08 plbossart

Ack @plbossart , let's keep this open. Just wondering whether this is the right tracker for the bug, or is this is a sof-test issue. But got it, we don't really know yet.

kv2019i avatar Aug 16 '22 15:08 kv2019i

root caused to https://github.com/alsa-project/alsa-utils/issues/163, this is a regression in alsa-utils release 1.2.7, fixed in alsa-utils upstream.

plbossart avatar Aug 22 '22 14:08 plbossart