
[BUG] [stable-v2.2] [NOCODEC] ERROR pipeline_comp_reset(): failed to recover (multiple-pipeline-all)

Open marc-hb opened this issue 1 year ago • 8 comments

Describe the bug

On ADLP_RVP_NOCODEC-ipc3 and APL_UP2_NOCODEC-ipc3, using all pipelines concurrently prints this error repeatedly.

EDIT: the same error is also seen in some TWL configurations.

This error is likely old, discovered only now thanks to:

  • https://github.com/thesofproject/sof-test/pull/1188
[     6038167.207981] (           8.229166) c0 pipe         11.21 ......./pipeline-graph.c:357  INFO pipe reset
[     6038211.791313] (          44.583332) c0 pipe         5.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[     6038221.739229] (           9.947916) c0 pipe         5.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[     6038634.499630] (         412.760406) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
--
[    14446145.207212] (           8.020833) c0 pipe         9.16  ......./pipeline-graph.c:357  INFO pipe reset
[    14446189.582210] (          44.375000) c0 pipe         5.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[    14446199.686376] (          10.104166) c0 pipe         5.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[    14446643.175942] (         443.489563) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
--
[    22854418.831430] (           8.125000) c0 pipe         11.21 ......./pipeline-graph.c:357  INFO pipe reset
[    22854462.946012] (          44.114582) c0 pipe         5.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[    22854472.998095] (          10.052083) c0 pipe         5.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[    22854881.070995] (         408.072906) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000

Example from daily test run: 41199?model=APL_UP2_NOCODEC-ipc3&testcase=multiple-pipeline-all-50

Same error message in:

  • 41199?model=APL_UP2_NOCODEC-ipc3&testcase=multiple-pause-resume-25
  • 41199?model=ADLP_RVP_NOCODEC-ipc3&testcase=multiple-pipeline-playback-50
  • 41199?model=ADLP_RVP_NOCODEC-ipc3&testcase=multiple-pause-resume-25
  • 41199?model=ADLP_RVP_NOCODEC-ipc3&testcase=multiple-pipeline-all-50

To Reproduce

TPLG=/lib/firmware/intel/sof-tplg/sof-adl-nocodec.tplg MODEL=ADLP_RVP_NOCODEC-ipc3 SOF_TEST_INTERVAL=5 ~/sof-test/test-case/multiple-pipeline.sh -f p -l 50
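
A quick way to check a run for the failure signature afterwards (a minimal sketch; the log location is an assumption based on the sof-test grep output quoted later in this thread and may differ on your setup):

# Scan the captured sof-logger trace for the pipeline reset failure.
# The logs path below is assumed from the sof-test output quoted further
# down in this issue; adjust it to your sof-test configuration.
grep -B 2 -A 1 'pipeline_comp_reset(): failed to recover' \
    ~/sof-test/logs/multiple-pipeline/*/slogger.txt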

Reproduction Rate

100%?

Environment

Start Time: 2024-05-16 16:26:28 UTC
End Time: 2024-05-16 19:23:14 UTC
Linux Build Result: 1847
Linux Branch: topic/sof-dev
Linux Commit: 055cba4766af
KConfig Branch: master
KConfig Commit: 8fee06f8fd8a

SOF Branch: stable-v2.2
SOF Commit: 64416ba51354

Screenshots or console output

[     6309919.280516] (           5.989583) c0 memory                      src/lib/alloc.c:578  INFO  128 Bytes blocks ID:1 base 0xbe059000
[     6309925.165933] (           5.885417) c0 memory                      src/lib/alloc.c:582  INFO    Number of Blocks: total 96 used 14 free 82
[     6309931.155516] (           5.989583) c0 memory                      src/lib/alloc.c:578  INFO  256 Bytes blocks ID:2 base 0xbe05c000
[     6309936.988849] (           5.833333) c0 memory                      src/lib/alloc.c:582  INFO    Number of Blocks: total 112 used 23 free 89
[     6309942.978432] (           5.989583) c0 memory                      src/lib/alloc.c:578  INFO  512 Bytes blocks ID:3 base 0xbe063000
[     6309948.811765] (           5.833333) c0 memory                      src/lib/alloc.c:582  INFO    Number of Blocks: total 12 used 4 free 8
[     6355311.570379] (       45362.757812) c0 dma-trace                 src/trace/trace.c:141  INFO Suppressed 3 similar messages:  %d Bytes blocks ID:%d base 0x%x
[     6355317.143296] (           5.572917) c0 dma-trace                 src/trace/trace.c:141  INFO Suppressed 3 similar messages:    Number of Blocks: total %d used %d free %d
[     6355321.830796] (           4.687500) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
[     6355329.434962] (           7.604167) c0 pipe         9.16  ....../pipeline-stream.c:270  INFO pipe trigger cmd 0
[     6356034.903684] (         705.468750) c0 ll-schedule        ./schedule/ll_schedule.c:159  INFO task complete 0xbe059380 pipe-task 
[     6356041.101600] (           6.197917) c0 ll-schedule        ./schedule/ll_schedule.c:162  INFO num_tasks 4 total_num_tasks 4
[     6356063.132849] (          22.031250) c0 host         11.17 ....../pipeline-stream.c:196  INFO perf comp_copy peak plat 226 cpu 224
[     6356284.799507] (         221.666656) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60030000
[     6356292.716174] (           7.916667) c0 pipe         9.16  ......./pipeline-graph.c:357  INFO pipe reset
[     6356336.309922] (          43.593750) c0 pipe         5.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[     6356345.997422] (           9.687500) c0 pipe         5.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[     6356610.893244] (         264.895813) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
[     6356619.486994] (           8.593750) c0 pipe         11.21 ....../pipeline-stream.c:270  INFO pipe trigger cmd 0
[     6357031.570311] (         412.083313) c0 ssp-dai      1.5   /drivers/intel/ssp/ssp.c:1140 INFO ssp_trigger() cmd 0
[     6357063.653643] (          32.083332) c0 ssp-dai      1.5   /drivers/intel/ssp/ssp.c:1108 INFO ssp_stop(), TX stop
[     6357071.205726] (           7.552083) c0 dw-dma                 src/drivers/dw/dma.c:413  INFO dw_dma_stop(): dma 0 channel 0 stop
[     6357085.111975] (          13.906249) c0 ll-schedule        ./schedule/ll_schedule.c:159  INFO task complete 0xbe059680 pipe-task 
[     6357090.736975] (           5.625000) c0 ll-schedule        ./schedule/ll_schedule.c:162  INFO num_tasks 3 total_num_tasks 3
[     6357099.695308] (           8.958333) c0 ll-schedule        ./schedule/ll_schedule.c:159  INFO task complete 0xbe059400 pipe-task 
[     6357105.372391] (           5.677083) c0 ll-schedule        ./schedule/ll_schedule.c:162  INFO num_tasks 2 total_num_tasks 2
[     6357324.591133] (         219.218735) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60030000
[     6357332.247382] (           7.656250) c0 pipe         11.21 ......./pipeline-graph.c:357  INFO pipe reset
[     6357380.684880] (          48.437496) c0 dai          5.2            src/audio/dai.c:671  INFO dai_reset()
[     6357391.257797] (          10.572916) c0 dw-dma                 src/drivers/dw/dma.c:260  INFO dw_dma_channel_put(): dma 0 channel 0 put
[     6357729.799450] (         338.541656) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x30020000

marc-hb avatar May 16 '24 22:05 marc-hb

The exact same multi-pipeline tests failed in exactly the same way, in the same configurations, in the next day's daily test run (test run 41235).

EDIT: and again in 41277. 100% deterministic.

marc-hb avatar May 17 '24 21:05 marc-hb

@kv2019i In our daily CI, this issue is still reproducible on the stable-v2.2 (IPC3) branch but is not seen on the main branch (IPC4). Does PR https://github.com/thesofproject/sof/pull/9136 need to land on stable-v2.2 as well?
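
(For reference, a backport to a stable branch is usually just a cherry-pick of the merged fix; a minimal sketch, where backport-9136 and <commit-sha> are placeholders, not actual branch or commit names:)

# Hypothetical backport flow; <commit-sha> stands in for the fix merged via PR 9136.
git fetch origin
git checkout -b backport-9136 origin/stable-v2.2
git cherry-pick -x <commit-sha>   # -x records the original commit id in the message
# then push the branch and open a PR against stable-v2.2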

ssavati avatar Jul 08 '24 07:07 ssavati

SOF error logs

2024-07-07 18:26:51 UTC [REMOTE_INFO] nlines=80166 /home/ubuntu/sof-test/logs/multiple-pipeline/2024-07-07-18:21:01-27918/slogger.txt
+ grep -B 2 -A 1 -i --word-regexp -e ERR -e ERROR -e '' -e OSError /home/ubuntu/sof-test/logs/multiple-pipeline/2024-07-07-18:21:01-27918/slogger.txt
[    11247799.292636] (          22.239582) c2 pipe         7.40  ......./pipeline-graph.c:357  INFO pipe reset
[    11247895.750965] (          96.458328) c2 pipe         1.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[    11247921.375964] (          25.624998) c2 pipe         1.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[    11248349.240531] (         427.864563) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
--
[    66761752.555460] (          22.239582) c2 pipe         7.40  ......./pipeline-graph.c:357  INFO pipe reset
[    66761848.961706] (          96.406242) c2 pipe         1.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[    66761874.638788] (          25.677082) c2 pipe         1.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[    66762136.513778] (         261.875000) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
--
[    80619819.504790] (          22.239582) c2 pipe         7.40  ......./pipeline-graph.c:357  INFO pipe reset
[    80619916.744369] (          97.239578) c2 pipe         1.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[    80619942.421451] (          25.677082) c2 pipe         1.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[    80620367.525601] (         425.104156) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
--
[   108391922.724142] (          21.927082) c2 pipe         7.40  ......./pipeline-graph.c:357  INFO pipe reset
[   108392019.494971] (          96.770828) c2 pipe         1.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[   108392045.328304] (          25.833332) c2 pipe         1.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[   108392541.109534] (         495.781219) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
--
[   275067793.757288] (          22.083332) c2 pipe         7.40  ......./pipeline-graph.c:357  INFO pipe reset
[   275067890.319784] (          96.562492) c2 pipe         1.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[   275067916.101033] (          25.781248) c2 pipe         1.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[   275068354.330182] (         438.229156) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000
--
[   344515716.883092] (          22.031250) c2 pipe         7.40  ......./pipeline-graph.c:357  INFO pipe reset
[   344515813.810172] (          96.927078) c2 pipe         1.3   ......./pipeline-graph.c:333  WARN pipeline_comp_reset(): component is in active state, try to stop it
[   344515839.591421] (          25.781248) c2 pipe         1.3   ......./pipeline-graph.c:336  ERROR pipeline_comp_reset(): failed to recover
[   344516252.455987] (         412.864563) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60050000

ssavati avatar Jul 08 '24 07:07 ssavati

With #9139, this is still one of the two failures hitting several of our models every day in our stable 2.2 IPC3 test plan. Like clockwork.

#9135 seems to affect only NOCODEC configurations, whereas #9139 is not limited to them.

marc-hb avatar Jul 26 '24 20:07 marc-hb

Impact is only on the no_codec topology, so P3 is the correct priority.

lgirdwood avatar Aug 05 '24 13:08 lgirdwood

Maybe we can "reduce" these two NOCODEC topologies a little bit to silence the constant and distracting failure noise?

marc-hb avatar Aug 13 '24 17:08 marc-hb

> Maybe we can "reduce" these two NOCODEC topologies a little bit to silence the constant and distracting failure noise?

Yes - for stable 2.2 we should only be testing kernel updates on real user configurations. Let's disable, not delete, nocodec, as it could still be useful as a means to debug.

lgirdwood avatar Aug 14 '24 14:08 lgirdwood

SKIP submitted:

  • https://github.com/thesofproject/sof-test/pull/1229
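
(Purely illustrative sketch of what such a SKIP could look like in a shell test case; the variable name, pattern, and exit-code convention here are assumptions, not the actual change in that PR:)

# Hypothetical example only: bail out early on NOCODEC topologies.
case "$TPLG" in
    *nocodec*)
        echo "SKIP: NOCODEC topology excluded from the stable-v2.2 test plan"
        exit 2   # assumed "skipped" exit code; the real sof-test convention may differ
        ;;
esac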

marc-hb avatar Aug 15 '24 03:08 marc-hb

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

github-actions[bot] avatar Jun 19 '25 03:06 github-actions[bot]