linux
linux copied to clipboard
[BUG] delayed mtrace collection
Some mtrace in the test case are expanded to next test case, this is spotted in daily test.
for example, in internal daily test 28474, on any test config(model), the timestamp in mtrace for check-playback-all-formats.sh test case is started from 0, but for its next test case (check-capture-all-formats.sh), the timestamp is the continuation of the previous test case, which is unexpected.
This issue is also found in PR tests.
End of mtrace for check-playback-all-formats.sh on MTLP_RVP_SDW
[ 9.588290] <inf> dma: dma_put: dma_put(), dma = 0x400ed670, sref = 0
[ 9.588923] <inf> ipc: ipc_cmd: rx : 0x12010000|0x0
[ 9.589055] <inf> dma: dma_put: dma_put(), dma = 0x400ed580, sref = Terminated
Start of mtrace for check-capture-all-formats.sh on MTLP_RVP_SDW
01b0U
[ 9.583648] <inf> ll_schedule: zephyr_ll_task_done: num_tasks 1 total_num_tasks 1
[ 9.583660] <inf> ll_schedule: zephyr_domain_unregister: zephyr_domain_unregister domain->type 1 domain->clk 3
[ 9.586620] <inf> ipc: ipc_cmd: rx : 0x13000002|0x1
[ 9.586636] <inf> pipe: pipeline_trigger: pipe:1 0x0 pipe trigger cmd 0
[ 9.586658] <inf> dma_intel_adsp_gpdma: intel_adsp_gpdma_power_off: intel_adsp_gpdma_power_off: dma dma@7c000 power off
[ 9.586753] <inf> pipe: pipeline_trigger: pipe:0 0x0 pipe trigger cmd 0
[ 9.586788] <wrn> copier: copier_comp_trigger: comp:0 0x40000 dai is not ready
[ 9.587661] <inf> ipc: ipc_cmd: rx : 0x46000002|0x3
[ 9.588271] <inf> ipc: ipc_cmd: rx : 0x12000000|0x0
[ 9.588290] <inf> dma: dma_put: dma_put(), dma = 0x400ed670, sref = 0
[ 9.588923] <inf> ipc: ipc_cmd: rx : 0x12010000|0x0
[ 9.589055] <inf> dma: dma_put: dma_put(), dma = 0x400ed580, sref = 0
[ 13.314338] <inf> ipc: ipc_cmd: rx : 0x44000000|0x3060004c
[ 13.314771] <inf> ipc: ipc_cmd: rx : 0x47000000|0x0
[ 0.142388] <inf> ipc: ipc_cmd: rx : 0x44000000|0x31400008
[ 0.142955] <inf> ipc: ipc_cmd: rx : 0x44000000|0x3060004c
cc:
- https://github.com/thesofproject/sof/issues/5521
- https://github.com/thesofproject/sof-test/issues/1046
Thanks @aiChaoSONG good catch.
mtrace-reader.py is stopped and restarted between each test and redirected to a new log file every time. So at first sight this does not look like a bug in sof-test but an issue in the logging system itself.
An unsolved mtrace flushing issue was mentioned in:
- https://github.com/thesofproject/sof/pull/7726
@kv2019i, @plbossart , @lyakh , @RanderWang any insight?
@marc-hb The timestamp is generated by Zephyr OS. If the DSP is not powered off between test cases, the timestamp will continue and will not be reset to zero. So this is expected behaviour.
I suspect there is one/many of following happening:
- DSP crashed in a previous test case and kernel is running with CONFIG_SND_SOC_SOF_DEBUG_RETAIN_DSP_CONTEXT=y so DSP is not powered off anymore (until next system reboot)
- DSP does not reach runtime-pm-suspended state between test cases
DSP crashed in a previous test case and kernel is running with >CONFIG_SND_SOC_SOF_DEBUG_RETAIN_DSP_CONTEXT=y so DSP is not powered off anymore
It should not be the case, because this is also see in PASSED test case, just check any test case in our daily test.
DSP does not reach runtime-pm-suspended state between test cases
This is possible, the test interval between test case is 5 seconds, and in the kernel, the time for DSP to enter D3 is also 5s, let me try to make the test interval bigger.
the time for DSP to enter D3 is also 5s, let me try to make the test interval bigger.
I think having the latter always longer than the former is a good idea in CI.
It should not be enforced on personal devices though :-)
I have made the test case interval 10 seconds, which is far enough for DSP to enter D3, however, I still see the issue. check the internal report with ID:28982.
I suspect there is one/many of following happening:
DSP crashed in a previous test case and kernel is running with CONFIG_SND_SOC_SOF_DEBUG_RETAIN_DSP_CONTEXT=y so DSP is not powered off anymore (until next system reboot)
This sounds like a https://github.com/thesofproject/linux issue
DSP does not reach runtime-pm-suspended state between test cases
This sounds like a https://github.com/thesofproject/sof issue
Either way this is not a sof-test issue. Transferring in 5..., 4..., 3...
I have made the test case interval 10 seconds, which is far enough for DSP to enter D3, however, I still see the issue.
Maybe because of sof_pci_debug=0x1?
(#1087)