zephyr test: samples/boards/mec15xxevb_assy6853/power_management failed after commit 5f60164a0fc

Describe the bug test case samples/boards/mec15xxevb_assy6853/power_management failed after commit 5f60164a0fc

Please also mention any information which could help others to understand the problem you're facing:

What target platform are you using? mec15xxevb_assy6853
What have you tried to diagnose or workaround this issue? git bisect and found first bad commit is 5f60164a0fc

To Reproduce Steps to reproduce the behavior:

west build -p always -b mec15xxevb_assy6853 samples/boards/mec15xxevb_assy6853/power_management/
west flash

Expected behavior The test should pass

Impact 1 test case fail in daily test

Logs and console output

[00:00:00.006,927] <wrn> pwrmgmt_test: PM single-thread test started for cycles: 5, logging: 0
[00:00:00.006,958] <inf> pwrmgmt_test: About to enter light sleep
[00:00:01.508,941] <inf> pwrmgmt_test: About to enter deep Sleep
[00:00:02.129,028] <inf> pwrmgmt_test: PM sleep entry latency 0.621 seconds
[00:00:02.129,058] <wrn> pwrmgmt_test: Sleep entry latency is higher than expected
[00:00:04.609,527] <err> os: ***** USAGE FAULT *****
[00:00:04.609,527] <err> os:   Illegal use of the EPSR
[00:00:04.609,558] <err> os: r0/a1:  0x000e9119  r1/a2:  0x00000000  r2/a3:  0x00118e20
[00:00:04.609,588] <err> os: r3/a4:  0xffffffff r12/ip:  0x001186a0 r14/lr:  0x000e5d83
[00:00:04.609,588] <err> os:  xpsr:  0x60000000
[00:00:04.609,619] <err> os: Faulting instruction address (r15/pc): 0x00001080
[00:00:04.609,619] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:00:04.609,649] <err> os: Current thread: 0x118d90 (main)
[00:00:04.665,527] <err> os: Halting system

Environment (please complete the following information):

OS: (e.g. Linux, MacOS, Windows): Ubuntu 20.04
Toolchain (e.g Zephyr SDK, ...) 0.14.1
Commit SHA or Version used 5f60164a0fc

Jul 19 '22 07:07 hunterhu3000

cc @nordic-krch

Jul 19 '22 16:07 mmahadevan108

@nordic-krch It seems commit 5f60164a0fc36484b6 causes this problem. Could you please help to check?

Jul 28 '22 16:07 hunterhu3000

The commit https://github.com/zephyrproject-rtos/zephyr/commit/5f60164a0fc36484b60004310320dfd6f3ef0525 think if there is no activated log backend, the timeout will be set to K_FOREVER. If there is at least 1 log backend activated, the value is 50ms. Before this commit, the value is hard code to 50ms and works well. After this commit, it cannot find any activated backend, and set the value to K_FOREVER and cause the problem. Actually, the UART log backend is just enabled and activated, but the log_backend_uart_api has not enabled the function of "is_ready()". So the system always think the UART backend is not activated. And then set the timeout to the wrong value K_FOREVER

Aug 04 '22 16:08 hunterhu3000

if there is no activated log backend, the timeout will be set to K_FOREVER. If there is at least 1 log backend activated, the value is 50ms.

I am not sure what is the reason for this bug but this statement above is wrong. K_FOREVER is set when all backends which had autostart flag are ready and there is no reason for periodic polling for backend readiness. Also if is_ready is not implemented then log_backend_is_ready returns 0 which means that backend is ready. Thus there should be no need for implementation of the is_ready api if backend initialization is synchronous.

Aug 10 '22 05:08 nordic-krch

So, when the uart backend is enabled and activated, the return value of z_log_init() should be 0, right? It looks like the mask will always be 0, when will it be 1 or non-zero? @nordic-krch

Aug 10 '22 06:08 hunterhu3000

https://github.com/zephyrproject-rtos/zephyr/commit/5f60164a0fc36484b60004310320dfd6f3ef0525 is just a trigger of the problem. The coredump of this test case is:

Remote debugging using 127.0.0.1:1234
0x0011ed78 in z_idle_stacks ()
(gdb) bt
#0  0x0011ed78 in z_idle_stacks ()
#1  0x000e2ab8 in arch_swap (key=<optimized out>) at /home/ztest/zephyrproject/zephyr/arch/arm/core/aarch32/swap.c:58
#2  0x0011ec40 in z_main_stack ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info thread
  Id   Target Id         Frame
* 1    Thread <main>     0x0011ed78 in z_idle_stacks ()

It shows there is a problem in z_idle_stacks Enlarge the CONFIG_IDLE_STACK_SIZE, the problem is fixed.

Aug 14 '22 09:08 hunterhu3000

zephyr zephyr copied to clipboard

test: samples/boards/mec15xxevb_assy6853/power_management failed after commit 5f60164a0fc

zephyr
zephyr copied to clipboard