[BUG] Xtensa ostest stopped working when SMP enabled

Open eren-terzioglu opened this issue 7 months ago • 2 comments

Description / Steps to reproduce the issue

Xtensa devices with SMP support enabled started to fail after https://github.com/apache/nuttx/pull/16194. To reproduce the error, build and flash with these commands:

make distclean &&
./tools/configure.sh esp32s3-devkit:smp &&
make -j &&
make download ESPTOOL_PORT=/dev/ttyUSB0 ESPTOOL_BAUD=921600 ESPTOOL_BINDIR=../esp-bins/

After flashing, I ran this command in nsh:

nsh> ostest
...
user_main: nested signal handler test
signest_test: Starting signal waiter task at priority 101
signest_test: Started waiter_main pid=5
waiter_main: Waiter started
signest_test: Star[CPU0] xtensa_user_panic: User Exception: EXCCAUSE=0000 task: ostest
[CPU0] dump_assert_info: Current Version: NuttX  10.4.0 185cb70eda-dirty May 13 2025 13:12:53 xtensa
[CPU0] dump_assert_info: Assertion failed user panic: at file: :0 task(CPU0): ostest process: ostest 0x42024104
[CPU0] up_dump_register:    PC: 4201f760    PS: 00060a32
[CPU0] up_dump_register:    A0: 8201f7b4    A1: 3fc92bd0    A2: 00000000    A3: 3fc92c20
[CPU0] up_dump_register:    A4: 00000001    A5: 00060a20    A6: 00000000    A7: 00000000
[CPU0] up_dump_register:    A8: 00000001    A9: 3fc92bb0   A10: 3fc8aa74   A11: 00060a20
[CPU0] up_dump_register:   A12: 3fc92d28   A13: 00000000   A14: 00000000   A15: 00060a20
[CPU0] up_dump_register:   SAR: 00000020 CAUSE: 00000000 VADDR: 00000000
[CPU0] up_dump_register:  LBEG: 40056f5c  LEND: 40056f72  LCNT: ffffffff
[CPU0] dump_stackinfo: User Stack:
[CPU0] dump_stackinfo:   base: 0x3fc90d70
[CPU0] dump_stackinfo:   size: 00008112
[CPU0] dump_stackinfo:     sp: 0x3fc92bd0
[CPU0] stack_dump: 0x3fc92bb0: 00060a20 00000003 42025b01 42026398 820245a9 3fc92c00 00000000 3fc92c20
[CPU0] stack_dump: 0x3fc92bd0: 0000002b 36d61600 ffffffff 00000005 3f04000a 00000000 00000000 00000000
[CPU0] stack_dump: 0x3fc92bf0: 8202474a 3fc92c20 00000002 0000000a 00000000 3fc92c10 00000004 00000000
[CPU0] stack_dump: 0x3fc92c10: 820241cf 3fc92c50 3fc8c6d0 00000009 820241cf 3fc92c50 3fc8c6d0 00000009
[CPU0] stack_dump: 0x3fc92c30: 42024388 00000000 00000000 0005c108 82013ec4 3fc92ca0 00000000 3fc90d44
[CPU0] stack_dump: 0x3fc92c50: 00010066 00000000 00000000 00002000 00000066 3fc92ca0 00000000 3fc90d44
[CPU0] stack_dump: 0x3fc92c70: 0000003f 0000000a 000054d0 00000000 3fc8c678 3fc8c684 3fc8c680 0000000a
[CPU0] stack_dump: 0x3fc92c90: 82012078 3fc92ce0 42024104 00000005 00061470 00000001 00000020 0005c108
[CPU0] stack_dump: 0x3fc92cb0: 00005368 0005c108 000054d0 00000000 00000004 3fc8c658 3c043267 3fc90d5e
[CPU0] stack_dump: 0x3fc92cd0: 00000000 3fc92d00 00000000 42024104 3fc90d30 00000000 3fc8c191 00000001
[CPU0] stack_dump: 0x3fc92cf0: 00000000 3fc92d20 00000000 00000000 00000000 00000000 00000000 00000000
[CPU0] stack_dump: 0x3fc92d10: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[CPU0] dump_fatal_info: Dump CPU1: PAUSED
[CPU0] up_dump_register:    PC: 420265a6    PS: 00060420
[CPU0] up_dump_register:    A0: 8201123e    A1: 3fc8e740    A2: 00000000    A3: 3fc89dd0
[CPU0] up_dump_register:    A4: 3fc89f28    A5: 00060020    A6: 3fc8ae48    A7: 00000001
[CPU0] up_dump_register:    A8: 80377dc0    A9: 3fc8e720   A10: 00060020   A11: 3fc8e5d0
[CPU0] up_dump_register:   A12: 00000000   A13: 3fc8ad9c   A14: 3fc89dd0   A15: 00000001
[CPU0] up_dump_register:   SAR: 0000001f CAUSE: 3fc8e780 VADDR: 00000000
[CPU0] up_dump_register:  LBEG: 00000000  LEND: 00000000  LCNT: 00000000
[CPU0] dump_stackinfo: User Stack:
[CPU0] dump_stackinfo:   base: 0x3fc8db90
[CPU0] dump_stackinfo:   size: 00003056
[CPU0] dump_stackinfo:     sp: 0x3fc8e740
[CPU0] stack_dump: 0x3fc8e720: 00000000 00000000 00000000 00000000 00000000 3fc8e760 00000000 00000000
[CPU0] stack_dump: 0x3fc8e740: 00000000 00000000 00000000 00000000 00000000 3fc8e780 00000000 00000000
[CPU0] stack_dump: 0x3fc8e760: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[CPU0] stack_dump: 0x3fc8e780: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[CPU0] dump_tasks:    PID GROUP   CPU PRI POLICY   TYPE    NPX STATE   EVENT      SIGMASK          STACKBASE  STACKSIZE      USED   FILLED    COMMAND
[CPU0] dump_tasks:   ----   ---     0 --- -------- ------- --- ------- ---------- ---------------- 0x3fc8aef4      2048      1100    53.7%    irq
[CPU0] dump_tasks:   ----   ---     1 --- -------- ------- --- ------- ---------- ---------------- 0x3fc8b6f4      2048       204     9.9%    irq
[CPU0] dump_task:       0     0     0   0 FIFO     Kthread -   Assigned           0000000000000000 0x3fc8c700      3056       576    18.8%    CPU0 IDLE
[CPU0] dump_task:       1     0     1   0 FIFO     Kthread -   Running            0000000000000000 0x3fc8db90      3056       864    28.2%    CPU1 IDLE
[CPU0] dump_task:       2     2     0 100 RR       Task    -   Waiting Semaphore  0000000000000000 0x3fc8ed18      3016      1104    36.6%    nsh_main
[CPU0] dump_task:       3     3     0 100 RR       Task    -   Waiting Semaphore  0000000000000000 0x3fc90058      2008      1456    72.5%    ostest
[CPU0] dump_task:       4     4     0 100 RR       Task    -   Running            0000000000000000 0x3fc90d70      8112      1520    18.7%    ostest Arg1 Arg2 Arg3 Arg4
[CPU0] dump_task:       5     4     1 101 RR       pthread -   Waiting Semaphore  0000000000000000 0x3fc92e18      8168      1120    13.7%    ostest 0x420243c8 0
[CPU0] dump_task:       6     4     0 102 RR       pthread -   Waiting Semaphore  0000000000000000 0x3fc94f00      8176       960    11.7%    ostest 0x42024388 0

On which OS does this issue occur?

[OS: Linux]

What is the version of your OS?

Ubuntu 22.04

NuttX Version

Master

Issue Architecture

[Arch: xtensa]

Issue Area

[Area: OS Components]

Host information

No response

Verification

  • [x] I have verified before submitting the report.

eren-terzioglu • May 13 '25 11:05

@eren-terzioglu the sem/mutex size changed in this patch. Did you update the porting layer to match your prebuilt library?
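A size mismatch of this kind can be caught at build time rather than at run time. The following is only a minimal sketch, not code from NuttX or the Espressif porting layer: `PORT_SEM_STORAGE_SIZE`, `port_sem_storage_t`, and `port_sem_fits()` are hypothetical names, and the value 64 is an arbitrary assumption standing in for whatever size a prebuilt library was compiled against.

```c
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical: the number of bytes a prebuilt library reserves for one
 * semaphore object. The value 64 is an assumption for illustration only.
 */
#define PORT_SEM_STORAGE_SIZE 64

/* Hypothetical opaque storage type handed to the prebuilt library. */
typedef struct
{
  unsigned char storage[PORT_SEM_STORAGE_SIZE];
} port_sem_storage_t;

/* Fail the build if the OS semaphore grew beyond the reserved storage. */
_Static_assert(sizeof(sem_t) <= sizeof(port_sem_storage_t),
               "sem_t no longer fits in the storage the library expects");

/* Run-time form of the same check; returns 1 when sem_t fits. */
int port_sem_fits(void)
{
  return sizeof(sem_t) <= sizeof(port_sem_storage_t);
}
```

With a check like this, a change in `sem_t` size would stop the build instead of silently corrupting memory next to the semaphore.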

xiaoxiang781216 • May 13 '25 13:05

Hi all, this is still an issue.

I've been debugging it (on commit b6f2729) but have found no answers so far. One thing I noticed is that enabling CONFIG_DEBUG_SCHED_INFO makes ostest pass; otherwise it gets stuck on the signal handler test.

The fact that logging changes the behavior could indicate a race condition, right? We are in an SMP environment, after all.

Here's the output of ostest with CONFIG_DEBUG_SCHED_INFO enabled:

user_main: signal handler test
sighand_test: Initializing semaphore to 0
sighand_test: Starting waiter task
waiter_main: Waiter started
sighand_test: Started waiter_main pid=54
waiter_main: Unmasking signal 32
waiter_main: Registering signal handler
waiter_main: oact.sigaction=0 oact.sa_flags=0 oact.sa_mask=0000000000000000
waiter_main: Waiting on semaphore
sighand_test: Signaling pid=54 with signo=32 sigvalue=42
[CPU0] nxsig_queue: pid=0x00000036 signo=32 value=42
[CPU0] nxsig_tcbdispatch: TCB=0x3ffe3138 pid=54 signo=32 code=1 value=42 masked=NO
[CPU0] up_schedule_sigaction: tcb=0x3ffe3138, rtcb=0x3ffe0ca8 current_regs=0
[CPU1] xtensa_sig_deliver: rtcb=0x3ffe3138 sigpendactionq.head=0x3ffb14f0
[CPU1] nxsig_deliver: Deliver signal 32 to PID 54
[CPU1] xtensa_sig_deliver: Resuming
[CPU1] nx_pthread_exit: exit_value=0
[CPU1] pthread_completejoin: pid=54 exit_value=0
[CPU1] nxtask_exit: ostest pid=54,TCB=0x3ffe3138
waiter_main: sem_wait() successfully interrupted by signal
waiter_main: done
sighand_test: done

And without it:

user_main: signal handler test
sighand_test: Initializing semaphore to 0
sighand_test: Starting waiter task
sighand_test: Started waiter_main pid=54
waiter_main: Waiter started
waiter_main: Unmasking signal 32
waiter_main: Registering signal handler
waiter_main: oact.sigaction=0 oact.sa_flags=0 oact.sa_mask=0000000000000000
waiter_main: Waiting on semaphore
sighand_test: Signaling pid=54 with signo=32 sigvalue=42
sighand_test: ERROR waiter task did not exit
[CPU0] dump_assert_info: Current Version: NuttX  10.4.0 b6f2729730-dirty Jun 13 2025 15:43:52 xtensa
[CPU0] dump_assert_info: Assertion failed (_Bool)0: at file: sighand.c:318 task(CPU0): ostest process: ostest 0x400f79bc

fdcavalcanti • Jun 13 '25 18:06