
sched/signal: Add support to disable signals

Open wangchdo opened this issue 1 month ago • 44 comments

Summary

Depends-on: https://github.com/apache/nuttx-apps/pull/3217 (apps updated to support disabled signals).

Currently, NuttX is being adopted increasingly on small embedded systems with tight resource constraints. Many of these systems do not require signal handling functionality or POSIX-related APIs.

With #17200 introducing scheduled sleep support, and #17204 replacing all signal-based sleep implementations in drivers and the filesystem with scheduled sleep, the dependency on signals has been significantly reduced.

Therefore, it is now a good time to provide an option for users to disable signal support entirely, allowing them to reduce code size and memory usage when signal functionality is not needed.

Impact

Add configuration support that allows users to disable signal functionality.

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

Testing

ostest is heavily dependent on POSIX APIs—including sleep, usleep, kill, pkill, and pthread. Therefore, to disable signal support, ostest must also be disabled until it is refactored to reduce its reliance on these APIs.

I have tested the signal-disable option on the fvp-armv8r-aarch32 board with ostest disabled, and the build succeeds and runs correctly.


NuttShell (NSH)
nsh> [ 0] Idle_Task: nx_start: CPU0: Beginning Idle Loop
nsh> 
nsh> uname -a
NuttX 0.0.0 cce68d3ead-dirty Nov 20 2025 14:43:39 arm fvp-armv8r-aarch32
nsh> 
nsh> 
nsh> help
help usage:  help [-v] [<cmd>]

    .           cd          exec        ls          pwd         truncate    
    [           cp          exit        mkdir       rm          uname       
    ?           cmp         expr        mkrd        rmdir       umount      
    alias       dirname     false       mount       set         unset       
    unalias     df          fdinfo      mv          source      uptime      
    basename    dmesg       free        pidof       test        xd          
    break       echo        help        printf      time        
    cat         env         hexdump     ps          true        

Builtin Apps:
    dd       nsh      sh       hello    
nsh> 
nsh> 
nsh> hello
[ 2] nsh_main: task_spawn: name=hello entry=0x2f554 file_actions=0x20009a88 attr=0x20009a8c argv=0x20009b48
[ 2] nsh_main: spawn_execattrs: Setting policy=2 priority=100 for pid=3
[ 2] nsh_main: nxtask_activate: hello pid=3,TCB=0x2000a0a0
Hello, World!!
[ 3] hello: nxtask_exit: hello pid=3,TCB=0x2000a0a0
nsh> 
nsh> ps
  TID   PID  PPID PRI POLICY   TYPE    NPX STATE    EVENT     SIGMASK            STACK    USED FILLED COMMAND
    0     0     0   0 FIFO     Kthread   - Ready                       0008176 0000888  10.8%  Idle_Task
    1     0     0 192 RR       Kthread   - Waiting  Semaphore          0008128 0000432   5.3%  hpwork 0x200002e0 0x20000330
    2     2     0 100 RR       Task      - Running                     0008152 0001840  22.5%  nsh_main
nsh> 

wangchdo avatar Nov 20 '25 06:11 wangchdo

you can't disable signals under POSIX. As much as I like this change and minimizing the footprint of NuttX, this change is against INVIOLABLES.md and it certainly can't be merged without more discussion in the community.

OK. However, while disabling signals isn’t a major change, it can reduce NuttX’s footprint a lot. This is very helpful for many use cases that do not rely on NuttX’s POSIX features.

wangchdo avatar Nov 20 '25 08:11 wangchdo

you can't disable signals under POSIX. As much as I like this change and minimizing the footprint of NuttX, this change is against INVIOLABLES.md and it certainly can't be merged without more discussion in the community.

@raiden00pl

I like this PR because our internal branch has already decoupled the file system and signal functionalities, which is very useful for reducing memory consumption and improving performance.

DISABLE_SIGNALS in previous discussions: https://github.com/apache/nuttx/issues/11390

As described in this PR, NuttX previously could not decouple its sleep implementation from signals, which led @gregory-nutt to remove support for DISABLE_SIGNAL directly. After #17200 and #17204, signals are no longer mandatory for the system. In some lightweight POSIX scenarios, disabling them avoids signal checks in the kernel scheduling flow, which plays a crucial role in improving performance.

anchao avatar Nov 20 '25 09:11 anchao

@wangchdo

Some SIG APIs can use empty definitions, which reduces the number of macro definitions in the code.

Another key change is that the usleep/sleep function in the kernel needs to be replaced with nxsched_usleep/sleep.
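The empty-definition idea can be sketched as follows. This is a minimal, host-compilable illustration and not the code from the original comment: `CONFIG_DISABLE_SIGNALS` is the option discussed in this PR, while `demo_tcb_s`, `demo_sig_cleanup()` and `demo_task_exit()` are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: defined here only to exercise the "signals disabled" branch. */
#define CONFIG_DISABLE_SIGNALS 1

struct demo_tcb_s
{
  int pending; /* Hypothetical count of queued signals */
  int exited;  /* Set once the exit path has run */
};

#ifndef CONFIG_DISABLE_SIGNALS
/* Normal build: really drop the queued signals. */
void demo_sig_cleanup(struct demo_tcb_s *tcb)
{
  tcb->pending = 0;
}
#else
/* Empty definition: callers need no #ifdef of their own. */
#  define demo_sig_cleanup(tcb) ((void)(tcb))
#endif

/* Exit path written once, with no conditional compilation inside it. */
int demo_task_exit(struct demo_tcb_s *tcb)
{
  assert(tcb != NULL);
  demo_sig_cleanup(tcb);
  tcb->exited = 1;
  return tcb->exited;
}
```

With the stub in place, the conditional lives in one header-like spot instead of at every call site.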


anchao avatar Nov 20 '25 09:11 anchao

@wangchdo

Some SIG APIs can use empty definitions, which reduces the number of macro definitions in the code.

Another key change is that the usleep/sleep function in the kernel needs to be replaced with nxsched_usleep/sleep.

In fact, I am planning to do so.


wangchdo avatar Nov 20 '25 09:11 wangchdo

@anchao

I understand the benefits of disabling signals and I like these, but as I mentioned, it's not POSIX-compliant. PSE51 also requires signal support, even if they make no practical sense in this case.

On the one hand, we want to remove time_t as 32-bit (https://github.com/apache/nuttx/pull/14460), which is POSIX-compliant, and on the other, we want to disable signals, which is POSIX-incompatible. I see some inconsistency here. It would be good to establish some rules about when we can accept breaking POSIX and when we can't. I think we have a condition for this in INVIOLABLES.md, but it is not precise:

https://github.com/apache/nuttx/blob/fd70e5f947cb38f83503cc57b897e9cfb8368d9f/INVIOLABLES.md?plain=1#L29-L30

In this case voting on the mailing list would be a good idea.

BTW, do you have any comparisons of footprint and performance that could be presented as an argument for this change?

raiden00pl avatar Nov 20 '25 09:11 raiden00pl

@anchao

I understand the benefits of disabling signals and I like these, but as I mentioned, it's not POSIX-compliant. PSE51 also requires signal support, even if they make no practical sense in this case.

On the one hand, we want to remove time_t as 32-bit (#14460), which is POSIX-compliant, and on the other, we want to disable signals, which is POSIX-incompatible. I see some inconsistency here. It would be good to establish some rules about when we can accept breaking POSIX and when we can't. I think we have a condition for this in INVIOLABLES.md, but it is not precise:

https://github.com/apache/nuttx/blob/fd70e5f947cb38f83503cc57b897e9cfb8368d9f/INVIOLABLES.md?plain=1#L29-L30

Agree with you. You previously discussed how NuttX provides switches for other kernel features. These capabilities allow developers to achieve better memory footprint and performance while adhering to lightweight POSIX requirements, which I think is worthwhile: https://github.com/apache/nuttx/blob/master/sched/Kconfig#L6-L47

In this case voting on the mailing list would be a good idea.

Yes, this requires a vote, and especially requires hearing @gregory-nutt's voice.

BTW, do you have any comparisons of footprint and performance that could be presented as an argument for this change?

I don't have detailed figures; here are some data from my memory:

  1. Data/bss: Each thread will reduce overhead by ~80 bytes (tcb_s)
  2. Text: Code size will be reduced by 4-7%, (mini shell configuration)
  3. Performance: thread create (3~5%), thread exit (3~5%), context switch event checking (3~5%)

anchao avatar Nov 20 '25 09:11 anchao

A quick comparison of results with this PR for my small systems test configurations from here: https://github.com/railab/railab_nuttx_code/tree/master/boards/arm/stm32/nucleo-f302r8-mini/config

  • mini_core1 config:
Memory region         Used Size  Region Size  %age Used
flash:       18296 B        64 KB     27.92%
sram:        1236 B        16 KB      7.54%

Memory region         Used Size  Region Size  %age Used
flash:       17696 B        64 KB     27.00%
sram:         940 B        16 KB      5.74%
  • mini_core3 config:
Memory region         Used Size  Region Size  %age Used
flash:        7492 B        64 KB     11.43%
sram:         928 B        16 KB      5.66%

Memory region         Used Size  Region Size  %age Used
flash:        6804 B        64 KB     10.38%
sram:         632 B        16 KB      3.86%
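The tables above do not label which build is which. Assuming the first table of each pair is the baseline and the second is the build with this PR applied (an assumption, not something the comment states), the savings can be worked out with a small helper. `bytes_saved()` is a hypothetical name used only for this arithmetic.

```c
#include <assert.h>

/* Bytes saved between a baseline build and a build with the PR applied.
 * Assumption: in each pair of tables above, the first one is the baseline.
 */
unsigned int bytes_saved(unsigned int baseline, unsigned int with_pr)
{
  assert(baseline >= with_pr);
  return baseline - with_pr;
}
```

Under that assumption, mini_core1 saves `bytes_saved(18296, 17696)` = 600 B of flash and `bytes_saved(1236, 940)` = 296 B of sram, and mini_core3 saves `bytes_saved(7492, 6804)` = 688 B of flash and `bytes_saved(928, 632)` = 296 B of sram.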

raiden00pl avatar Nov 20 '25 11:11 raiden00pl

you can't disable signals under POSIX. As much as I like this change and minimizing the footprint of NuttX, this change is against INVIOLABLES.md and it certainly can't be merged without more discussion in the community.

I think it should depend on some higher POSIX_PE51 or NOT_POSIX_COMPLIANT config or similar. I think this is good to have the possibility to disable signals and even VFS, as we discussed here: https://github.com/apache/nuttx/issues/11390

acassis avatar Nov 20 '25 11:11 acassis

you can't disable signals under POSIX. As much as I like this change and minimizing the footprint of NuttX, this change is against INVIOLABLES.md and it certainly can't be merged without more discussion in the community.

I think it should depend on some higher POSIX_PE51 or NOT_POSIX_COMPLIANT config or similar. I think this is good to have the possibility to disable signals and even VFS, as we discussed here: #11390

Hi @acassis

Good idea! It seems we already have options to disable certain POSIX features. Do you think we should take all of these into account when introducing the POSIX_PE51 / NOT_POSIX_COMPLIANT (or similar) configuration you mentioned?

Do you think it would be better to propose a separate PR dedicated to introducing the POSIX_PE51 / NOT_POSIX_COMPLIANT (or similar) configuration?

config DISABLE_POSIX_TIMERS
	bool "Disable POSIX timers"
	default DEFAULT_SMALL
	---help---
		Disable support for the entire POSIX timer family
		including timer_create(), timer_gettime(), timer_settime(),
		etc.

		NOTE:  This option will also disable getitimer() and
		setitimer() which are not, strictly speaking, POSIX timers.

config DISABLE_PTHREAD
	bool "Disable pthread support"
	default DEFAULT_SMALL

config DISABLE_MQUEUE
	bool "Disable POSIX message queue support"
	default DEFAULT_SMALL

wangchdo avatar Nov 20 '25 12:11 wangchdo

you can't disable signals under POSIX. As much as I like this change and minimizing the footprint of NuttX, this change is against INVIOLABLES.md and it certainly can't be merged without more discussion in the community.

I think it should depend on some higher POSIX_PE51 or NOT_POSIX_COMPLIANT config or similar. I think this is good to have the possibility to disable signals and even VFS, as we discussed here: #11390

Hi @acassis

Good idea! It seems we already have options to disable certain POSIX features. Do you think we should take all of these into account when introducing the POSIX_PE51 / NOT_POSIX_COMPLIANT (or similar) configuration you mentioned?

Do you think it would be better to propose a separate PR dedicated to introducing the POSIX_PE51 / NOT_POSIX_COMPLIANT (or similar) configuration?

config DISABLE_POSIX_TIMERS
	bool "Disable POSIX timers"
	default DEFAULT_SMALL
	---help---
		Disable support for the entire POSIX timer family
		including timer_create(), timer_gettime(), timer_settime(),
		etc.

		NOTE:  This option will also disable getitimer() and
		setitimer() which are not, strictly speaking, POSIX timers.

config DISABLE_PTHREAD
	bool "Disable pthread support"
	default DEFAULT_SMALL

config DISABLE_MQUEUE
	bool "Disable POSIX message queue support"
	default DEFAULT_SMALL

~~Yes, I think so!~~ Actually we need to understand that some features in a POSIX-compliant system are optional (like POSIX timers, pthreads, etc.) and others are mandatory (like signals). So maybe these features just need to depend on DEFAULT_SMALL, as they currently do.

acassis avatar Nov 20 '25 12:11 acassis

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature in levels:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple, frequently used functions (e.g. wait/sigwait/ppoll)
  3. enable all signal functionality like before
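
The three levels can be sketched roughly as below. All names (`demo_sig_level`, `demo_task`, `demo_deliver()`) are hypothetical and nothing here is taken from the NuttX sources. A real implementation would presumably select the level at build time through Kconfig; a runtime enum is used only so that all three paths can be exercised in one program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical levels mirroring the three-step proposal above. */
enum demo_sig_level
{
  DEMO_SIG_NONE = 0,  /* 1. everything signal-related disabled */
  DEMO_SIG_NOHANDLER, /* 2. wake-ups kept, handlers (callbacks) dropped */
  DEMO_SIG_FULL       /* 3. full signal support */
};

struct demo_task
{
  int waiting; /* Nonzero while blocked in a sigwait-like call */
  int handled; /* Number of handler invocations */
  void (*handler)(struct demo_task *task);
};

/* Example handler that only counts how often it was called. */
void demo_count_handler(struct demo_task *task)
{
  task->handled++;
}

/* Deliver one signal; returns 0 on success, -1 if signals are disabled. */
int demo_deliver(enum demo_sig_level level, struct demo_task *task)
{
  assert(task != NULL);

  if (level == DEMO_SIG_NONE)
    {
      return -1;
    }

  task->waiting = 0; /* The wake-up path is kept at levels 2 and 3 */

  if (level == DEMO_SIG_FULL && task->handler != NULL)
    {
      task->handler(task); /* Handler dispatch exists only at level 3 */
    }

  return 0;
}
```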

xiaoxiang781216 avatar Nov 20 '25 12:11 xiaoxiang781216

POSIX PSE5x configuration from Kconfig is not so easy. Here is a summary of the functionality required by POSIX subprofiles: https://nuttx.apache.org/docs/latest/standards/posix.html

Actually we need to understand that some features in a POSIX-compliant system are optional (like POSIX timers, pthreads, etc.) and others are mandatory (like signals). So maybe these features just need to depend on DEFAULT_SMALL, as they currently do.

@acassis I don't think this is true. Posix timers and pthreads are required even for PSE51 (POSIX_THREADS_BASE, _POSIX_TIMERS)

raiden00pl avatar Nov 20 '25 12:11 raiden00pl

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature in levels:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple, frequently used functions (e.g. wait/sigwait/ppoll)
  3. enable all signal functionality like before

Yes, for example, sleep/usleep can be implemented internally as signal-uninterruptible, and all API semantics can remain the same as before, except that the signal feature is absent.

anchao avatar Nov 20 '25 12:11 anchao

POSIX PSE5x configuration from Kconfig is not so easy. Here is a summary of the functionality required by POSIX subprofiles: https://nuttx.apache.org/docs/latest/standards/posix.html

Actually we need to understand that some features in a POSIX-compliant system are optional (like POSIX timers, pthreads, etc.) and others are mandatory (like signals). So maybe these features just need to depend on DEFAULT_SMALL, as they currently do.

So, I think we are close to getting it. I just don't understand why POSIX_ADA_LANG_SUPPORT is a requirement. It seems some Ada lover was on the committee that defined it.

@acassis I don't think this is true. Posix timers and pthreads are required even for PSE51 (POSIX_THREADS_BASE, _POSIX_TIMERS)

According to https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/threads.h.html some subprofiles can define it optional.

Update: you are right, what is optional is threads.h

acassis avatar Nov 20 '25 13:11 acassis

@acassis the best approach is to see what it's like in Zephyr: https://github.com/zephyrproject-rtos/zephyr/blob/main/lib/posix/Kconfig.profile They have much greater resources than us and have probably done appropriate research :)

raiden00pl avatar Nov 20 '25 13:11 raiden00pl

Thanks @wangchdo I did take a look at the changes, my questions below :-)

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

  • Are you sure these are the only impacted places? Drivers / irq / read / write / etc is not impacted?
  • Are you sure that this change will not leave user in almost bare-metal state?
  • For now I can see that in Kconfig signals can be disabled. I cannot see dependency in sleep/usleep/kill/pkill/pthread/etc on signals in Kconfig.
  • So the program will crash when, for instance, usleep is called but signals are disabled, or it won't build, right?
  • We need to sort the dependencies out before merge and/or we need to clearly know what other functionalities are impacted.
  • Disabling signals in Kconfig should also disable impacted functionalities automatically I see that as most user friendly and safe approach.

ostest is heavily dependent on POSIX APIs—including sleep, usleep, kill, pkill, and pthread. Therefore, to disable signal support, ostest must also be disabled until it is refactored to reduce its reliance on these APIs.

  • How much work is needed to update ostest to work without signals and other impacted functionalities?
  • This is our main tool of runtime confirmation that all works as expected.
  • Having working ostest without signals would reveal potential problems in other areas that need an update.
  • Would it be possible to update ostest before this PR is merged?

Disabling the signal mechanism significantly reduces system footprint and complexity. It can also improve real-time performance during thread creation and context switching.

  • Would it be possible to provide measured Flash/Ram usage with and without signals disabled?
  • Would it be possible to measure runtime code improvement?

This functionality deserves a mention in the documentation :-)

  • https://nuttx.apache.org/docs/latest/reference/user/07_signals.html
  • https://nuttx.apache.org/docs/latest/guides/signaling_sem_priority_inheritance.html
  • https://nuttx.apache.org/docs/latest/standards/posix.html#posix-realtime-signals
  • ...

cederom avatar Nov 21 '25 01:11 cederom

Thanks @wangchdo I did take a look at the changes, my questions below :-)

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

  • Are you sure these are the only impacted places? Drivers / irq / read / write / etc is not impacted?

Yes, please check my analysis below:

This PR introduces an option for users to disable the signal subsystem. The effects of disabling signals can be summarized in two major aspects:

1. Source Files That Will Not Be Compiled

When CONFIG_DISABLE_SIGNALS=y, the following files will be excluded from the build:

libs/libc/signal/*.c
libs/libc/unistd/lib_sleep.c, libs/libc/unistd/lib_usleep.c
sched/signal/*.c
sched/group/group_signal.c

2. Behavioral Changes in Code That Remains Compilable

When CONFIG_DISABLE_SIGNALS=y, the following functional paths are modified or bypassed:

  1. posix_spawnattr_init() no longer initializes an empty signal mask

  /* Empty signal mask */
#ifndef CONFIG_DISABLE_SIGNALS
  sigemptyset(&attr->sigmask);
#endif

  2. group_release() will not release pending signals

  /* Release pending signals */
#ifndef CONFIG_DISABLE_SIGNALS
  nxsig_release(group);
#endif

  3. nx_start() will not initialize the signal facility

  /* Initialize the signal facility (if in link) */
#ifndef CONFIG_DISABLE_SIGNALS
  nxsig_initialize();
#endif

  4. nx_pthread_exit() will skip signal-related cleanup

#ifndef CONFIG_DISABLE_SIGNALS
  sigfillset(&set);
  nxsig_procmask(SIG_SETMASK, &set, NULL);
#endif

  5. nxtask_exithook() will not clean up signal queues

#ifndef CONFIG_DISABLE_SIGNALS
  nxsig_cleanup(tcb); /* Deallocate Signal lists */
#endif

  6. nxtask_reset_task() will not reset signal queues and masks

#ifndef CONFIG_DISABLE_SIGNALS
  nxsig_cleanup(tcb);             /* Deallocate Signal lists */
  sigemptyset(&tcb->sigprocmask); /* Reset sigprocmask */
#endif

  7. nxthread_setup_scheduler() will no longer inherit the parent’s signal mask

#ifndef CONFIG_DISABLE_SIGNALS
      /* exec(), pthread_create(), task_create(), and vfork() all
       * inherit the signal mask of the parent thread.
       */

      tcb->sigprocmask = this_task()->sigprocmask;
#endif

  8. spawn_execattrs() will not apply signal-mask attributes

#ifndef CONFIG_DISABLE_SIGNALS
  if ((attr->flags & POSIX_SPAWN_SETSIGMASK) != 0)
    {
      FAR struct tcb_s *tcb = nxsched_get_tcb(pid);
      if (tcb)
        {
          tcb->sigprocmask = attr->sigmask;
        }
    }
#endif

  9. timer_signotify() will not send signal notifications

#ifndef CONFIG_DISABLE_SIGNALS
#  ifdef CONFIG_SIG_EVTHREAD
  DEBUGVERIFY(nxsig_notification(timer->pt_owner, &timer->pt_event,
                                 SI_TIMER, &timer->pt_work));
#  else
  DEBUGVERIFY(nxsig_notification(timer->pt_owner, &timer->pt_event,
                                 SI_TIMER, NULL));
#  endif
#endif

Impact Analysis

Driver, IRQ, read/write, and similar subsystems are not affected:

  • Their implementations are unchanged.
  • If they relied on code that is no longer compiled, the link stage would fail — which does not occur.
  • If they depend on code paths that are conditionally updated, the behavior remains correct, because these updates mainly affect thread/task initialization, setup, and exit paths, and do not influence driver or IRQ execution.

Additional Background

For reference:

  • PR #17200 introduces scheduled-sleep support.
  • PR #17204 replaces all signal-based sleep implementations in drivers and the filesystem with scheduled sleep.

These changes significantly reduce the dependency on signals across the system, making the signal-disable option more feasible and less intrusive.

  • Are you sure that this change will not leave user in almost bare-metal state?

Please review my analysis of the detailed effects of this change above. I am confident the situation you are concerned about will not occur.

  • For now I can see that in Kconfig signals can be disabled. I cannot see dependency in sleep/usleep/kill/pkill/pthread/etc on signals in Kconfig.

Yes, these Kconfig relations should be improved.

  • So the program will crash when for instance usleep is called but signals are disabled, or wont build, right?

If a program calls sleep or usleep, the build will fail when signals are disabled. If the program truly requires sleep functionality, there are two options:

Option 1

Signals basically have two major purposes:

  • Invoking the signal handler in the target process/task
  • Waking up a target thread (similar to a semaphore post)

I can therefore introduce a finer-grained configuration option so that only the signal handler functionality is disabled while the wake-up mechanism is preserved. This would allow sleep and usleep to remain available.

Option 2

Since PR #17200 introduces scheduled-sleep support and PR #17204 replaces all signal-based sleep implementations in drivers and the filesystem with scheduled sleep, users can switch to the scheduled-sleep APIs as an alternative for sleep/usleep.

  • We need to sort the dependencies out before merge and/or we need to clearly know what other functionalities are impacted.

Yes I agree

  • Disabling signals in Kconfig should also disable impacted functionalities automatically I see that as most user friendly and safe approach.

Yes I agree

ostest is heavily dependent on POSIX APIs—including sleep, usleep, kill, pkill, and pthread. Therefore, to disable signal support, ostest must also be disabled until it is refactored to reduce its reliance on these APIs.

  • How much work is needed to update ostest to work without signals and other impacted functionalities?
  • This is our main tool of runtime confirmation that all works as expected.
  • Having working ostest without signals would reveal potential problems in other areas that need an update.
  • Would it be possible to update ostest before this PR is merged?

Of course, I can do this

Disabling the signal mechanism significantly reduces system footprint and complexity. It can also improve real-time performance during thread creation and context switching.

  • Would it be possible to provide measured Flash/Ram usage with and without signals disabled?
  • Would it be possible to measure runtime code improvement?

I will give details soon

This functionality deserves a mention in the documentation :-)

  • https://nuttx.apache.org/docs/latest/reference/user/07_signals.html
  • https://nuttx.apache.org/docs/latest/guides/signaling_sem_priority_inheritance.html
  • https://nuttx.apache.org/docs/latest/standards/posix.html#posix-realtime-signals
  • ...

wangchdo avatar Nov 21 '25 03:11 wangchdo

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature in levels:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple, frequently used functions (e.g. wait/sigwait/ppoll)
  3. enable all signal functionality like before

Upon reconsideration, I prefer to only allow disabling all signal-related functionality, and re-implement the signal-dependent functions such as sleep()/usleep() using the newly added scheduler-based sleep APIs.

  1. This approach is clearer and safer. If we allow disabling only part of the signal subsystem, we would need to modify the implementations of the remaining signal functions. At the same time, we cannot guarantee that those functions will continue to behave exactly as before. Even worse, it becomes harder for users to understand the actual impact of partially disabling signal features.

  2. With PR #17200 introducing scheduler-based sleep support, and PR #17204 replacing all signal-based sleep implementations in drivers and the filesystem with scheduler-based versions, the overall dependency on signals has already been significantly reduced.

  3. We can re-implement the libc sleep()/usleep() functions using the new API added by PR #17368. This will even be more lightweight compared to the current signal-based wait mechanism.
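
A minimal sketch of point 3, under stated assumptions: `demo_sched_sleep_usec()` is a hypothetical stand-in for the scheduler-based sleep primitive (the real API and its signature come from the PRs referenced above and are not reproduced here), and `demo_usleep()`/`demo_sleep()` are illustrative wrappers, not the libc implementation.

```c
#include <assert.h>

/* Records the last requested delay so the sketch can be checked on a host. */
unsigned long g_demo_last_sleep_usec;

/* Hypothetical stand-in for the scheduler-based sleep primitive.
 * This stub does not block; it only records the request.
 */
int demo_sched_sleep_usec(unsigned long usec)
{
  g_demo_last_sleep_usec = usec;
  return 0;
}

/* usleep()-like wrapper that needs no signal machinery. */
int demo_usleep(unsigned long usec)
{
  return demo_sched_sleep_usec(usec);
}

/* sleep()-like wrapper: returns the unslept seconds, which is 0 in this
 * sketch because no signal exists that could interrupt the wait.
 */
unsigned int demo_sleep(unsigned int seconds)
{
  int ret = demo_sched_sleep_usec((unsigned long)seconds * 1000000UL);

  assert(ret == 0);
  (void)ret;
  return 0;
}
```

The point of the shape is that the wrappers keep the familiar call signatures while the wait itself goes straight to the scheduler.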

wangchdo avatar Nov 24 '25 01:11 wangchdo

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature in levels:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple, frequently used functions (e.g. wait/sigwait/ppoll)
  3. enable all signal functionality like before

Upon reconsideration, I prefer to only allow disabling all signal-related functionality, and re-implement the signal-dependent functions such as sleep()/usleep() using the newly added scheduler-based sleep APIs.

1. This approach is clearer and safer. If we allow disabling only part of the signal subsystem, we would need to modify the implementations of the remaining signal functions. At the same time, we cannot guarantee that those functions will continue to behave exactly as before. Even worse, it becomes harder for users to understand the actual impact of partially disabling signal features.

2. With PR [sched/sleep: add support for scheduling sleep #17200](https://github.com/apache/nuttx/pull/17200)  introducing scheduler-based sleep support, and PR [sched/sleep: replace all Signal-based sleep implement to Scheduled sleep #17204](https://github.com/apache/nuttx/pull/17204)  replacing all signal-based sleep implementations in drivers and the filesystem with scheduler-based versions, the overall dependency on signals has already been significantly reduced.

3. We can re-implement the libc sleep()/usleep() functions using the new API added by PR [sched/sleep: Add nxched_nanosleep() API #17368](https://github.com/apache/nuttx/pull/17368). This will even be more lightweight compared to the current signal-based wait mechanism.

Thank you @wangchdo :-)

  • I also think that when Signals are not here that means signals are not here (not just part of the signals).
  • Would it be possible that you implement sleep alternatives that would not depend on signals? That way we could compare full implementation on a working prototype, make some testing and measurements. Then decide if we move forward with that approach.
  • Would the alternative sleep implementation also work when signals are enabled? Or would two implementations have to co-exist, one with signals and the other without?
  • What are risks and benefits here?
    • backward compatibility?
    • portability?
    • timing precision?
    • code size?
    • others?

cederom avatar Nov 24 '25 03:11 cederom

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature in levels:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple, frequently used functions (e.g. wait/sigwait/ppoll)
  3. enable all signal functionality like before

Upon reconsideration, I prefer to only allow disabling all signal-related functionality, and re-implement the signal-dependent functions such as sleep()/usleep() using the newly added scheduler-based sleep APIs.

It's not only sleep/usleep; my real concern is the following functions:

pid_t wait(FAR int *stat_loc);
int   waitid(idtype_t idtype, id_t id, FAR siginfo_t *info, int options);
pid_t waitpid(pid_t pid, FAR int *stat_loc, int options);

These functions are important for making nsh work correctly.

  1. This approach is clearer and safer. If we allow disabling only part of the signal subsystem, we would need to modify the implementations of the remaining signal functions.

But we just need to skip the signal dispatch; no other change is required.

At the same time, we cannot guarantee that those functions will continue to behave exactly as before. Even worse, it becomes harder for users to understand the actual impact of partially disabling signal features.

The rule is simple: signal actions don't work, but all other signal-related features work as before.

xiaoxiang781216 avatar Nov 24 '25 05:11 xiaoxiang781216

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature in levels:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple, frequently used functions (e.g. wait/sigwait/ppoll)
  3. enable all signal functionality like before

Upon reconsideration, I prefer to only allow disabling all signal-related functionality, and re-implement the signal-dependent functions such as sleep()/usleep() using the newly added scheduler-based sleep APIs.

It's not only sleep/usleep; my real concern is the following functions:

pid_t wait(FAR int *stat_loc);
int   waitid(idtype_t idtype, id_t id, FAR siginfo_t *info, int options);
pid_t waitpid(pid_t pid, FAR int *stat_loc, int options);

These functions are important for making nsh work correctly.

  1. This approach is clearer and safer. If we allow disabling only part of the signal subsystem, we would need to modify the implementations of the remaining signal functions.

But we just need to skip the signal dispatch; no other change is required.

At the same time, we cannot guarantee that those functions will continue to behave exactly as before. Even worse, it becomes harder for users to understand the actual impact of partially disabling signal features.

The rule is simple: signal actions don't work, but all other signal-related features work as before.

OK. If we disable only the signal action feature while keeping the rest of the signal subsystem intact, we still need to retain sigset_t sigprocmask, sigset_t sigwaitmask, and siginfo_t *sigunbinfo in struct tcb_s, and only remove sq_queue_t sigpendactionq and sq_queue_t sigpostedq. The remaining fields (sigprocmask, sigwaitmask, and sigunbinfo) occupy about 20 bytes in total, whereas sigpendactionq and sigpostedq together take about 16 bytes.
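
The field split described above can be pictured with the sketch below. The field names come from the paragraph above, but everything else is an assumption: the `demo_` types are stand-ins whose sizes may differ from the real `sigset_t`, `siginfo_t` and `sq_queue_t`, and `CONFIG_DEMO_DISABLE_SIGNAL_ACTIONS` is a hypothetical option name for the "disable signal actions only" level.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types (assumptions); the real NuttX types may differ in size. */
typedef struct { unsigned int bits[2]; } demo_sigset_t;
typedef struct { void *head; void *tail; } demo_sq_queue_t;
struct demo_siginfo;

/* Hypothetical option name for the "disable signal actions only" level. */
#define CONFIG_DEMO_DISABLE_SIGNAL_ACTIONS 1

/* Only the signal-related members of the TCB are modelled here. */
struct demo_tcb_s
{
  demo_sigset_t sigprocmask;       /* Kept */
  demo_sigset_t sigwaitmask;       /* Kept */
  struct demo_siginfo *sigunbinfo; /* Kept */
#ifndef CONFIG_DEMO_DISABLE_SIGNAL_ACTIONS
  demo_sq_queue_t sigpendactionq;  /* Removed with signal actions */
  demo_sq_queue_t sigpostedq;      /* Removed with signal actions */
#endif
};

static_assert(offsetof(struct demo_tcb_s, sigprocmask) == 0,
              "sigprocmask is the first member of the sketch");

/* Size of the signal-related part of the sketch TCB on this host. */
size_t demo_tcb_signal_bytes(void)
{
  return sizeof(struct demo_tcb_s);
}
```

Comparing `demo_tcb_signal_bytes()` with and without the option defined gives the per-task saving for a given target; the byte counts quoted above apply to the real types, not to these stand-ins.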

In addition, selectively disabling only the signal action functionality would require refactoring the implementation to clearly separate signal-action logic from the rest of the signal subsystem, which increases overall complexity.

But it may still be worthwhile, since disabling signal actions only can also improve task initialization/creation/switch/exit performance and reduce task stack usage.

I will look into the details and evaluate the pros and cons.

wangchdo avatar Nov 24 '25 06:11 wangchdo

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature by level:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple but frequently used functions (e.g. wait/sigwait/ppoll).
  3. enable all signal functionality, as before

Upon reconsideration, I prefer to only allow disabling all signal-related functionality, and re-implement the signal-dependent functions such as sleep()/usleep() using the newly added scheduler-based sleep APIs.

1. This approach is clearer and safer. If we allow disabling only part of the signal subsystem, we would need to modify the implementations of the remaining signal functions. At the same time, we cannot guarantee that those functions will continue to behave exactly as before. Even worse, it becomes harder for users to understand the actual impact of partially disabling signal features.

2. With PR [sched/sleep: add support for scheduling sleep #17200](https://github.com/apache/nuttx/pull/17200)  introducing scheduler-based sleep support, and PR [sched/sleep: replace all Signal-based sleep implement to Scheduled sleep #17204](https://github.com/apache/nuttx/pull/17204)  replacing all signal-based sleep implementations in drivers and the filesystem with scheduler-based versions, the overall dependency on signals has already been significantly reduced.

3. We can re-implement the libc sleep()/usleep() functions using the new API added by PR [sched/sleep: Add nxsched_nanosleep() API #17368](https://github.com/apache/nuttx/pull/17368). This will be even more lightweight than the current signal-based wait mechanism.
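
A minimal sketch of what point 3 could look like. The signature of the API from #17368 is not shown in this thread, so the scheduler-based primitive is represented here by a hypothetical callback type (`sleep_backend_t`) and a recording stub; only the unit conversion is meant literally.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical: stands in for the scheduler-based sleep primitive. The
 * real signature added by #17368 is not shown in this thread.
 */

typedef int (*sleep_backend_t)(const struct timespec *rqtp);

/* Stub backend that records the request instead of blocking. */

struct timespec g_last_request;

int recording_backend(const struct timespec *rqtp)
{
  g_last_request = *rqtp;
  return 0;
}

/* usleep()-like wrapper: convert microseconds to a timespec and delegate. */

int sketch_usleep(unsigned long usec, sleep_backend_t backend)
{
  struct timespec rqtp;

  assert(backend != NULL);
  rqtp.tv_sec  = (time_t)(usec / 1000000ul);
  rqtp.tv_nsec = (long)(usec % 1000000ul) * 1000l;
  return backend(&rqtp);
}

/* sleep()-like wrapper: with no signals there is nothing to interrupt the
 * wait, so on success the unslept time is reported as 0.
 */

unsigned int sketch_sleep(unsigned int seconds, sleep_backend_t backend)
{
  struct timespec rqtp;

  assert(backend != NULL);
  rqtp.tv_sec  = (time_t)seconds;
  rqtp.tv_nsec = 0;
  return backend(&rqtp) == 0 ? 0 : seconds;
}
```

For example, `sketch_usleep(1500000ul, recording_backend)` hands the backend a request of 1 second plus 500,000,000 nanoseconds.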

Thank you @wangchdo :-)

  • I also think that disabling signals should mean signals are gone entirely (not just part of them).
  • Would it be possible for you to implement sleep alternatives that do not depend on signals? That way we could compare the full implementation on a working prototype, and do some testing and measurements. Then we can decide whether to move forward with that approach.

Yes, I plan to implement sleep alternatives with #17368

  • Would the alternative sleep implementation also work when signals are enabled? Or would two implementations have to co-exist, one for use with signals and the other without?

I think the version without signals is better.

  • What are risks and benefits here?

    • backward compatibility?
    • portability?
    • timing precision?
    • code size?
    • others?

I will share this data later.

wangchdo avatar Nov 24 '25 07:11 wangchdo

I guess https://github.com/apache/nuttx-apps/pull/3217 by @extinguish is a prototype of @xiaoxiang781216's idea? :-)

cederom avatar Nov 24 '25 07:11 cederom

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature by level:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple but frequently used functions (e.g. wait/sigwait/ppoll).
  3. enable all signal functionality, as before

Upon reconsideration, I prefer to only allow disabling all signal-related functionality, and re-implement the signal-dependent functions such as sleep()/usleep() using the newly added scheduler-based sleep APIs.

It's not only sleep/usleep; my real concern is the following functions:

pid_t wait(FAR int *stat_loc);
int   waitid(idtype_t idtype, id_t id, FAR siginfo_t *info, int options);
pid_t waitpid(pid_t pid, FAR int *stat_loc, int options);

These functions are important for nsh to work correctly.

  1. This approach is clearer and safer. If we allow disabling only part of the signal subsystem, we would need to modify the implementations of the remaining signal functions.

But we just need to skip the signal dispatch; no other change is needed.

At the same time, we cannot guarantee that those functions will continue to behave exactly as before. Even worse, it becomes harder for users to understand the actual impact of partially disabling signal features.

The rule is simple: signal actions don't work, but all other signal-related features work as before.

OK. If we disable only the signal action feature while keeping the rest of the signal subsystem intact, we still need to retain sigset_t sigprocmask, sigset_t sigwaitmask, and siginfo_t *sigunbinfo in struct tcb_s, and only remove sq_queue_t sigpendactionq and sq_queue_t sigpostedq. The remaining fields (sigprocmask, sigwaitmask, and sigunbinfo) occupy about 20 bytes in total, whereas sigpendactionq and sigpostedq together take about 16 bytes.

Yes, this is a compromise if we want most POSIX applications and drivers to work as before. In very simple cases, the full disable may work well, but the partial disable may work better in many typical cases.

In addition, selectively disabling only the signal action functionality would require refactoring the implementation to clearly separate signal-action logic from the rest of the signal subsystem, which increases overall complexity.

#17357 has already finished this work, and it isn't more complex than the full disable: https://github.com/apache/nuttx/pull/17357/commits/a5678552cc2bfec8113e7f51b621b860b1b8d9d7

But it may still be worthwhile, since disabling signal actions only can also improve task initialization/creation/switch/exit performance and reduce task stack usage.

Here are the code savings:

1. ostest passes with CONFIG_DISABLE_SIGNALS=y
2. Here are the test results from our ARMv7-A platform:

When CONFIG_DISABLE_SIGNALS=n:
Binary size = 1,295,424 bytes, Used RAM = 37,980 bytes

When CONFIG_DISABLE_SIGNALS=y:
Binary size = 1,262,624 bytes, Used RAM = 37,852 bytes

This shows a reduction of 32,800 bytes in binary size and 128 bytes in RAM usage.
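
The arithmetic behind these figures can be reproduced with a few lines of C. The struct and function names below are illustrative only; the byte counts are the ones reported above.

```c
#include <assert.h>
#include <stdint.h>

struct build_stats
{
  int32_t binary_bytes;
  int32_t ram_bytes;
};

/* Figures reported above for the ARMv7-A platform. */

static const struct build_stats g_signals_enabled  = { 1295424, 37980 };
static const struct build_stats g_signals_disabled = { 1262624, 37852 };

/* Bytes of binary size saved by CONFIG_DISABLE_SIGNALS=y. */

int32_t binary_saving(void)
{
  assert(g_signals_enabled.binary_bytes >= g_signals_disabled.binary_bytes);
  return g_signals_enabled.binary_bytes - g_signals_disabled.binary_bytes;
}

/* Bytes of RAM saved by CONFIG_DISABLE_SIGNALS=y. */

int32_t ram_saving(void)
{
  assert(g_signals_enabled.ram_bytes >= g_signals_disabled.ram_bytes);
  return g_signals_enabled.ram_bytes - g_signals_disabled.ram_bytes;
}

/* Binary saving in parts per thousand of the original size, rounded down. */

int32_t binary_saving_permille(void)
{
  return binary_saving() * 1000 / g_signals_enabled.binary_bytes;
}
```

The binary saving works out to 25 parts per thousand, i.e. roughly 2.5% of the original image.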

I will look into the details and evaluate the pros and cons.

xiaoxiang781216 avatar Nov 24 '25 08:11 xiaoxiang781216

When signals are disabled, the related POSIX APIs—including sleep, usleep, kill, pkill, and pthread—will be disabled as well.

It's too limiting that sleep/usleep can't be called when CONFIG_DISABLE_SIGNALS is enabled, so I suggest implementing this feature by level:

  1. disable all signal-related functionality, as this PR does
  2. disable the signal functions related to signal handlers (callbacks), but keep other simple but frequently used functions (e.g. wait/sigwait/ppoll).
  3. enable all signal functionality, as before

Upon reconsideration, I prefer to only allow disabling all signal-related functionality, and re-implement the signal-dependent functions such as sleep()/usleep() using the newly added scheduler-based sleep APIs.

1. This approach is clearer and safer. If we allow disabling only part of the signal subsystem, we would need to modify the implementations of the remaining signal functions. At the same time, we cannot guarantee that those functions will continue to behave exactly as before. Even worse, it becomes harder for users to understand the actual impact of partially disabling signal features.

2. With PR [sched/sleep: add support for scheduling sleep #17200](https://github.com/apache/nuttx/pull/17200)  introducing scheduler-based sleep support, and PR [sched/sleep: replace all Signal-based sleep implement to Scheduled sleep #17204](https://github.com/apache/nuttx/pull/17204)  replacing all signal-based sleep implementations in drivers and the filesystem with scheduler-based versions, the overall dependency on signals has already been significantly reduced.

3. We can re-implement the libc sleep()/usleep() functions using the new API added by PR [sched/sleep: Add nxsched_nanosleep() API #17368](https://github.com/apache/nuttx/pull/17368). This will be even more lightweight than the current signal-based wait mechanism.

Yes, the points you raised make total sense.

acassis avatar Nov 24 '25 10:11 acassis

Yes, this is a compromise if we want most POSIX applications and drivers to work as before. In very simple cases, the full disable may work well, but the partial disable may work better in many typical cases.

  1. The stage you described is just phase 1, because many signal-related APIs are retained and can be called normally, but signal functionality is missing. The changes in this PR will completely disable signal support.
  2. You have made numerous kernel modifications, yet no one is willing to submit them to the community. Every time @wangchdo contributes valuable code, you instruct individuals from Xiaomi to submit code with identical functionality. Similar issues exist with the hwtimer #17065 and current PR. Why do you consistently challenge individual developers' submissions instead of assisting them in merging their work into the community? This is unfair—not just to individual developers, but to the entire developer community.

anchao avatar Nov 24 '25 13:11 anchao

  1. You have made numerous kernel modifications, yet no one is willing to submit them to the community. Every time @wangchdo contributes valuable code, you instruct individuals from Xiaomi to submit code with identical functionality. Similar issues exist with the hwtimer sched/hrtimer: add a high resolution timer module in sched and tricore porting to support hard real time cases #17065

I already pointed out the key problem of https://github.com/apache/nuttx/pull/17065, which must be fixed before I can approve it. Do you mean @Fix-Point's PRs https://github.com/apache/nuttx/pull/17312, https://github.com/apache/nuttx/pull/17316, https://github.com/apache/nuttx/pull/17339 and https://github.com/apache/nuttx/pull/17338? Those are totally different from https://github.com/apache/nuttx/pull/17065.

and current PR.

Did you review https://github.com/apache/nuttx/pull/17352 and https://github.com/apache/nuttx/pull/17357 carefully? The details are totally different, as I have already mentioned many times, so I won't repeat the differences here. Both approaches have pros and cons, so I suggest accepting both by disabling signals by level.

Why do you consistently challenge individual developers' submissions instead of assisting them in merging their work into the community? This is unfair—not just to individual developers, but to the entire developer community.

I review @wangchdo's patches carefully and point out the places that need improvement. Once the problems are fixed, I always approve his changes. Could you point out which patch I blocked after the comments were addressed?

xiaoxiang781216 avatar Nov 24 '25 13:11 xiaoxiang781216

Did you review #17352 and #17357 carefully? The details are totally different, as I have already mentioned many times, so I won't repeat the differences here. Both approaches have pros and cons, so I suggest accepting both by disabling signals by level.

Of course, I am very familiar with the implementation of signal. In fact, I made similar changes in my internal branch and applied them to the real product. Therefore, after wangcd submitted the PR, I did not submit my part, but instead encouraged his commit to be successfully merged.

BTW I dare say that if wangcd hadn't submitted this commit, you guys wouldn't have submitted it either, right?

anchao avatar Nov 24 '25 13:11 anchao

To reiterate, the current implementation of signals is highly unsafe because it borrows the context of the interrupted thread in its delivery logic. If a lock is held in the signal context, a serious bug will occur, which is why we prohibit the use of signals.

anchao avatar Nov 24 '25 13:11 anchao

As @raiden00pl mentioned, many POSIX capabilities actually rely on the MMU implementation; for example, fork/signal cannot be fully replicated on devices without virtual addresses.

anchao avatar Nov 24 '25 13:11 anchao