sched/sched: Add high-resolution timer (hrtimer) support to NuttX

Open wangchdo opened this issue 1 week ago • 1 comments

Summary

This PR continues the draft PR #17065 proposed two months ago and replaces it with an optimized implementation.

Summary of this PR

  • Introduce a high-resolution timer (hrtimer) module to NuttX, providing timers with nanosecond-level resolution.
  • Enable coexistence with the existing NuttX timer facility (wdog):
    1. wdog timers will be driven by an hrtimer instance (g_nxsched_hrtimer) when hrtimer is enabled.
    2. Both tickless and non-tickless scheduler modes are supported.
  • Provide three new hrtimer APIs:
    1. hrtimer_init() – Initialize a high-resolution timer instance.
    2. hrtimer_start() – Start a high-resolution timer in absolute or relative mode.
    3. hrtimer_cancel() – Cancel a pending high-resolution timer.

Motivation

In hard real-time applications, nanosecond-level control of task activation is essential for certain scenarios—such as motor control—where tick-level precision is simply inadequate.

For these use cases, the main limitation of wdog is its tick-based resolution, which is typically in milliseconds. Although the tick duration can be reduced, configuring it to microsecond or nanosecond granularity is impractical, as this would lead to an interrupt storm in which the CPU becomes saturated by handling an excessive number of tick interrupts.
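
As a rough illustration of the interrupt-storm argument, the sketch below computes how many tick interrupts per second a given tick period produces and what share of the CPU they would consume. The helper names and the per-interrupt handler cost are assumptions made for illustration; they are not NuttX APIs or measurements.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/* Tick interrupts generated per second for a given tick period. */

static uint64_t tick_interrupts_per_sec(uint64_t tick_period_ns)
{
  assert(tick_period_ns > 0);
  return NS_PER_SEC / tick_period_ns;
}

/* Share of CPU time (in percent, capped at 100) spent in the tick
 * handler if every tick interrupt costs handler_ns nanoseconds.
 * handler_ns is an assumed figure, not a measured one.
 */

static unsigned int tick_cpu_load_percent(uint64_t tick_period_ns,
                                          uint64_t handler_ns)
{
  uint64_t load;

  assert(tick_period_ns > 0);
  load = (handler_ns * 100ULL) / tick_period_ns;
  return load > 100 ? 100u : (unsigned int)load;
}
```

With a 1 ms tick this gives 1,000 interrupts per second. With a 1 µs tick it gives 1,000,000, and at an assumed cost of 2 µs per interrupt the handler alone would need more CPU time than is available.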

Therefore, an independent high-resolution timer (hrtimer) is required to support use cases—such as motor control—that demand true high-precision timing.

Additionally, hrtimer uses an RB-tree (red-black tree) for timer management, which is more efficient than the list-based structure used by wdog when dealing with a large number of timer events. This advantage becomes increasingly important in hard real-time systems, such as vehicle control systems.
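
To make the list-versus-tree comparison concrete, the sketch below contrasts the worst-case number of nodes visited when inserting into a sorted list of n timers with the standard upper bound on the height of a red-black tree holding n nodes. It assumes the list is kept sorted by expiry and walked from the head; the function names are hypothetical and are not taken from the wdog or hrtimer sources.

```c
#include <stdint.h>

/* Worst case for a sorted list: the new timer expires last, so every
 * one of the n existing entries is visited before it is linked in.
 */

static uint32_t list_insert_worst(uint32_t n)
{
  return n;
}

/* Upper bound on the height of a red-black tree with n nodes:
 * 2 * floor(log2(n + 1)). An insertion descends at most this far.
 */

static uint32_t rbtree_height_bound(uint32_t n)
{
  uint64_t v = (uint64_t)n + 1;
  uint32_t bits = 0;

  while (v > 1)
    {
      v >>= 1;
      bits++;
    }

  return 2 * bits;
}
```

For 1,023 pending timers the list may visit 1,023 entries, while the tree descent is bounded by 20 levels.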

Why not use wdog for high-resolution timing

wdog is highly coupled with the scheduler and has inherent limitations:

  1. It cannot achieve nanosecond-level resolution due to its tick-based design.
  2. While wdog is lightweight, efficient, and stable for all current NuttX functionality, replacing its list-based data structure with a tree would complicate the implementation. It could also reduce scheduler performance or introduce bugs in NuttX components that rely on wdog's tick-based design.

Impact

Add a new high-resolution timer (hrtimer) module to NuttX.

  • When disabled, it has no impact on any existing NuttX functionality.
  • When enabled:
    1. It coexists with wdog, which is driven by an hrtimer instance (g_nxsched_hrtimer).
    2. It provides the three new APIs described above to support nanosecond-level timers.

Testing

ostest passed on board a2g-tc397-5v-tft when hrtimer is enabled

NuttShell (NSH)
nsh>
nsh> uname -a
NuttX 0.0.0 fa487accea Dec 12 2025 20:23:48 tricore a2g-tc397-5v-tft
nsh>
nsh> ostest

(...)

End of test memory usage:
VARIABLE  BEFORE   AFTER
======== ======== ========
arena       28dfc    28dfc
ordblks         7        6
mxordblk    1f8a8    1f8a8
uordblks     555c     555c
fordblks    238a0    238a0

user_main: nxevent test

End of test memory usage:
VARIABLE  BEFORE   AFTER
======== ======== ========
arena       28dfc    28dfc
ordblks         6        6
mxordblk    1f8a8    1f8a8
uordblks     555c     555c
fordblks    238a0    238a0

Final memory usage:
VARIABLE  BEFORE   AFTER
======== ======== ========
arena       28dfc    28dfc
ordblks         1        6
mxordblk    24220    1f8a8
uordblks     4bdc     555c
fordblks    24220    238a0
user_main: Exiting
ostest_main: Exiting with status 0
nsh>

wangchdo avatar Dec 12 '25 12:12 wangchdo

@wangchdo please add Documentation about this new feature

@acassis Documentation added, please check

wangchdo avatar Dec 13 '25 15:12 wangchdo

This is a great initiative!

The only thing I don't quite understand is why tick handling is mixed with the hr timer.

Since the hr timer already manages a tree of timer expirations, why can't the OS tick just be one subscription to the hrtimer on "ticked" systems?

That is, on ticked systems, the OS tick could be just another registration to the hr timer, and hrtimer itself could just unconditionally manage interrupts and call the registered callbacks.

jlaitine avatar Dec 14 '25 08:12 jlaitine

Yes, the OS tick is an hrtimer now. You can see this in the implementation: when hrtimer is enabled, an hrtimer instance (g_nxsched_hrtimer) is used to provide the OS tick.
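
The idea of an OS tick that is just one periodic timer can be sketched with a small simulation. This is not the NuttX implementation: struct sim_timer and sim_timer_run() are hypothetical names, and re-arming from the previous expiry is an assumption about how a periodic tick would typically be kept free of drift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A simulated periodic timer standing in for the tick timer. */

struct sim_timer
{
  uint64_t expiry_ns;  /* Next absolute expiry time */
  uint64_t period_ns;  /* Tick period */
  uint64_t fired;      /* Number of tick callbacks delivered so far */
};

/* Deliver every tick that is due at now_ns. Each expiry re-arms the
 * timer relative to the previous expiry, not relative to now_ns, so a
 * late call does not shift later ticks.
 */

static void sim_timer_run(struct sim_timer *t, uint64_t now_ns)
{
  assert(t != NULL && t->period_ns > 0);

  while (t->expiry_ns <= now_ns)
    {
      t->fired++;                     /* The tick callback would run here */
      t->expiry_ns += t->period_ns;   /* Re-arm for the next tick */
    }
}
```

Starting with a 1 ms period and a first expiry at 1 ms, running the simulation up to 10 ms delivers ten ticks and leaves the timer armed for 11 ms.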

wangchdo avatar Dec 14 '25 08:12 wangchdo

Please include some test cases that verify the functionality of the new feature. It is good that ostest passes with the timer enabled, since I saw you mentioned that the hrtimer will be used for the system tick. However, I think we need tests that verify the actual precision on the hrtimer and also show that the systicks are the correct duration, etc.

@linguini1 @jerpelea I have added the test code and test logs. Please check them in the PR description.

Formal test cases will be added to the apps repository after this PR is merged.

As shown in the test code, hrtimer_init, hrtimer_start, and hrtimer_cancel are all covered. The test logs also demonstrate that the behavior matches the expected results.

wangchdo avatar Dec 15 '25 01:12 wangchdo

It would be better to add the test to ostest, so we can check for regressions regularly.

xiaoxiang781216 avatar Dec 15 '25 02:12 xiaoxiang781216

@xiaoxiang781216 @linguini1 @jerpelea

ostest added with PR below:

https://github.com/apache/nuttx-apps/pull/3248

wangchdo avatar Dec 15 '25 13:12 wangchdo

@xiaoxiang781216 @anchao @acassis @simbit18 @cederom

Per @xiaoxiang781216’s comments, I have updated the critical section protection for hrtimer to use a spinlock to improve performance.

However, to prevent potential issues where hrtimer_cancel() might cancel a running timer and the timer instance could be freed prematurely, I have introduced a state machine for hrtimer:

INACTIVE
   |
   | start
   v
 ARMED -------- cancel --------> INACTIVE
   |                               
   | expire                     
   v                              
 RUNNING -------- cancel ------> CANCELED---return---> INACTIVE
   |
   | return
   v
INACTIVE

Key points:

  1. hrtimer_cancel() remains non-blocking.
  2. If the timer callback is currently executing, it is allowed to complete.
  3. After cancellation, the callback will not be invoked again.
  4. The caller must ensure that timer-related resources are not freed until the callback has returned.
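
The state diagram above can be written as a small transition function. This is a minimal sketch that only encodes the arrows shown in the diagram; the enum and function names are hypothetical and are not the identifiers used in the PR. Any event not drawn in the diagram, such as starting a timer that is already armed, is rejected here, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum hrt_state
{
  HRT_INACTIVE,
  HRT_ARMED,
  HRT_RUNNING,
  HRT_CANCELED
};

enum hrt_event
{
  HRT_EV_START,   /* Timer is started */
  HRT_EV_EXPIRE,  /* Timer expires and its callback begins */
  HRT_EV_CANCEL,  /* Timer is cancelled */
  HRT_EV_RETURN   /* Callback returns */
};

/* Apply one event. Returns true and updates *state if the diagram has
 * an arrow for (state, event); otherwise returns false and leaves
 * *state unchanged.
 */

static bool hrt_step(enum hrt_state *state, enum hrt_event ev)
{
  enum hrt_state next;

  assert(state != NULL);

  switch (*state)
    {
      case HRT_INACTIVE:
        if (ev != HRT_EV_START) return false;
        next = HRT_ARMED;
        break;

      case HRT_ARMED:
        if (ev == HRT_EV_EXPIRE) next = HRT_RUNNING;
        else if (ev == HRT_EV_CANCEL) next = HRT_INACTIVE;
        else return false;
        break;

      case HRT_RUNNING:
        if (ev == HRT_EV_CANCEL) next = HRT_CANCELED;
        else if (ev == HRT_EV_RETURN) next = HRT_INACTIVE;
        else return false;
        break;

      case HRT_CANCELED:
        if (ev != HRT_EV_RETURN) return false;
        next = HRT_INACTIVE;
        break;

      default:
        return false;
    }

  *state = next;
  return true;
}
```

In this model a cancel that arrives while the callback is running moves the timer to CANCELED, and only the callback's return brings it back to INACTIVE, which matches key points 2 and 4.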

wangchdo avatar Dec 16 '25 02:12 wangchdo

@xiaoxiang781216 @anchao @acassis @simbit18 @cederom

Per @xiaoxiang781216’s comments, I have updated the critical section protection for hrtimer to use a spinlock to improve performance.

However, to prevent potential issues where hrtimer_cancel() might cancel a running timer and the timer instance could be freed prematurely, I have introduced a state machine for hrtimer:

INACTIVE
   |
   | start
   v
 ARMED -------- cancel --------> CANCELED
   |                               ^
   | expire                        |
   v                               |
 RUNNING -------- cancel ----------+
   |
   | return
   v
INACTIVE

Key points:

  1. hrtimer_cancel() remains non-blocking.
  2. If the timer callback is currently executing, it is allowed to complete.
  3. After cancellation, the callback will not be invoked again.
  4. The caller must ensure that timer-related resources are not freed until the callback has returned.

Yes, but we also need to provide a safe API, like work_cancel_sync.

xiaoxiang781216 avatar Dec 16 '25 03:12 xiaoxiang781216

OK, I will add it in https://github.com/apache/nuttx/pull/17517

wangchdo avatar Dec 16 '25 04:12 wangchdo

@wangchdo I suggest including the Motivation section from this Summary in the Documentation. It makes our documentation more "human-like", and people reading it will understand where and why to use hrtimer.

acassis avatar Dec 16 '25 10:12 acassis

Done, please take a look at the latest upload of this PR.

wangchdo avatar Dec 16 '25 11:12 wangchdo

@xiaoxiang781216 @acassis @anchao @jerpelea @cederom @simbit18

Shall we first review and merge https://github.com/apache/nuttx/pull/17517 , and then proceed with this PR?

wangchdo avatar Dec 18 '25 01:12 wangchdo

@wangchdo #17517 was merged already

acassis avatar Dec 18 '25 12:12 acassis

I have restarted CI to check current status :-)

cederom avatar Dec 21 '25 19:12 cederom

@cederom @acassis @xiaoxiang781216 @anchao

This PR has been split into two parts. Part 1 https://github.com/apache/nuttx/pull/17517 has already been merged after SMP-specific optimizations. The remaining changes are now submitted as Part 2 in a separate PR: https://github.com/apache/nuttx/pull/17573

To avoid duplicate review effort, this PR is marked as draft. Please continue to review Part 2 https://github.com/apache/nuttx/pull/17573 instead.

Once Part 2 https://github.com/apache/nuttx/pull/17573 is merged, all functionality originally planned for this PR will be completed, and this PR will be closed.

wangchdo avatar Dec 21 '25 23:12 wangchdo

So, let's close this PR directly and continue on #17573. If you want, we can reopen it at any time.

xiaoxiang781216 avatar Dec 22 '25 03:12 xiaoxiang781216