RIOT
RIOT copied to clipboard
ztimer: add ztimer_ondemand module for implicit power management
Contribution description
This PR implements the approach proposed in #16327. Basically it extends ztimer
to start and stop ztimer_periph_%
depending on the count of currently active users. This solves problems like described in #16891.
The ztimer_ondemand
feature can be enabled per peripheral driver. To start/stop periph_timer
, ztimer_ondemand_timer
has to be pulled in. periph_rtt
is started/stopped by ztimer_ondemand_rtt
and periph_rtc
by ztimer_ondemand_rtc
. I think you get the idea ;-)
To help ztimer
to understand if someone is currently relaying on a certain ztimer_clock_t
, ztimer_acquire()
and ztimer_release()
have been introduced. These methods are required for users of ztimer_now()
. The interface works like this:
ztimer_acquire(ZTIMER_USEC);
ztimer_now_t start = ztimer_now(ZTIMER_USEC);
do_some_work();
ztimer_now_t duration = ztimer_now(ZTIMER_USEC) - start;
ztimer_release(ZTIMER_USEC);
(ztimer_set()
and ztimer_remove()
will call ztimer_acquire()
and ztimer_release()
internally.)
First of all, I'd love to hear your opinion about the direction I'm heading in.
Testing procedure
I've added unit tests.
~~During the next days I'll come up with some sample code that you may run on your device.~~
A demo showcasing ztimer_ondemand:
$ make -C tests/ztimer_ondemand BOARD=samr30-xpro flash term
(...)
2022-02-04 16:35:46,792 # main(): This is RIOT! (Version: 2022.04-devel-228-g0957b3-feature/ztimer-ondemand)
2022-02-04 16:35:46,792 #
2022-02-04 16:35:46,796 # Loop gpio_set(LED_PIN) 10000 times without any delay
2022-02-04 16:35:46,800 # .
2022-02-04 16:35:46,801 #
2022-02-04 16:35:46,808 # Loop gpio_set(LED_PIN) 10000 times with ztimer_sleep(1000) with clock start/stop.
2022-02-04 16:35:57,167 # .
2022-02-04 16:35:57,167 #
2022-02-04 16:35:57,174 # Loop gpio_set(LED_PIN) 10000 times with ztimer_sleep(1000) without clock start/stop.
2022-02-04 16:36:07,571 # .
2022-02-04 16:36:07,571 #
2022-02-04 16:36:07,577 # Loop gpio_set(LED_PIN) 10000 times with ztimer_periodic_wakeup(1000).
2022-02-04 16:36:17,583 # .
Issues/PRs references
#16327 #16891
Sneak peak: I'm completely aware of ztimer_acquire()
and ztimer_release()
introducing more complexity. But this interface could be just a foundation for something like this:
ztimer_watch
typedef struct {
ztimer_now_t checkpoint; /**< last seen timer now value */
ztimer_now_t value; /**< watch time at checkpoint */
bool running; /**< flag showing if the watch is running */
} ztimer_watch_t;
void ztimer_watch_start(ztimer_clock_t *clock, ztimer_watch_t *watch)
{
if (watch->running) {
return;
}
ztimer_acquire(clock);
watch->checkpoint = ztimer_now(clock);
running = true;
}
void ztimer_watch_set(ztimer_clock_t *clock, ztimer_watch_t *watch, ztimer_now_t val)
{
if (watch->running) {
watch->checkpoint = ztimer_now(clock);
}
watch->value = val;
}
ztimer_now_t ztimer_watch_read(ztimer_clock_t *clock, ztimer_watch_t *watch)
{
if (watch->running) {
ztimer_now_t now = ztimer_now(clock);
return now - watch->checkpoint + watch->value;
}
else {
return watch->value;
}
}
void ztimer_watch_stop(ztimer_clock_t *clock, ztimer_watch_t *watch)
{
if (!watch->running) {
return;
}
watch->value = ztimer_watch_read(clock, watch);
watch->running = false;
ztimer_release(clock);
}
We should make every user of ztimer_now()
migrate to an interface as shown above.
Or we could introduce a global ztimer_watch_t uptime
that is backed by ZTIMER_MSEC
and started during auto_init
. For time measurement, one may call ztimer_watch_read(ZTIMER_MSEC, &uptime)
.
Is this really an API change? The use of ztimer_now
has a pending PR #17298 documenting previously silent assumptions made; I think that's the real API chan^Wfix here, whereas this PR just makes ztimer use its potential.
On acquire and release, I wonder if they are actually relevant (and warrant their counter). Any user just comparing the ztimer_now outputs will find they have a hard time knowing whether it wrapped, whereas if someone acquiring the ztimer just placed a maximal timeout and then checked for it still being set after obtaining a later now value would be sure that it did not wrap (or, it may have wrapped but the wrapping difference never overflew). Their timer callback could be a no-op (the task will conclude, and the measured time will be "exceeding the scale"), or do some cancellation ("don't bother, it has already taken too long").
(The uptime counter would need special handling then, though.)
Thank you for your response!
Yes, I agree that setting up a ztimer_t
with a very long offset can also be a solution to make the pm_layered
approach work. But it feels more like a workaround than a proper solution. TBH, I'm regretting that I've introduced this approach in the first place. It seemed to be easy and elegant one but I've shot myself into the foot several times :D
You may have noticed: This PR disables all these mechanics, once you pull in ztimer_ondemand
(I'm still not sure if this name is verbose enough ... but I wasn't able to come up with a better one, yet.) Once we found a working solution that doesn't require any workarounds, I'd say we should introduce pm_layered
into all periph_timer
implementation and then deprecate the mechanics around direct interaction between ztimer
and pm_layered
.
After having made my first baby steps with this approach, I've already noticed that this implicit power management fails fast, which is a good thing! If a ztimer_acquire()
is missing, you'll notice pretty fast because the periph_timer
is disabled even if the MCU is still on fire and hasn't entered any power-saving modes, yet. I'd say those bugs that show themselves only by accident are really annoying type of bugs. (I.e. you changed timing of your application and, thus, different pm modes are entered at different times and see stuff like networking starting to behave weirdly. This will happen if ztimer_now()
is used to time stuff - and it is!)
An addition to the wrapped ztimer_now() values - I just understood your point after reading your response a second time: If there must the guarantee that it hasn't wrapped during time measurement, we would have to use ztimer_now64
. That should create enough room that timer_now() never wraps.
Your approach clearly detects wraps: But what shall we do if we detected wrapping timers? Extend like ztimer_now64
does?
Your approach clearly detects wraps: But what shall we do if we detected wrapping timers? Extend like
ztimer_now64
does?
Something like a watch, when detecting a wrapped timer, should IMO just report "longer than I can represent". A user who asked for a 32bit value in some units is not prepared for any larger values anyway.
Some additional work: I've introduced ztimer_ondemand_strict
with 04607ed. It will make sure ztimer_now()
is only called if the timer is active. Of course, this pseudomodule brakes almost every module using ztimer
. But it already showed me that I used ztimer_now()
in my patch before turning on the clock! This has been fixed with ba77e7b.
Furthermore, I wrote a little example: tests/ztimer_ondemand
for you to run on your boards :)
Would anybody mind if I squash commits?
I think I have finished for the first round and implemented everything necessary for the ztimer_ondemand
driver to work properly.
I wouldn't mind, and I think that kfessel's comment is high-level enough that a squash wouldn't disturb it.
And squashed. I'd say this PR is ready to be reviewed.
I've fixed typos and added some breaks to long lines. Static tests are green now.
This is running tests/bench_xtimer
with USEMODULE=ztimer_xtimer_compat
on nrf52840dk, first on master, then this PR (rebased on master):
main(): This is RIOT! (Version: buildtest)
xtimer benchmark application.
set() one 6501 / 1000 = 6
remove() one 943 / 1000 = 0
set() + remove() one 10690 / 1000 = 10
set() many increasing target 112317 / 1000 = 112
re-set() first 6564 / 1000 = 6
re-set() middle 169686 / 1000 = 169
re-set() last 331090 / 1000 = 331
remove() + set() first 11479 / 1000 = 11
remove() + set() middle 175118 / 1000 = 175
remove() + set() last 336466 / 1000 = 336
remove() many decreasing 60732 / 1000 = 60
xtimer_now() 1189 / 1000 = 1
sizeof(xtimer_t) 16000 / 1000 = 16
done.
Help: Press s to start test, r to print it is ready
START
main(): This is RIOT! (Version: buildtest)
xtimer benchmark application.
set() one 6501 / 1000 = 6
remove() one 959 / 1000 = 0
set() + remove() one 10752 / 1000 = 10
set() many increasing target 112286 / 1000 = 112
re-set() first 6564 / 1000 = 6
re-set() middle 193808 / 1000 = 193
re-set() last 343817 / 1000 = 343
remove() + set() first 11417 / 1000 = 11
remove() + set() middle 175090 / 1000 = 175
remove() + set() last 336404 / 1000 = 336
remove() many decreasing 60732 / 1000 = 60
xtimer_now() 1189 / 1000 = 1
sizeof(xtimer_t) 16000 / 1000 = 16
done.
~~There seems to be some minor impact on the re-set middle and re-set last. Accteptable I'd say.~~
edit I didn't enable ztimer_ondemand, so it probably wasn't used. Makes me wonder where the difference comes from. :/
I didn't enable ztimer_ondemand, so it probably wasn't used. Makes me wonder where the difference comes from. :/
Yes, it's opt-in right now. It will break things if turned on by default as RIOT modules are using ztimer_now()
. I'm planning one ztimer_acquire(ZTIMER_MSEC)
during ztimer_init
until every user is aware of the acquire/release interface.
I guess, I'm expecting too much from the compiler in terms of optimizations?! I'm recording whether the timer has been removed from the clock list. That bool is never used if ztimer_ondemand
isn't active.
https://github.com/RIOT-OS/RIOT/blob/309cbfefe8ed95e489138902a84a3f1826315eea/sys/ztimer/core.c#L156-L159 (...snip...) https://github.com/RIOT-OS/RIOT/blob/309cbfefe8ed95e489138902a84a3f1826315eea/sys/ztimer/core.c#L184-L192
But thank you for this hint! I'll keep on eye on this while polishing ztimer_ondemand
.
I like this, too!
Thank you all for your valuable feedback so far :)
As @chrysn pointed out, with merging #17614 this isn't a change but an addition to the ztimer API.
I added Kconfig-related modelling of this module and rebased on latest master.
I learned stuff about Kconfig - so I think the last two commits improved usability for menuconfig
users and TEST_KCONFIG=1
.
I'd say this is finally ready for review :-)
Is this still current w/rt the discussions from the summit?
If so, are the previous reviewers happy?
If not, can the ztimer_acquire / release functions be split out so that applications can be fixed to use them (to comply more easily with the documented requirements for using ztimer_now), even if they are no-ops before this (or its later form) is merged?
This PR should add a line to the ztimer_now
; where it now says:
Two values can only be compared when the clock has been continuously active between the first and the second reading.
A clock is guaranteed to be active from [...]
there should be a paragraph inbetween that says: "The easiest way to guarantee that a clock is continuously active is by keeping it active through @ref ztimer_acquire"; the next paragraph would then say "A clock is also guaranteed ..."
Is this still current w/rt the discussions from the summit?
Yes! We are already using this patch in semi-production :-D
If so, are the previous reviewers happy?
If not, can the ztimer_acquire / release functions be split out so that applications can be fixed to use them (to comply more easily with the documented requirements for using ztimer_now), even if they are no-ops before this (or its later form) is merged?
I took the other route - ztimer_ondemand
is opt-in with this PR. Users have to enable it actively. If they do, the asserts will point on modules using ztimer_now()
without requiring the ztimer clock.
And TBH - I'd rather go the route to introduce a wrapper that eliminates ztimer_now()
calls wherever possible.
I took the other route -
ztimer_ondemand
is opt-in with this PR
I hope that's only a transitional solution -- users should not need to drill into what they have to enable just to get power savings. (But given there will be assertions triggering, this may make sense for a release cycle or two).
And TBH - I'd rather go the route to introduce a wrapper that eliminates
ztimer_now()
calls wherever possible.
But these wrappers will be implemented in terms of ztimer_acquire()
,
right? (Point in case: Implementing the groundhog timer on top of ztimer
will just do the right thing, but I'll need these functions...).
This PR should add a line to the
ztimer_now
; where it now says:Two values can only be compared when the clock has been continuously active between the first and the second reading. A clock is guaranteed to be active from [...]
there should be a paragraph inbetween that says: "The easiest way to guarantee that a clock is continuously active is by keeping it active through @ref ztimer_acquire"; the next paragraph would then say "A clock is also guaranteed ..."
I added a fixup clarifying this.
I took the other route -
ztimer_ondemand
is opt-in with this PRI hope that's only a transitional solution -- users should not need to drill into what they have to enable just to get power savings. (But given there will be assertions triggering, this may make sense for a release cycle or two).
Sure! ztimer_ondemand
will break almost every application in the current RIOT source tree. It's a measure to control the transition.
And TBH - I'd rather go the route to introduce a wrapper that eliminates
ztimer_now()
calls wherever possible. But these wrappers will be implemented in terms ofztimer_acquire()
, right? (Point in case: Implementing the groundhog timer on top of ztimer will just do the right thing, but I'll need these functions...).
Sure, it's on top. I wouldn't add it to the core API. If you're writing RUST wrappers, I should use the core functions ztimer_now()
, ztimer_requires()
, ztimer_release()
, I'd say. I was talking about users like drivers, applications, ... that have to measure time differences.
@jue89 wouldn't it be better to put this into ztimer64 -which cares about absolute time (zclock64 might have been a better name) - ( it also does the timer management to keep underlying timer running); if ztimer64 would be able to stop we would gain the powermanagement back that ztimer64 is blocking.
@jue89 wouldn't it be better to put this into ztimer64 -which cares about absolute time (zclock64 might have been a better name) - ( it also does the timer management to keep underlying timer running); if ztimer64 would be able to stop we would gain the powermanagement back that ztimer64 is blocking.
What do you mean? The ondemand feature?
Currently, ztimer64
isn't compatible with ztimer_ondemand
. Using ztimer64
calls ztimer_set()
for checkpointing. Thus, a ztimer_clock_t
user is present all the time.
What do you mean? The ondemand feature?
yes, better have it there than in ztimer,
Currently,
ztimer64
isn't compatible withztimer_ondemand
. Usingztimer64
callsztimer_set()
for checkpointing. Thus, aztimer_clock_t
user is present all the time.
exactly the checkpointing is keeping a underlying ztimer running and if there is no checkpoint the ztimer_clock is not kept running -> it a good place to count users that need absolut time ( now() ) and if there is no one in need of now() information we can remove the checkpoint
What do you mean? The ondemand feature?
yes, better have it there than in ztimer,
ztimer must have the power management integrated. It's the link between ztimer64
and periph_*
.
The current pm integration of ztimer
sucks. (I think I'm allowed to say that - I'm the author ;-)) It's incompatible with CPUs like the nrf52
.
I doubt that anything that uses ztimer64 will ever even come close to using this kind of power saving. As it needs the timer running for longer than 2**32 ticks, times between these intervals in which it might be switched off likely don't matter.
Murdock results
:heavy_check_mark: PASSED
8d6d6f2bbc980e45daf461f4a2cc16a013789f0b ztimer: add benchmarking tool
Success | Failures | Total | Runtime |
---|---|---|---|
118094 | 0 | 118094 | 02h:30m:50s |
Artifacts
This only reflects a subset of all builds from https://ci-prod.riot-os.org. Please refer to https://ci.riot-os.org for a complete build for now.
And introducing such interfaces without any implementation behind can result in a situation where future implementations doesn't work with that introduced interface.
I don't see how that breakage would be worse that what we have now.
It would allow users to write their applications properly, use the ztimer_on_demand module even if it's a no-op, and trust that whenever RIOT is ready, it'll make the best of the situation. Frankly, I'd even be well OK with taking a performance penalty at timer shutoff while ztimer_on_demand is still opt-in -- after all, until it's default, I'm actively asking for power savings, and by not having a timer running I somewhat indicate indicate already that there's no hurry.
But I'm only a user and bystander on this, so if this subthread is being productive, please don't let me stop it :-)
And introducing such interfaces without any implementation behind can result in a situation where future implementations doesn't work with that introduced interface.
I don't see how that breakage would be worse that what we have now.
We may loose users if we force them to adapt their applications twice.
I'm actually in a hurry with that topic. I don't want to maintain a second fork of RIOT - the first one is already a burden and eats up a lot of time. But of course, this is just my problem - not the problem of the RIOT community.
I cannot stress enough that proper power management is one of the core features people expect of an OS for an IoT. We should focus on getting this upstream and not letting perfect being the enemy of good here.
Please let's not delay this over bike shedding of poor handling a corner case in combination with an RTT that is broken by design, especially since this is an implementation detail from the ztimer point of view.
@maribu since you are talking about broken mcu designs: Why design this around the broken PM of the NRF? (you yourself told me its is the only ARM where the Systick does not stop of when you enter idle making it unusable for certain usecases). On any other mcu we turn the ocillator off and the timer stops (without needing to sync or anything), only the nrf needs the timer to stop in advance why don't you stop the timer when pm_set() is called? This is the point in time where any other mcu stops the timer that are connected to to the oscillators that are are halted for that pm.
Fix NRF stop timer in pm set.
Calling timer_stop really hurts the allready crippled sam0 timer implementation. And my comment is not about timer precision its about all other interrupts that are not able to trigger in that extra 1uSec where the timer is unnessecarily stopped: Any comunication is halted for another usec, a speedy UART may overflow in that time span not even thinking about SPI, gpio irq may need to wait 2uSec.
-just to repeat it is hurting everything else on that mcu. Its not that one does not care about reaction times if they turned the timer off they just do not care about the amout or point in time.
And if block an unblock realy take a usec on that mcu we need to think about its type (uint8 might safe space but writing an uint might be faster (uint8_fast_t ?) or going for something atomic rather then irq dis and enable.
And i am not talking about the color of the shed that on would be whether aquire()/release() is the right API -I think it is very prone to leak and we got the zimer_set where you just tell how_much time you want the timer to run you could even have something that refreshes itself (simulating the default behavior of aquire), that would be less convinient but this is also something you should think about do i realy need a timer running all the time, while with this api one calls aquire and the timer is running.
On the other hand aquire and release are the api that is most complete.