RIOT icon indicating copy to clipboard operation
RIOT copied to clipboard

ztimer: add ztimer_ondemand module for implicit power management

Open jue89 opened this issue 3 years ago • 17 comments

Contribution description

This PR implements the approach proposed in #16327. Basically it extends ztimer to start and stop ztimer_periph_% depending on the count of currently active users. This solves problems like described in #16891.

The ztimer_ondemand feature can be enabled per peripheral driver. To start/stop periph_timer, ztimer_ondemand_timer has to be pulled in. periph_rtt is started/stopped by ztimer_ondemand_rtt and periph_rtc by ztimer_ondemand_rtc. I think you get the idea ;-)

To help ztimer to understand if someone is currently relaying on a certain ztimer_clock_t, ztimer_acquire() and ztimer_release() have been introduced. These methods are required for users of ztimer_now(). The interface works like this:

ztimer_acquire(ZTIMER_USEC);
ztimer_now_t start = ztimer_now(ZTIMER_USEC);
do_some_work();
ztimer_now_t duration = ztimer_now(ZTIMER_USEC) - start;
ztimer_release(ZTIMER_USEC);

(ztimer_set() and ztimer_remove() will call ztimer_acquire() and ztimer_release() internally.)

First of all, I'd love to hear your opinion about the direction I'm heading in.

Testing procedure

I've added unit tests.

~~During the next days I'll come up with some sample code that you may run on your device.~~

A demo showcasing ztimer_ondemand:

$ make -C tests/ztimer_ondemand BOARD=samr30-xpro flash term
(...)
2022-02-04 16:35:46,792 # main(): This is RIOT! (Version: 2022.04-devel-228-g0957b3-feature/ztimer-ondemand)
2022-02-04 16:35:46,792 # 
2022-02-04 16:35:46,796 # Loop gpio_set(LED_PIN) 10000 times without any delay
2022-02-04 16:35:46,800 # .
2022-02-04 16:35:46,801 # 
2022-02-04 16:35:46,808 # Loop gpio_set(LED_PIN) 10000 times with ztimer_sleep(1000) with clock start/stop.
2022-02-04 16:35:57,167 # .
2022-02-04 16:35:57,167 # 
2022-02-04 16:35:57,174 # Loop gpio_set(LED_PIN) 10000 times with ztimer_sleep(1000) without clock start/stop.
2022-02-04 16:36:07,571 # .
2022-02-04 16:36:07,571 # 
2022-02-04 16:36:07,577 # Loop gpio_set(LED_PIN) 10000 times with ztimer_periodic_wakeup(1000).
2022-02-04 16:36:17,583 # .

Issues/PRs references

#16327 #16891

jue89 avatar Feb 02 '22 21:02 jue89

Sneak peak: I'm completely aware of ztimer_acquire() and ztimer_release() introducing more complexity. But this interface could be just a foundation for something like this:

ztimer_watch
typedef struct {
    ztimer_now_t checkpoint; /**< last seen timer now value */
    ztimer_now_t value;      /**< watch time at checkpoint */
    bool running;            /**< flag showing if the watch is running */
} ztimer_watch_t;

void ztimer_watch_start(ztimer_clock_t *clock, ztimer_watch_t *watch)
{
    if (watch->running) {
        return;
    }

    ztimer_acquire(clock);
    watch->checkpoint = ztimer_now(clock);
    running = true;
}

void ztimer_watch_set(ztimer_clock_t *clock, ztimer_watch_t *watch, ztimer_now_t val)
{
    if (watch->running) {
        watch->checkpoint = ztimer_now(clock);
    }

    watch->value = val;
}

ztimer_now_t ztimer_watch_read(ztimer_clock_t *clock, ztimer_watch_t *watch)
{
    if (watch->running) {
        ztimer_now_t now = ztimer_now(clock);
        return now - watch->checkpoint + watch->value;
    }
    else {
        return watch->value;
    }
}

void ztimer_watch_stop(ztimer_clock_t *clock, ztimer_watch_t *watch)
{
    if (!watch->running) {
        return;
    }

    watch->value = ztimer_watch_read(clock, watch);
    watch->running = false;
    ztimer_release(clock);
}

We should make every user of ztimer_now() migrate to an interface as shown above.

Or we could introduce a global ztimer_watch_t uptime that is backed by ZTIMER_MSEC and started during auto_init. For time measurement, one may call ztimer_watch_read(ZTIMER_MSEC, &uptime).

jue89 avatar Feb 02 '22 22:02 jue89

Is this really an API change? The use of ztimer_now has a pending PR #17298 documenting previously silent assumptions made; I think that's the real API chan^Wfix here, whereas this PR just makes ztimer use its potential.

On acquire and release, I wonder if they are actually relevant (and warrant their counter). Any user just comparing the ztimer_now outputs will find they have a hard time knowing whether it wrapped, whereas if someone acquiring the ztimer just placed a maximal timeout and then checked for it still being set after obtaining a later now value would be sure that it did not wrap (or, it may have wrapped but the wrapping difference never overflew). Their timer callback could be a no-op (the task will conclude, and the measured time will be "exceeding the scale"), or do some cancellation ("don't bother, it has already taken too long").

(The uptime counter would need special handling then, though.)

chrysn avatar Feb 03 '22 11:02 chrysn

Thank you for your response!

Yes, I agree that setting up a ztimer_t with a very long offset can also be a solution to make the pm_layered approach work. But it feels more like a workaround than a proper solution. TBH, I'm regretting that I've introduced this approach in the first place. It seemed to be easy and elegant one but I've shot myself into the foot several times :D

You may have noticed: This PR disables all these mechanics, once you pull in ztimer_ondemand (I'm still not sure if this name is verbose enough ... but I wasn't able to come up with a better one, yet.) Once we found a working solution that doesn't require any workarounds, I'd say we should introduce pm_layered into all periph_timer implementation and then deprecate the mechanics around direct interaction between ztimer and pm_layered.

After having made my first baby steps with this approach, I've already noticed that this implicit power management fails fast, which is a good thing! If a ztimer_acquire() is missing, you'll notice pretty fast because the periph_timer is disabled even if the MCU is still on fire and hasn't entered any power-saving modes, yet. I'd say those bugs that show themselves only by accident are really annoying type of bugs. (I.e. you changed timing of your application and, thus, different pm modes are entered at different times and see stuff like networking starting to behave weirdly. This will happen if ztimer_now() is used to time stuff - and it is!)

jue89 avatar Feb 03 '22 11:02 jue89

An addition to the wrapped ztimer_now() values - I just understood your point after reading your response a second time: If there must the guarantee that it hasn't wrapped during time measurement, we would have to use ztimer_now64. That should create enough room that timer_now() never wraps.

Your approach clearly detects wraps: But what shall we do if we detected wrapping timers? Extend like ztimer_now64 does?

jue89 avatar Feb 03 '22 15:02 jue89

Your approach clearly detects wraps: But what shall we do if we detected wrapping timers? Extend like ztimer_now64 does?

Something like a watch, when detecting a wrapped timer, should IMO just report "longer than I can represent". A user who asked for a 32bit value in some units is not prepared for any larger values anyway.

chrysn avatar Feb 03 '22 18:02 chrysn

Some additional work: I've introduced ztimer_ondemand_strict with 04607ed. It will make sure ztimer_now() is only called if the timer is active. Of course, this pseudomodule brakes almost every module using ztimer. But it already showed me that I used ztimer_now() in my patch before turning on the clock! This has been fixed with ba77e7b.

Furthermore, I wrote a little example: tests/ztimer_ondemand for you to run on your boards :)

jue89 avatar Feb 04 '22 15:02 jue89

Would anybody mind if I squash commits?

I think I have finished for the first round and implemented everything necessary for the ztimer_ondemand driver to work properly.

jue89 avatar Feb 04 '22 18:02 jue89

I wouldn't mind, and I think that kfessel's comment is high-level enough that a squash wouldn't disturb it.

chrysn avatar Feb 04 '22 19:02 chrysn

And squashed. I'd say this PR is ready to be reviewed.

I've fixed typos and added some breaks to long lines. Static tests are green now.

jue89 avatar Feb 07 '22 12:02 jue89

This is running tests/bench_xtimer with USEMODULE=ztimer_xtimer_compat on nrf52840dk, first on master, then this PR (rebased on master):

main(): This is RIOT! (Version: buildtest)
xtimer benchmark application.

                     set() one     6501 / 1000 = 6
                  remove() one      943 / 1000 = 0
          set() + remove() one    10690 / 1000 = 10
  set() many increasing target   112317 / 1000 = 112
               re-set()  first     6564 / 1000 = 6
               re-set() middle   169686 / 1000 = 169
               re-set()   last   331090 / 1000 = 331
       remove() + set()  first    11479 / 1000 = 11
       remove() + set() middle   175118 / 1000 = 175
       remove() + set()   last   336466 / 1000 = 336
      remove() many decreasing    60732 / 1000 = 60
                  xtimer_now()     1189 / 1000 = 1
              sizeof(xtimer_t)    16000 / 1000 = 16
done.
Help: Press s to start test, r to print it is ready
START
main(): This is RIOT! (Version: buildtest)
xtimer benchmark application.

                     set() one     6501 / 1000 = 6
                  remove() one      959 / 1000 = 0
          set() + remove() one    10752 / 1000 = 10
  set() many increasing target   112286 / 1000 = 112
               re-set()  first     6564 / 1000 = 6
               re-set() middle   193808 / 1000 = 193
               re-set()   last   343817 / 1000 = 343
       remove() + set()  first    11417 / 1000 = 11
       remove() + set() middle   175090 / 1000 = 175
       remove() + set()   last   336404 / 1000 = 336
      remove() many decreasing    60732 / 1000 = 60
                  xtimer_now()     1189 / 1000 = 1
              sizeof(xtimer_t)    16000 / 1000 = 16
done.

~~There seems to be some minor impact on the re-set middle and re-set last. Accteptable I'd say.~~

edit I didn't enable ztimer_ondemand, so it probably wasn't used. Makes me wonder where the difference comes from. :/

kaspar030 avatar Feb 15 '22 12:02 kaspar030

I didn't enable ztimer_ondemand, so it probably wasn't used. Makes me wonder where the difference comes from. :/

Yes, it's opt-in right now. It will break things if turned on by default as RIOT modules are using ztimer_now(). I'm planning one ztimer_acquire(ZTIMER_MSEC) during ztimer_init until every user is aware of the acquire/release interface.

I guess, I'm expecting too much from the compiler in terms of optimizations?! I'm recording whether the timer has been removed from the clock list. That bool is never used if ztimer_ondemand isn't active.

https://github.com/RIOT-OS/RIOT/blob/309cbfefe8ed95e489138902a84a3f1826315eea/sys/ztimer/core.c#L156-L159 (...snip...) https://github.com/RIOT-OS/RIOT/blob/309cbfefe8ed95e489138902a84a3f1826315eea/sys/ztimer/core.c#L184-L192

But thank you for this hint! I'll keep on eye on this while polishing ztimer_ondemand.

jue89 avatar Feb 15 '22 12:02 jue89

I like this, too!

kaspar030 avatar Feb 15 '22 12:02 kaspar030

Thank you all for your valuable feedback so far :)

As @chrysn pointed out, with merging #17614 this isn't a change but an addition to the ztimer API.

jue89 avatar Feb 15 '22 13:02 jue89

I added Kconfig-related modelling of this module and rebased on latest master.

jue89 avatar Mar 26 '22 17:03 jue89

I learned stuff about Kconfig - so I think the last two commits improved usability for menuconfig users and TEST_KCONFIG=1.

I'd say this is finally ready for review :-)

jue89 avatar Mar 30 '22 13:03 jue89

Is this still current w/rt the discussions from the summit?

If so, are the previous reviewers happy?

If not, can the ztimer_acquire / release functions be split out so that applications can be fixed to use them (to comply more easily with the documented requirements for using ztimer_now), even if they are no-ops before this (or its later form) is merged?

chrysn avatar Sep 19 '22 13:09 chrysn

This PR should add a line to the ztimer_now; where it now says:

Two values can only be compared when the clock has been continuously active between the first and the second reading.

A clock is guaranteed to be active from [...]

there should be a paragraph inbetween that says: "The easiest way to guarantee that a clock is continuously active is by keeping it active through @ref ztimer_acquire"; the next paragraph would then say "A clock is also guaranteed ..."

chrysn avatar Sep 19 '22 13:09 chrysn

Is this still current w/rt the discussions from the summit?

Yes! We are already using this patch in semi-production :-D

If so, are the previous reviewers happy?

If not, can the ztimer_acquire / release functions be split out so that applications can be fixed to use them (to comply more easily with the documented requirements for using ztimer_now), even if they are no-ops before this (or its later form) is merged?

I took the other route - ztimer_ondemand is opt-in with this PR. Users have to enable it actively. If they do, the asserts will point on modules using ztimer_now() without requiring the ztimer clock.

And TBH - I'd rather go the route to introduce a wrapper that eliminates ztimer_now() calls wherever possible.

jue89 avatar Sep 22 '22 11:09 jue89

I took the other route - ztimer_ondemand is opt-in with this PR

I hope that's only a transitional solution -- users should not need to drill into what they have to enable just to get power savings. (But given there will be assertions triggering, this may make sense for a release cycle or two).

And TBH - I'd rather go the route to introduce a wrapper that eliminates ztimer_now() calls wherever possible.

But these wrappers will be implemented in terms of ztimer_acquire(), right? (Point in case: Implementing the groundhog timer on top of ztimer will just do the right thing, but I'll need these functions...).

chrysn avatar Sep 22 '22 11:09 chrysn

This PR should add a line to the ztimer_now; where it now says:

Two values can only be compared when the clock has been continuously active between the first and the second reading. A clock is guaranteed to be active from [...]

there should be a paragraph inbetween that says: "The easiest way to guarantee that a clock is continuously active is by keeping it active through @ref ztimer_acquire"; the next paragraph would then say "A clock is also guaranteed ..."

I added a fixup clarifying this.

I took the other route - ztimer_ondemand is opt-in with this PR

I hope that's only a transitional solution -- users should not need to drill into what they have to enable just to get power savings. (But given there will be assertions triggering, this may make sense for a release cycle or two).

Sure! ztimer_ondemand will break almost every application in the current RIOT source tree. It's a measure to control the transition.

And TBH - I'd rather go the route to introduce a wrapper that eliminates ztimer_now() calls wherever possible. But these wrappers will be implemented in terms of ztimer_acquire(), right? (Point in case: Implementing the groundhog timer on top of ztimer will just do the right thing, but I'll need these functions...).

Sure, it's on top. I wouldn't add it to the core API. If you're writing RUST wrappers, I should use the core functions ztimer_now(), ztimer_requires(), ztimer_release(), I'd say. I was talking about users like drivers, applications, ... that have to measure time differences.

jue89 avatar Sep 22 '22 11:09 jue89

@jue89 wouldn't it be better to put this into ztimer64 -which cares about absolute time (zclock64 might have been a better name) - ( it also does the timer management to keep underlying timer running); if ztimer64 would be able to stop we would gain the powermanagement back that ztimer64 is blocking.

kfessel avatar Sep 29 '22 13:09 kfessel

@jue89 wouldn't it be better to put this into ztimer64 -which cares about absolute time (zclock64 might have been a better name) - ( it also does the timer management to keep underlying timer running); if ztimer64 would be able to stop we would gain the powermanagement back that ztimer64 is blocking.

What do you mean? The ondemand feature?

Currently, ztimer64 isn't compatible with ztimer_ondemand. Using ztimer64 calls ztimer_set() for checkpointing. Thus, a ztimer_clock_t user is present all the time.

jue89 avatar Sep 29 '22 13:09 jue89

What do you mean? The ondemand feature?

yes, better have it there than in ztimer,

Currently, ztimer64 isn't compatible with ztimer_ondemand. Using ztimer64 calls ztimer_set() for checkpointing. Thus, a ztimer_clock_t user is present all the time.

exactly the checkpointing is keeping a underlying ztimer running and if there is no checkpoint the ztimer_clock is not kept running -> it a good place to count users that need absolut time ( now() ) and if there is no one in need of now() information we can remove the checkpoint

kfessel avatar Sep 29 '22 14:09 kfessel

What do you mean? The ondemand feature?

yes, better have it there than in ztimer,

ztimer must have the power management integrated. It's the link between ztimer64 and periph_*.

The current pm integration of ztimer sucks. (I think I'm allowed to say that - I'm the author ;-)) It's incompatible with CPUs like the nrf52.

jue89 avatar Sep 29 '22 14:09 jue89

I doubt that anything that uses ztimer64 will ever even come close to using this kind of power saving. As it needs the timer running for longer than 2**32 ticks, times between these intervals in which it might be switched off likely don't matter.

chrysn avatar Oct 11 '22 09:10 chrysn

Murdock results

:heavy_check_mark: PASSED

8d6d6f2bbc980e45daf461f4a2cc16a013789f0b ztimer: add benchmarking tool

Success Failures Total Runtime
118094 0 118094 02h:30m:50s

Artifacts

This only reflects a subset of all builds from https://ci-prod.riot-os.org. Please refer to https://ci.riot-os.org for a complete build for now.

riot-ci avatar Oct 27 '22 19:10 riot-ci

And introducing such interfaces without any implementation behind can result in a situation where future implementations doesn't work with that introduced interface.

I don't see how that breakage would be worse that what we have now.

It would allow users to write their applications properly, use the ztimer_on_demand module even if it's a no-op, and trust that whenever RIOT is ready, it'll make the best of the situation. Frankly, I'd even be well OK with taking a performance penalty at timer shutoff while ztimer_on_demand is still opt-in -- after all, until it's default, I'm actively asking for power savings, and by not having a timer running I somewhat indicate indicate already that there's no hurry.

But I'm only a user and bystander on this, so if this subthread is being productive, please don't let me stop it :-)

chrysn avatar Nov 01 '22 16:11 chrysn

And introducing such interfaces without any implementation behind can result in a situation where future implementations doesn't work with that introduced interface.

I don't see how that breakage would be worse that what we have now.

We may loose users if we force them to adapt their applications twice.

I'm actually in a hurry with that topic. I don't want to maintain a second fork of RIOT - the first one is already a burden and eats up a lot of time. But of course, this is just my problem - not the problem of the RIOT community.

jue89 avatar Nov 01 '22 16:11 jue89

I cannot stress enough that proper power management is one of the core features people expect of an OS for an IoT. We should focus on getting this upstream and not letting perfect being the enemy of good here.

Please let's not delay this over bike shedding of poor handling a corner case in combination with an RTT that is broken by design, especially since this is an implementation detail from the ztimer point of view.

maribu avatar Nov 01 '22 18:11 maribu

@maribu since you are talking about broken mcu designs: Why design this around the broken PM of the NRF? (you yourself told me its is the only ARM where the Systick does not stop of when you enter idle making it unusable for certain usecases). On any other mcu we turn the ocillator off and the timer stops (without needing to sync or anything), only the nrf needs the timer to stop in advance why don't you stop the timer when pm_set() is called? This is the point in time where any other mcu stops the timer that are connected to to the oscillators that are are halted for that pm.

Fix NRF stop timer in pm set.

Calling timer_stop really hurts the allready crippled sam0 timer implementation. And my comment is not about timer precision its about all other interrupts that are not able to trigger in that extra 1uSec where the timer is unnessecarily stopped: Any comunication is halted for another usec, a speedy UART may overflow in that time span not even thinking about SPI, gpio irq may need to wait 2uSec.

-just to repeat it is hurting everything else on that mcu. Its not that one does not care about reaction times if they turned the timer off they just do not care about the amout or point in time.

And if block an unblock realy take a usec on that mcu we need to think about its type (uint8 might safe space but writing an uint might be faster (uint8_fast_t ?) or going for something atomic rather then irq dis and enable.

And i am not talking about the color of the shed that on would be whether aquire()/release() is the right API -I think it is very prone to leak and we got the zimer_set where you just tell how_much time you want the timer to run you could even have something that refreshes itself (simulating the default behavior of aquire), that would be less convinient but this is also something you should think about do i realy need a timer running all the time, while with this api one calls aquire and the timer is running.

On the other hand aquire and release are the api that is most complete.

kfessel avatar Nov 02 '22 09:11 kfessel