RIOT icon indicating copy to clipboard operation
RIOT copied to clipboard

sys/wait: add module for backend agnostic waits

Open maribu opened this issue 2 years ago • 15 comments

Contribution description

This adds to sets of delay functions. The first kind consists of

  • wait_at_least_us(): delay execution in units of microseconds
  • wait_at_least_ms(): delay execution in units of milliseconds
  • wait_at_least_s(): delay execution in units of seconds

These will use ztimer as backend, if used anyways. Otherwise a trivial CPU counting loop is used.

The second kind consists of

  • wait_us_spinning(): busy-wait for a given number of microseconds

The benefit of this compared to ztimer_spin() is that for AVR a platform specific implementation is used that is more accurate. It allows writing drivers like the DHT11/DHT2x driver without platform specific code. Other low end platforms can follow suit and also add more accurate waiting code.

Testing procedure

A test application has been added with a python script doing automatic testing. In addition, that test app performs short pulses on the LED that should be visible via a scope / logic analyzer. Finally, it will estimate the CPU cycles required per loop iteration that can be used to fine-tune the loop.

Issues/PRs references

Deprecates: https://github.com/RIOT-OS/RIOT/pull/17330

Useful for: https://github.com/RIOT-OS/RIOT/pull/19718

maribu avatar Jun 08 '23 12:06 maribu

Again, I don't think that delay_us() should fall back to msec.

kaspar030 avatar Jun 08 '23 13:06 kaspar030

I dropped delay_us()

maribu avatar Jun 08 '23 15:06 maribu

I dropped delay_us()

I think that would be useful. Can't it just use ZTIMER_USEC?

"Those are trivial wrappers over ztimer_sleep() that just use the most appropriate clock that is used anyway. It gives up control over the most suitable trade-off between resolution, accuracy (clock drift) and power consumption for some flexibility / ROM size."

Repeating some arguments from #17330 ...

Didn't If they are "trivial" wrappers, make them use the corresponding clock. delay_us(N) -> ztimer_sleep(ZTIMER_USEC, N) etc.

Making this depending on which module is already used is awful, as adding modules to an already tested part of an application might add more timer modules, making the behavior change. That is especially bad for delay_s(), which might more often be used in human scale small numbers.

Like, "delay_s(2)" is +0..2seconds when using ZTIMER_SEC, +0..2 milliseconds when using ZTIMER_MSEC.

So for the timer-based delay functions, why not just:

delay_us(N) -> ztimer_sleep(ZTIMER_USEC, N) delay_ms(N) -> ztimer_sleep(ZTIMER_MSEC, N) delay_s(N) -> _ztimer_sleep_scale_up(ZTIMER_MSEC, secs, MS_PER_SEC) (as ZTIMER_SEC is usually implemented using ZTIMER_MSEC anyways)

Alternatively, document clearly that the delays are +0..2, and where that matters, the higher resolution delay should be used.

But please, don't make potentially hand-tuned timings change just because adding USEMODULE += some_other_module pulls in ztimer_msec.

kaspar030 avatar Jun 08 '23 19:06 kaspar030

 * For use cases such as waiting 1 ms for a device to reset during start up it
 * matters little if the delay ends up being 1 ms, 2 ms or even 10 ms. 

What's wrong with just using ZTIMER_MSEC for that?

kaspar030 avatar Jun 09 '23 08:06 kaspar030

What's wrong with just using ZTIMER_MSEC for that?

That may increase ROM size quite significantly by pulling in periph_rtt. That's not something that a driver developer wants to force upon all users, when all the driver cares about is that some method of delay is there.

E.g. see

https://github.com/RIOT-OS/RIOT/blob/35d3896d47daf2af30e22b0d3a3880f7932e2a46/cpu/gd32v/periph/i2c.c#L627-L639

https://github.com/RIOT-OS/RIOT/blob/35d3896d47daf2af30e22b0d3a3880f7932e2a46/cpu/sam0_common/sam0_sdhc/sdhc.c#L119-L126

maribu avatar Jun 09 '23 08:06 maribu

Summary of the out of band discussion:

  • name it wait_at_least()
    • delay_ms() is known as accurate wait in the Arduino universe, but the provided API is not accurate --> may deceive users into believing the API is accurate
    • sleep_ms() could be read as "low power friendly", because low power modes often have "sleep" or sleep related phrases in it (e.g. deep sleep, hibernate, ...)
    • wait doesn't ring any low power or high accuracy bells and the at_least() postfix makes it extra obvious
  • the accurate version is called wait_us_spinning(). (Note the missing _at_least() here. While still it will be at least, the overshoot here is small and bounded, when used correctly. The at_least() variants instead have a laissez-faire attitude to overshoots.)
  • falling back to ZTIMER_USEC is not acceptable
  • falling back to CPU cycle counting is

I'll update this PR to match that

maribu avatar Jun 09 '23 10:06 maribu

falling back to ZTIMER_USEC is not acceptable

Why?

benpicco avatar Jun 09 '23 10:06 benpicco

falling back to ZTIMER_USEC is not acceptable

Why?

To be honest, I would be in favor of falling back to ZTIMER_USEC. But I'd rather have this in without falling back to ZTIMER_USEC then having PR going stale. (You could do a follow up PR, though :wink:)

maribu avatar Jun 09 '23 11:06 maribu

Murdock results

:x: FAILED

225c5ddadae2cc186f558c32f5bd9428ff4cddb8 examples/blinky: make use of sys/wait

Success Failures Total Runtime
5551 0 6947 08m:18s

Artifacts

riot-ci avatar Jun 09 '23 18:06 riot-ci

I still don't see any argument against using ZTIMER_USEC, where was that brought up? Maybe we should just add the spinning fallback to xtimer_{u,m}sleep() and use those through the compat API wrapper where we retcon xtimer to mean any timer.

benpicco avatar Jun 12 '23 09:06 benpicco

@maribu Maybe splilt out the more accurate arch specific spinning?

kaspar030 avatar Jun 12 '23 10:06 kaspar030

@maribu Maybe splilt out the more accurate arch specific spinning?

No need anymore to put that on a fast track. The DHT driver rewrite uses a different approach now that no longer requires ztimer_spin() anyway and also works on AVR as well.

And since there is no need to rush anyway, I'll re-add falling back to a different ztimer backend.

maribu avatar Jun 12 '23 10:06 maribu

I still don't see any argument against using ZTIMER_USEC, where was that brought up? Maybe we should just add the spinning fallback to xtimer_{u,m}sleep() and use those through the compat API wrapper where we retcon _x_timer to mean any timer.

So my main gripe with this PR is the vagueness of the API.

Our "delay the current process for a while" API is ztimer. It allows arbitrary backends to be chosen. There are a couple of backends that the build system tries hard to find an implementation for (usec, msec, sec). There are multiple options for each backend (msec -> rtt, msec->usec, ...). ztimer is quantized on the clock ticks, and provides at least semantics with +0-2 guarantees. It's unsuitable for very short sleeps (<5us - <100us, depending on hardware) due to being interrupt based (meaning a context switch is always involved).

So ztimer provides the functionality that this API wraps. For "wait at least N ms (N >2)", just use ztimer_msec() and be done. If the application is interested in low power, the user will have configured ztimer_msec to use a low-power clock. For "wait at least N us (with N > 5), just use ztimer_usec and be done with it. Won't be low power.

"but now there's a ztimer dependency". Well. Let's make sure to get one ztimer clock that suits most needs (like, ztimer_msec), and use that as much as possible. Chances are, any sufficiently large application will use ztimer_msec somewhere. Any sufficiently minimal application won't feel the ram/rom cost.

So maybe typing ztimer_sleep(ZTIMER_MSEC, N) gets old, so having delay/wait/sleep_ms(N) *mapping to ztimer_sleep(ZTIMER_MSEC, N) does make some sense.

Why not chose any available backend? Because the call site suddenly doesn't clearly state what the function call is doing. We're basically introducing ztimer_sleep_or_maybe_spin_maybe_lowpower_with_maybe_+0..2_us_or_maybe_+0..2_ms_accuracy(). The API has uint32_t value range, so a call to wait_ms() might spin for 49 days. Or just set the thread to sleep and by extension the whole MCU. Depending on the dependency tree.

What way of "waiting" is actually used is already quite muddy when using ZTIMER_M/USEC, as there's already complex heuristics and configuration at play. This layer of abstraction adds even more muddyness. The tree of used waiting mechanisms (and watering of guarantees) grows to a forest.

For the use case most stated ("driver needs to wait a bit for the device to be ready, a bit longer doesn't matter"), if spinning is acceptable, that must be assumed to actually be used in any case, because if spinning would not be acceptable, use of wait_ms() in its current form is a bug. Because depending on the dependencies, spinning might happen. So we must be able to replace all ẁait_ms() with spin_ms() without breakage (or unacceptable behavior), because that happens implicitly if spin_ms() is one of the fallback methods. But if spin_ms() is "acceptable behavior", why not just use that? -> The doxygen states "if the application developer enabled a low power clock anyway, this can be taken as a hint that low power sleeps are preferable". That is already being dealt with at ztimer_msec level. The same argumentation goes with low power or accuracy. the worst incarnation of all possible outcomes (of what wait_ms...() actually calls) must be acceptable everywhere that function is used, and must be tested. (This is a high level version of "possible UB makes a case impossible so let's drop this code").

Is ztimer too fat still (implementation wise)? Is that a sentiment, or an actual issue? Because RAM/ROM use would be the only reason for not just using ztimer sleeps.

My suggestion here is:

  • provide delay_ms/us(). Map directly to the corresponding ztimer clock. Document that it quantizes (delay_ms/us(N) starts in the middle of a clock tick) and is generally +0..2 ticks.
  • provide "spin_us(uint8_t us)" for bit-banging needs (using this PR's per-CPU implementations)
  • replace the ztimer sleeps system wide with the corresponding delay calls
  • create a nice overview page that answers the question on which sleep function to use with delay_ms(N).
  • overall, simplify the amount of user-facing APIs in place, and the amount of dependency tree dependent behaviors
  • accept doing USEMODULE += delay_ms() even for simple drivers that need it
  • accept doing spin_us() in simple driver's boot up one time initialization

Otherwise, prove that using ztimer direct wrapping (with no fallbacks) is causing too many RAM/ROM issues in places where the delay_ms() couldn't be replaced with spin_us() without causing otherwise unacceptable behavior.

kaspar030 avatar Jun 15 '23 11:06 kaspar030

Because RAM/ROM use would be the only reason for not just using ztimer sleeps.

We had some productive discussion at today's Friday meeting. Karl brought up two more aspects:

  • while ztimer has "at least" semantics, it actually also advertizes "+-1 tick systemic inaccuracy". Meaning, to get actual "at least", the user needs to add one to the desired period.

  • also, the "low resource AVR" use case was discussed, where the MCU might not have the means to pull in any ztimer This could be mitigated by a simplified alternative spinning ztimer implementation:

enum {
  ZTIMER_USEC,
  ZTIMER_MSEC,
  ZTIMER_SEC
};

ztimer_sleep(clock, ticks) {
  switch clock {
     ZTIMER_USEC: {
       spin_us(ticks)}
       break;
       ZTIMER_MSEC: {
         for (unsigned i = 0; i < 1000, i++) {
           ztimer_sleep(ZTIMER_USEC, ticks);
           }
           break;
    ...
    }
}

(or sth smarter)

kaspar030 avatar Jun 16 '23 08:06 kaspar030

A spinning implementation in ztimer_sleep isn't helpfull for the read ablity of ztimer_sleep, which should just wrap a ztimer_set and required data (a ztimer_t and a mutex) to make it work (we should a void having any alternative there)

that is the reason to put it somewhere else, keep ztimer_sleep simple

not having an implicit spin is one of the big advancements of ztimer over xtimer (the other one is not having a absolute time interface both these features keep ztimer simple and able to perform ist power saving)

kfessel avatar Jun 16 '23 09:06 kfessel