sys/stdio: Precision on semantics, and fixes
Contribution description
stdio_write has been used inconsistently with respect to whether or not incomplete writes were returned.
This
- describes in more detail how stdio_write behaves, clarifying corner cases
- introduces a helper stdio_write_all for the common case that some code is not prepared to handle incomplete writes
- applies that helper in a few places (and documents in others why it is not needed)
- makes the dispatch version retry (because that way it works in the presence of backends that do partial writes)
It does not yet attempt to verify that all stdio implementations behave as desired.
Testing procedure
Most of this is documentation updates and API usage correctness where I wouldn't know how to produce any different behavior. I'd leave it at that and set "CI: run tests" unless reviewers insist (and maybe have concrete suggestions).
Issues/PRs references
This came out of discussions around https://github.com/RIOT-OS/RIOT/pull/19738, and builds on it.
There was earlier discussion on matrix.
Checking the existing implementations:
- Didn't check for any implementation yet how it behaves in interrupts (but it's a "should"...).
- stdio_rtt, when the buffer is full, will return 0. That has not been allowed before either -- now there is the concrete suggestion that it should block in this case (but the non-blocking stdio_rtt may need overhauling anyway, because discussions indicated that it should be always blocking, unless build-time switched to a mode when it always ever does best effort)
- similar with the nimble version (if the buffer is full, it will just busyloop when someone tries to write everything)
- usbus and tinyusb cdc_acm_stdio could do away with the loop (because the caller will do that)
It does not yet attempt to verify that all stdio implementations behave as desired.
Should we maybe add a warning to the docs, until the stdio implementations have been confirmed to adhere to the specification?
Rebased to address conflicts (now that the base https://github.com/RIOT-OS/RIOT/pull/19738 has been merged).
Should we maybe add a warning to the docs, until the stdio implementations have been confirmed to adhere to the specification?
I think that's adequately covered by the "should" all over the place. <grumpy reason="no lunch yet">If we started placing warnings on all RIOT APIs where there are implementations that subtly misbehave, we'll have very red documentation with very little gained, I'd rather spend the time fixing impls.</grumpy>
Murdock results
:x: FAILED
0bcba79f1b10fe2ad2cd959d508ab842d2dce888 fixup! treewide: Replace uses of stdio_write with stdio_write_all
| Success | Failures | Total | Runtime |
|---|---|---|---|
| 224 | 0 | 9025 | 01m:04s |
Artifacts
I think this PR is a bit odd.
The root problem is that stdio_write() returns anything at all.
If this was solved by calling stdio_write() in a loop, the stdio implementation could have done so already (before there was the high level / low level API split).
There are either stdio implementations that will always write everything (e.g. stdio_uart) or those that fill a buffer and expect another thread to deliver it. In the latter case, a simple loop will do no good if you don't yield control to the other thread - or we keep it the way it is now and accept that with some exotic backends under rare circumstances some stdio can be lost if large amounts of text are printed at once.
To me this would be preferable to introducing weird edge cases that might lead to a hang.
The reason the simple loop does work is that stdio_write has never been allowed to return 0 -- at latest when a driver would want to return 0, it'd need to block, and would have needed to block without this PR just as well.
I stand corrected: It was allowed to return 0 from the API description, but that case was never actionable. (Like, what is the caller supposed to do?). Not being allowed to return 0 (except when stdiout was closed) is just what one would expect the API to do coming from a POSIX mindset.
I see two ways towards an API that can be used:
- Do this.
- Declare that stdio_write will always write the full string, block if needed, unless used from an interrupt.
- Forbid writing from interrupts (in which case the above become similar enough that the distinction barely matters).
As it is now, we're having callers to stdout writing all over the place that ignore error conditions, and may arbitrarily lose data.
stdio in RIOT is just a debug tool. It should be reasonably reliable, but still kept as simple as possible. I don't think we need anything on top of that to prevent 'data loss'.
It's not just debug output -- that's what the debug.h is.
stdio is also what every single new user comes across when trying out RIOT the first time (as used in the shell), likely part of their first programs, and part of the data extraction of every RIOT based publication I've been involved with so far (with weird bugs showing up there because awk scripts processing the data tripped over gaps).
Having an interface there that changes its behavior depending on the backends, has no clear guidance on how to use it, and is not used consistently throughout the code base, does not make a good first impression, and causes pain down the road as well.
Any chance to move this forward for the upcoming release?
In the current form, this PR will cause regressions when stdio is used from ISR.