fmt: unify and align wording regarding characters, digits, and bytes

Open miri64 opened this issue 3 years ago • 3 comments

Contribution description

The current wording of the fmt module may be a bit confusing (see https://github.com/RIOT-OS/RIOT/pull/18310). This PR aims to fix that.

Testing procedure

Read.

Issues/PRs references

Alternative to https://github.com/RIOT-OS/RIOT/pull/18310

miri64 avatar Jul 18 '22 08:07 miri64

Can we postpone this until I'm back from holidays? No time to argue now :) I find using "characters" utterly confusing. We're in total "one byte has eight bits" land, and so, fmt writes bytes to buffers. (It may print characters.)

kaspar030 avatar Jul 18 '22 08:07 kaspar030

I find using "characters" utterly confusing. We're in total "one byte has eight bits" land, and so, fmt writes bytes to buffers.

Mh... all the out parameters whose wording I changed to "characters" are of type char *. Typically, in C this is a string (of characters), not a buffer (of bytes), so I do not understand your argument.

Can we postpone this until I'm back from holidays? No time to argue now :)

Sure!

miri64 avatar Jul 18 '22 08:07 miri64

Huh, what's confusing about using characters for string output? I wouldn't have expected this to be controversial.

benpicco avatar Jul 18 '22 09:07 benpicco

Huh, what's confusing about using characters for string output?

Because characters nowadays often mean multi-byte characters. 'Į' is a character. When talking about how much data is written to a byte buffer, counting "bytes" is just more precise.
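To make the multi-byte point concrete, here is a minimal sketch, assuming the string is UTF-8 encoded. The helper name `utf8_codepoints` is hypothetical and not part of fmt; it only shows that one "character" such as 'Į' (U+012E) takes two `char` elements in a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of RIOT's fmt module): counts UTF-8
 * code points by skipping continuation bytes, which have the bit
 * pattern 10xxxxxx. Assumes `s` is valid, NUL-terminated UTF-8. */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}
```

With this, `strlen("\xC4\xAE")` is 2 (the two bytes encoding 'Į'), while `utf8_codepoints("\xC4\xAE")` is 1.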

kaspar030 avatar Sep 01 '22 16:09 kaspar030

Because characters nowadays often mean multi-byte characters. 'Į' is a character. When talking about how much data is written to a byte buffer, counting "bytes" is just more precise.

But that's not how these functions work. They take and produce char, not the 8-bit-wide units commonly called bytes (uint8_t in C). The width of char is platform dependent (though it is commonly understood to be 8 bits wide and to hold ASCII characters). What you are talking about are UTF-8 characters, but those are a completely different type in C. Lastly, since this is about converting between bytes and their string representation, sticking to "bytes" for both input and output is confusing.
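The bytes-in, characters-out distinction can be sketched as follows. `bytes_to_hex` is a hypothetical stand-in written for illustration, not fmt's actual function; it assumes the caller provides room for 2 * n characters and does not NUL-terminate the output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in (not RIOT's fmt API): converts `n` input bytes
 * into 2 * n hex digit characters. The input is uint8_t (bytes), the
 * output is char (characters). Returns the number of characters
 * written; no NUL terminator is appended. */
static size_t bytes_to_hex(char *out, const uint8_t *in, size_t n)
{
    static const char digits[] = "0123456789abcdef";

    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = digits[in[i] >> 4];   /* high nibble */
        out[2 * i + 1] = digits[in[i] & 0x0f]; /* low nibble */
    }
    return 2 * n;
}
```

Two input bytes `{ 0xde, 0xad }` yield the four characters `dead`, so a doc comment saying "bytes" for both sides would leave it unclear which count the return value refers to.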

miri64 avatar Sep 01 '22 17:09 miri64

ok

kaspar030 avatar Sep 01 '22 19:09 kaspar030