[Editorial] Truncation description inaccurate

Open aphillips opened this issue 4 years ago • 6 comments

6.4.2. Language and Direction Encoding https://www.w3.org/TR/webauthn-2/#sctn-strings-langdir

Consumers of strings that may have language and direction encoded should be aware that truncation could truncate a language tag into a different, but still valid, language. The final directionality marker or CANCEL TAG code point provide an unambiguous indication of truncation.

Naive truncation of a language tag will not produce a valid language tag. Language tags will only remain valid if truncated just before a hyphen character (and noting that single-character or "singleton" subtags should not appear at the end of a language tag). It is also possible that a badly handled truncation scheme could change the meaning of a tag. For example:

tlh => tl (from Klingon to Tagalog)
hi-Deva => hi-De (from Hindi-written-in-Devanagari to Hindi-as-used-in-Germany)

A proper description of truncation here should explain how to use U+E002D (the equivalent of the hyphen character in language tags) to find subtags for removal.
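As a rough illustration of truncating at subtag boundaries, here is a minimal Python sketch. It assumes the language tag is already isolated and encoded as Unicode tag characters (each ASCII character shifted up by U+E0000), and it counts code points rather than bytes. The helper names `to_tag_chars`, `from_tag_chars`, and `truncate_tag` are hypothetical and do not come from the specification.

```python
TAG_BASE = 0xE0000
TAG_HYPHEN = chr(TAG_BASE + ord("-"))  # U+E002D, the tag-character hyphen


def to_tag_chars(tag: str) -> str:
    """Shift each ASCII character of a language tag into the tag-character block."""
    return "".join(chr(TAG_BASE + ord(c)) for c in tag)


def from_tag_chars(encoded: str) -> str:
    """Inverse of to_tag_chars; assumes every character is a tag character."""
    return "".join(chr(ord(c) - TAG_BASE) for c in encoded)


def truncate_tag(encoded: str, max_len: int) -> str:
    """Shorten an encoded tag to at most max_len code points.

    Cuts only at U+E002D boundaries, so a subtag is either kept whole or
    dropped. Returns an empty string if even the first subtag does not fit.
    """
    if len(encoded) <= max_len:
        return encoded
    kept = []
    length = 0
    for subtag in encoded.split(TAG_HYPHEN):
        extra = len(subtag) + (1 if kept else 0)  # +1 for the joining hyphen
        if length + extra > max_len:
            break
        kept.append(subtag)
        length += extra
    # A single-character ("singleton") subtag should not end a language tag.
    while kept and len(kept[-1]) == 1:
        kept.pop()
    return TAG_HYPHEN.join(kept)


print(from_tag_chars(truncate_tag(to_tag_chars("hi-Deva"), 5)))  # hi
print(repr(from_tag_chars(truncate_tag(to_tag_chars("tlh"), 2))))  # ''
```

With this approach `hi-Deva` shortens to `hi` instead of `hi-De`, and `tlh` is dropped entirely instead of becoming `tl`. The result is only well-formed in the loose sense discussed in this issue; the sketch does not check subtags against the registry.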

Note that while the CANCEL TAG's absence probably doesn't introduce any rendering issues, concatenating the resulting string with other strings could result in strange or unintended rendering.

The term "valid" may also be problematic here, since in BCP47 a language tag is valid if and only if each subtag has been checked for existence in the registry. The normal term of art here is "well-formed".

aphillips, Jul 09 '21 17:07

I think this issue results from a misunderstanding.

The truncation in question here is the result of authenticators blindly truncating these fields at a given byte length. Since the language tag is at the end, such truncation could transform a language tag into another language, like the Klingon to Tagalog change that you note. The fact that a directionality marker or CANCEL TAG always follows the language allows for an implementation to be aware that truncation occurred and thus potentially ignore the language tag.
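The detection described above could look roughly like the following Python sketch. It assumes the layout I understand from section 6.4.2: the text, then U+E0001 (LANGUAGE TAG), then the language tag as tag characters, then exactly one of U+200E, U+200F, or U+E007F. The function name `split_lang_dir` and its return shape are hypothetical, and a value with no U+E0001 at all is reported as not truncated because nothing can be inferred from it.

```python
LANGUAGE_TAG = "\U000E0001"
CANCEL_TAG = "\U000E007F"
LRM = "\u200E"  # LEFT-TO-RIGHT MARK
RLM = "\u200F"  # RIGHT-TO-LEFT MARK
DIRECTIONS = {LRM: "ltr", RLM: "rtl", CANCEL_TAG: None}


def split_lang_dir(value: str):
    """Return (text, language, direction, truncated).

    language and direction are None when absent or when the suffix looks
    truncated or malformed, in which case the language tag is ignored.
    """
    idx = value.rfind(LANGUAGE_TAG)
    if idx == -1:
        # No language suffix found; truncation cannot be detected here.
        return value, None, None, False
    text, suffix = value[:idx], value[idx + 1:]
    if not suffix or suffix[-1] not in DIRECTIONS:
        # The terminating marker is missing, so the tag may have been cut.
        return text, None, None, True
    body = suffix[:-1]
    if not all(0xE0020 <= ord(c) <= 0xE007E for c in body):
        # Unexpected characters in the tag; treat it as untrustworthy.
        return text, None, None, True
    language = "".join(chr(ord(c) - 0xE0000) for c in body)
    return text, language, DIRECTIONS[suffix[-1]], False


# Build "Alice" tagged as Klingon (tlh), then cut it at 17 bytes of UTF-8,
# which leaves the tag characters for "tl" with no terminating marker.
tagged = "Alice" + LANGUAGE_TAG + "".join(chr(0xE0000 + ord(c)) for c in "tlh") + CANCEL_TAG
cut = tagged.encode("utf-8")[:17].decode("utf-8", errors="ignore")
print(split_lang_dir(tagged))  # ('Alice', 'tlh', None, False)
print(split_lang_dir(cut))     # ('Alice', None, None, True)
```

In the truncated case the caller never sees `tl`, so the Klingon-to-Tagalog change cannot leak through; the string is simply treated as having no usable language metadata.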

agl, Jul 19 '21 22:07