provide clearer advice about USVString vs. DOMString

Open dbaron opened this issue 10 years ago • 11 comments

In https://github.com/w3ctag/spec-reviews/issues/87#issuecomment-171535440 (see also the following comment) we had a brief discussion about USVString vs. DOMString. This seems like a somewhat tricky (in that it's easy to get wrong) API design issue, and the current wording in WebIDL isn't particularly clear.

http://heycam.github.io/webidl/#idl-USVString currently says:

Specifications SHOULD only use USVString for APIs that perform text processing and need a string of Unicode scalar values to operate on. Most APIs that use strings should instead be using DOMString, which does not make any interpretations of the code units in the string. When in doubt, use DOMString.

This is a bit unclear in a few ways (not clear what "text processing" means; does encoding conversion count?), and doesn't seem to match the other advice being given, e.g., in https://github.com/w3ctag/spec-reviews/issues/87#issuecomment-171575546

This seems like an area where giving clear and consistent advice is important.

dbaron avatar Jan 14 '16 22:01 dbaron

The plan is to rephrase this in terms of https://infra.spec.whatwg.org/#strings and use the terms JavaScript string (for DOMString) and scalar value string (for USVString) from there. Is the difference sufficiently clear there?
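
For readers following along, here is a rough sketch of the difference in practice. A DOMString passes its 16-bit code units through untouched, while a USVString has every unpaired surrogate replaced with U+FFFD before the API sees it. The helper name `toScalarValueString` and the sample strings are made up for illustration; this is not text from either spec.

```typescript
// Hypothetical helper: approximates the conversion applied to USVString
// arguments (unpaired surrogates become U+FFFD). Name is illustrative only.
function toScalarValueString(input: string): string {
  let result = "";
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    const isLead = unit >= 0xd800 && unit <= 0xdbff;
    const isTrail = unit >= 0xdc00 && unit <= 0xdfff;
    if (isLead && i + 1 < input.length) {
      const next = input.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Well-formed surrogate pair: keep both code units as they are.
        result += String.fromCharCode(unit, next);
        i++;
        continue;
      }
    }
    // Any surrogate reaching this point is unpaired and gets replaced.
    result += isLead || isTrail ? "\ufffd" : String.fromCharCode(unit);
  }
  return result;
}

// Sample inputs (assumptions for the example).
const loneSurrogate = "a\ud800b"; // contains an unpaired lead surrogate
const validPair = "a\ud83d\ude00b"; // contains a well-formed pair (U+1F600)

const asUSVString = toScalarValueString(loneSurrogate); // lossy conversion
const unchanged = toScalarValueString(validPair); // identical to the input
```

An API that takes a DOMString would receive `loneSurrogate` exactly as written, so the choice of type decides whether the replacement happens at the binding layer or not at all.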

annevk avatar Mar 29 '17 13:03 annevk

@annevk I think your response is about what each of the two types is, while @dbaron’s comment was about why and in what situations one is recommended over the other. I agree that it is unclear where “APIs that perform text processing” applies.

SimonSapin avatar Apr 15 '17 22:04 SimonSapin

No, it specifically calls out UTF-8 encode as a place where USVString applies. I haven't really encountered another place, though the source does mention Rust subsystems as something where this might make sense, but since Rust isn't used on a large scale that's not super relevant yet.
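
To illustrate the UTF-8 encode case: UTF-8 cannot represent a lone surrogate, so the encoder has to substitute U+FFFD anyway. The sketch below uses the standard `TextEncoder` and `TextDecoder` globals (assumed to be available, as in browsers and recent Node.js); the variable names are only for the example.

```typescript
// A string containing an unpaired lead surrogate, which is a legal
// JavaScript string (DOMString) but not a scalar value string.
const original = "\ud800";

// TextEncoder always produces UTF-8; the lone surrogate is encoded as
// U+FFFD, whose UTF-8 form is the three bytes EF BF BD.
const encoded = Array.from(new TextEncoder().encode(original));

// Decoding the bytes gives back U+FFFD, not the original code unit,
// so the round trip is lossy for this input.
const roundTripped = new TextDecoder().decode(new Uint8Array(encoded));
const lossy = roundTripped !== original;
```

Since the replacement is unavoidable once the string is encoded, declaring such an argument as USVString just makes the loss visible at the API boundary instead of deep inside the encoding step.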

annevk avatar Apr 16 '17 04:04 annevk

So concretely, I think we should point out that USVString should be used for URLs or for strings that are to be sent "over the wire". I'm not sure what other scenarios there might be.

domenic avatar May 11 '17 17:05 domenic

For what it’s worth, Firefox’s ongoing style system rewrite uses UTF-8 strings internally. This effectively makes CSSOM strings be USVString. CSSWG resolved to allow both, so https://github.com/w3c/csswg-drafts/pull/1266 introduced a CSSOMString typedef.

SimonSapin avatar May 11 '17 17:05 SimonSapin

Right... I don't think we want to propagate such implementation-specific quirks throughout the platform via Web IDL advice.

domenic avatar May 11 '17 17:05 domenic

Right, CSSOMString probably does not have its place in this WebIDL spec note. This was only a response to "not sure what other scenarios there might be".

SimonSapin avatar May 11 '17 18:05 SimonSapin