provide clearer advice about USVString vs. DOMString
In https://github.com/w3ctag/spec-reviews/issues/87#issuecomment-171535440 (see also the following comment) we had a brief discussion about USVString vs. DOMString. This seems like a somewhat tricky (in that it's easy to get wrong) API design issue, and the current wording in WebIDL isn't particularly clear.
http://heycam.github.io/webidl/#idl-USVString currently says:
Specifications SHOULD only use USVString for APIs that perform text processing and need a string of Unicode scalar values to operate on. Most APIs that use strings should instead be using DOMString, which does not make any interpretations of the code units in the string. When in doubt, use DOMString.
This is a bit unclear in a few ways (not clear what "text processing" means; does encoding conversion count?), and doesn't seem to match the other advice being given, e.g., in https://github.com/w3ctag/spec-reviews/issues/87#issuecomment-171575546
This seems like an area where giving clear and consistent advice is important.
The plan is to rephrase this in terms of https://infra.spec.whatwg.org/#strings and use JavaScript string (for DOMString) and scalar value string (for USVString) from there. Is the difference sufficiently clear there?
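For illustration, the difference between the two types can be sketched in TypeScript. A DOMString is passed through as its 16-bit code units, while conversion to a USVString replaces every unpaired surrogate with U+FFFD. The helper name `toScalarValueString` is hypothetical and is not an API from Web IDL or Infra; this is a minimal sketch of that conversion, not the normative algorithm text.

```typescript
// Hypothetical helper (not a platform API): approximates what a binding layer
// does when an argument is declared as USVString instead of DOMString.
function toScalarValueString(input: string): string {
  let result = "";
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // Leading surrogate: keep it only if a trailing surrogate follows.
      const next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        result += input.charAt(i) + input.charAt(i + 1);
        i++; // skip the trailing half of the pair
      } else {
        result += "\uFFFD";
      }
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Trailing surrogate with no leading surrogate before it.
      result += "\uFFFD";
    } else {
      result += input.charAt(i);
    }
  }
  return result;
}
```

A well-formed string comes back unchanged, so the two types only differ for input that contains lone surrogates.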
@annevk I think your response is about what each of the two types is, while @dbaron’s comment was about why and in what situations one is recommended over the other. I agree that it is unclear where “APIs that perform text processing” applies.
No, it specifically calls out UTF-8 encode as a place where USVString applies. I haven't really encountered another place, though the source does mention Rust subsystems as something where this might make sense, but since Rust isn't used on a large scale that's not super relevant yet.
So concretely, I think we should point out that USVString should be used for URLs or for strings that are to be sent "over the wire". I'm not sure what other scenarios there might be.
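One way to see why wire-bound strings fit USVString: UTF-8 has no representation for a lone surrogate, so an arbitrary JavaScript string cannot always be encoded losslessly. The sketch below uses the ECMAScript built-in `encodeURIComponent`, which throws a `URIError` for such input. The helper name `isLosslesslyUtf8Encodable` is an assumption made for this example only.

```typescript
// Hypothetical helper: reports whether a JavaScript string can be UTF-8
// encoded without substituting any code unit.
function isLosslesslyUtf8Encodable(value: string): boolean {
  try {
    // encodeURIComponent UTF-8 encodes its input and throws URIError
    // when it meets an unpaired surrogate.
    encodeURIComponent(value);
    return true;
  } catch (error) {
    if (error instanceof URIError) {
      return false;
    }
    throw error;
  }
}
```

An API that declares its argument as USVString never has to make this check, because lone surrogates have already been replaced with U+FFFD before the string reaches it.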
For what it’s worth, Firefox’s ongoing style system rewrite uses UTF-8 strings internally. This effectively makes CSSOM strings be USVString. CSSWG resolved to allow both, so https://github.com/w3c/csswg-drafts/pull/1266 introduced a CSSOMString typedef.
Right... I don't think we want to propagate such implementation-specific quirks throughout the platform via Web IDL advice.
Right, CSSOMString probably does not have its place in this WebIDL spec note. This was only a response to "not sure what other scenarios there might be".