Does `string.is_usv_sequence` pull its weight?
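`(string.is_usv_sequence str)` is the same as `(i32.ne (i32.const -1) (string.measure_utf8 str))`, since `string.measure_utf8` returns -1 when the string contains an isolated surrogate. Is it useful enough to keep in an MVP?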
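The relationship between the two instructions can be modeled in a few lines of Python. This is only a sketch, not the proposal's implementation: it assumes `string.measure_utf8` returns -1 for a string containing an isolated surrogate, and it uses a Python `str` as a stand-in for a stringref, since a Python string can also hold lone surrogate code points. The function names are hypothetical.

```python
def is_surrogate(cp: int) -> bool:
    """True if the code point lies in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def measure_utf8(s: str) -> int:
    """Model of string.measure_utf8: UTF-8 byte length, or -1 if the
    string contains an isolated surrogate (assumed semantics)."""
    total = 0
    for ch in s:
        cp = ord(ch)
        if is_surrogate(cp):
            return -1  # isolated surrogate: no valid UTF-8 encoding
        if cp < 0x80:
            total += 1
        elif cp < 0x800:
            total += 2
        elif cp < 0x10000:
            total += 3
        else:
            total += 4
    return total


def is_usv_sequence(s: str) -> bool:
    """Model of string.is_usv_sequence: no isolated surrogates anywhere."""
    return not any(is_surrogate(ord(ch)) for ch in s)


def is_usv_via_measure(s: str) -> bool:
    """The same predicate expressed through measure_utf8."""
    return measure_utf8(s) != -1
```

Under these assumptions the two predicates agree on every input; the difference is only that `measure_utf8` also has to add up the encoded length along the way.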
The case for keeping it: in a component-model scenario where an inter-component function call targets an interface taking a string, and both sides implement strings with stringrefs, we can pass the stringref value directly without copying, but only if the string has no isolated surrogates. In that case we don't need to compute the encoded length and can instead rely on the same internal bit that isUSVString would use.
After having implemented this in V8, I think it is unlikely, at least there, that we will have an isUSVString bit; instead we will have to scan the contents of the string, unless the string happens to have the one-byte optimization (all code points less than 256). I would therefore propose that we remove this instruction unless there is a proven need for it; its functionality can be had via `string.measure_utf8`.
Strings with the one-byte optimization are common enough to make the one-byte optimization itself worthwhile, even if no engine adds an isUSVString bit. Wouldn't that be sufficient to make this instruction worth considering?
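To make the one-byte fast path concrete, here is a hypothetical sketch in Python. `EngineString` and its layout are invented for illustration and do not reflect V8's actual string representation; the only assumption carried over from the discussion is that a one-byte string holds code points below 256, so it cannot contain a surrogate.

```python
from array import array
from dataclasses import dataclass
from typing import Union


@dataclass
class EngineString:
    # Hypothetical storage: `bytes` when every code point is < 256,
    # otherwise an array of 16-bit (WTF-16) code units.
    units: Union[bytes, array]

    @property
    def is_one_byte(self) -> bool:
        return isinstance(self.units, bytes)


def is_usv_sequence_fast(s: EngineString) -> bool:
    """Answer immediately for one-byte strings; scan two-byte strings."""
    if s.is_one_byte:
        return True  # code points < 256 are never surrogates
    units = s.units
    i, n = 0, len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # A high surrogate is fine only if a low surrogate follows.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return False
        if 0xDC00 <= u <= 0xDFFF:
            return False  # low surrogate without a preceding high one
        i += 1
    return True
```

In this model the check is constant-time for one-byte strings and linear only for two-byte strings, which is the case the dedicated instruction would let an engine exploit without also computing a length.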
Good point @sunfishcode. An optimizer could recognize the compare-to-negative-1 pattern, of course, but it's best to just emit the operation we're looking for.