binaryen
binaryen copied to clipboard
Remove stringview types and instructions
The stringview types from the stringref proposal have three irregularities that
break common invariants and require pervasive special casing to handle properly:
they are supertypes of none
but not subtypes of any
, they cannot be the
targets of casts, and they cannot be used to construct nullable references. At
the same time, the stringref proposal has been superseded by the imported
strings proposal, which does not have these irregularities. The cost of
maintaing and improving our support for stringview types is no longer worth the
benefit of supporting them.
Simplify the code base by entirely removing the stringview types and related instructions that do not have analogues in the imported strings proposal and do not make sense in the absense of stringviews.
Three remaining instructions, stringview_wtf16.get_codeunit
,
stringview_wtf16.slice
, and stringview_wtf16.length
take stringview operands
in the stringref proposal but cannot be removed because they lower to operations
from the imported strings proposal. These instructions are changed to take
stringref operands in Binaryen IR, and to allow a graceful upgrade path for
users of these instructions, the text parser still accepts but ignores
string.as_wtf16
, which is the instruction used to convert stringrefs to
stringviews.
No attempt is made to fix up the binary output to include the string.as_wtf16
instructions that the stringref proposal requires, so binaryen no longer emits
valid string code unless --string-lowering
is used to target the imported
strings proposal instead. This should not be a problem because users should
universally be using imported strings over stringref.
Future PRs will further align binaryen with the imported strings proposal
instead of the stringref proposal, for example by making string
a subtype of
extern
instead of a subtype of any
and by removing additional instructions
that do not have analogues in the imported strings proposal.
This stack of pull requests is managed by Graphite. Learn more about stacking.
Join @tlively and the rest of your teammates on Graphite
The fuzzer won't be happy with this until it supports --string-lowering
and/or --string-lowering-magic-imports
and we can force one of those passes to be run whenever strings are enabled.
Alternatively, we could start performing string lowering automatically in the binary writer since we no longer emit valid stringref code anway.
Alternatively, we could actually try to emit the necessary string.as_wtf16
instruction in the binary writer so we still emit valid stringref code. I was hoping to avoid this complication, but it's probably the best solution for now. I'll investigate this more tomorrow.
Injecting string.as_wtf16
in the binary writer requires a lot of the same infrastructure you're working on to fix up br_if
, so I think we should either land this after that infrastructure lands, or land this sooner and accept temporary fuzzer breakage.
Never mind, the last commit now implements injecting the conversions into the binary along with the necessary scratch locals.
I do think it's important for now to emit validating strings logic as you mention the latest version does, as we can't fully fuzz lowered strings logic yet - we'd need to handle the split-out constants and maybe other stuff.
There is a risk here if the strings proposal starts back up again. I would not be entirely surprised if it does. In that case we may need to restore some of this, but given V8 has been making the changes (like non-nullability) that led to this PR, I guess whatever new form a strings proposal would take would be different anyhow.
Merge activity
@tlively The fuzzer found a regression from this PR:
(module
(func $test (export "test") (result stringref)
(local $0 i32)
;; Slice [0:1), which returns "h".
(stringview_wtf16.slice
(string.const "hello")
(local.get $0)
(local.tee $0
(i32.const 1)
)
)
)
)
After this landed, the following errors:
$ bin/wasm-opt w.wat -all --fuzz-exec --roundtrip
[fuzz-exec] calling test
[fuzz-exec] note result: test => string("h")
[fuzz-exec] calling test
[fuzz-exec] note result: test => string("")
[fuzz-exec] comparing test
values not identical! string("") != string("h")
[fuzz-exec] optimization passes changed results
Looking into this now.