binaryen icon indicating copy to clipboard operation
binaryen copied to clipboard

Remove stringview types and instructions

Open tlively opened this issue 9 months ago • 4 comments

The stringview types from the stringref proposal have three irregularities that break common invariants and require pervasive special casing to handle properly: they are supertypes of none but not subtypes of any, they cannot be the targets of casts, and they cannot be used to construct nullable references. At the same time, the stringref proposal has been superseded by the imported strings proposal, which does not have these irregularities. The cost of maintaing and improving our support for stringview types is no longer worth the benefit of supporting them.

Simplify the code base by entirely removing the stringview types and related instructions that do not have analogues in the imported strings proposal and do not make sense in the absense of stringviews.

Three remaining instructions, stringview_wtf16.get_codeunit, stringview_wtf16.slice, and stringview_wtf16.length take stringview operands in the stringref proposal but cannot be removed because they lower to operations from the imported strings proposal. These instructions are changed to take stringref operands in Binaryen IR, and to allow a graceful upgrade path for users of these instructions, the text parser still accepts but ignores string.as_wtf16, which is the instruction used to convert stringrefs to stringviews.

No attempt is made to fix up the binary output to include the string.as_wtf16 instructions that the stringref proposal requires, so binaryen no longer emits valid string code unless --string-lowering is used to target the imported strings proposal instead. This should not be a problem because users should universally be using imported strings over stringref.

Future PRs will further align binaryen with the imported strings proposal instead of the stringref proposal, for example by making string a subtype of extern instead of a subtype of any and by removing additional instructions that do not have analogues in the imported strings proposal.

tlively avatar May 10 '24 05:05 tlively

  • #6589 Graphite
  • #6579 Graphite 👈
  • #6583 Graphite
  • main

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @tlively and the rest of your teammates on Graphite Graphite

tlively avatar May 10 '24 05:05 tlively

The fuzzer won't be happy with this until it supports --string-lowering and/or --string-lowering-magic-imports and we can force one of those passes to be run whenever strings are enabled.

Alternatively, we could start performing string lowering automatically in the binary writer since we no longer emit valid stringref code anway.

Alternatively, we could actually try to emit the necessary string.as_wtf16 instruction in the binary writer so we still emit valid stringref code. I was hoping to avoid this complication, but it's probably the best solution for now. I'll investigate this more tomorrow.

tlively avatar May 10 '24 05:05 tlively

Injecting string.as_wtf16 in the binary writer requires a lot of the same infrastructure you're working on to fix up br_if, so I think we should either land this after that infrastructure lands, or land this sooner and accept temporary fuzzer breakage.

tlively avatar May 10 '24 16:05 tlively

Never mind, the last commit now implements injecting the conversions into the binary along with the necessary scratch locals.

tlively avatar May 12 '24 07:05 tlively

I do think it's important for now to emit validating strings logic as you mention the latest version does, as we can't fully fuzz lowered strings logic yet - we'd need to handle the split-out constants and maybe other stuff.

kripken avatar May 13 '24 19:05 kripken

There is a risk here if the strings proposal starts back up again. I would not be entirely surprised if it does. In that case we may need to restore some of this, but given V8 has been making the changes (like non-nullability) that led to this PR, I guess whatever new form a strings proposal would take would be different anyhow.

kripken avatar May 13 '24 19:05 kripken

Merge activity

  • May 15, 3:05 PM EDT: @tlively started a stack merge that includes this pull request via Graphite.
  • May 15, 3:05 PM EDT: @tlively merged this pull request with Graphite.

tlively avatar May 15 '24 19:05 tlively

@tlively The fuzzer found a regression from this PR:

(module
 (func $test (export "test") (result stringref)
  (local $0 i32)
  ;; Slice [0:1), which returns "h".
  (stringview_wtf16.slice
   (string.const "hello")
   (local.get $0)
   (local.tee $0
    (i32.const 1)
   )
  )
 )
)

After this landed, the following errors:

$ bin/wasm-opt w.wat -all --fuzz-exec --roundtrip
[fuzz-exec] calling test
[fuzz-exec] note result: test => string("h")
[fuzz-exec] calling test
[fuzz-exec] note result: test => string("")
[fuzz-exec] comparing test
values not identical! string("") != string("h")
[fuzz-exec] optimization passes changed results

kripken avatar Jun 03 '24 18:06 kripken

Looking into this now.

tlively avatar Jun 11 '24 18:06 tlively