binaryen icon indicating copy to clipboard operation
binaryen copied to clipboard

wasm-opt error when closed-world option enabled

Open hondogo opened this issue 1 year ago • 16 comments
trafficstars

Binaryen version: 119

When executing wasm-opt with enabled option closed-world got: Fatal: Internal GlobalTypeRewriter build error: Heap type has an invalid supertype at index 1

Command that execute: wasm-opt --enable-gc --enable-reference-types --enable-exception-handling --enable-bulk-memory --enable-nontrapping-float-to-int --closed-world --type-ssa input.wasm -O3 -o optimized.wasm

Execution is successful when performed without option closed-world.

In attachment there is input.zip file (archive with input.wasm) for which this command was executed. input.zip

hondogo avatar Oct 17 '24 07:10 hondogo

Reduced testcase:

(module
  (rec
    (type $parent (sub (struct (field (ref func)))))
    (type $child1 (sub $parent (struct (field (ref func)))))
    (type $child2 (sub $parent (struct (field (ref func)))))

    (type $func (func (param $child1 (ref $child1)) (param $child2 (ref $child2))))
  )

  (func $func (export "func") (type $func) (param $child1 (ref $child1)) (param $child2 (ref $child2))
    (drop
      (struct.new $parent
        (ref.func $func)
      )
    )
    (drop
      (struct.new $child1
        (ref.func $func)
      )
    )
    (drop
      (struct.new $child2
        (ref.func $func)
      )
    )
    (drop (struct.get $child2 0 (local.get $child2)))
  )
)

What happens here is that the entire rec group is public because of that function, which is exported, and which then makes the entire rec group public. GTO is not noticing that it is public and optimizing, which leads to making those struct.news all by new_defaults, since it wants to remove the fields. But one field is non-nullable, which errors, as the field is not actually removed - TypeUpdater does notice the type is public.

We can make GTO avoid public types, and I'll do that as a bugfix. However, separately, on this testcase that would mean GTO will do nothing at all, as there is just one huge rec group, which is a big loss. I was hoping that --minimize-rec-groups would help here, but it doesn't - running it on that reduced testcase does nothing. I was hoping it would split out the type $func which is not recursive with anything - @tlively do you know why it isn't splitting there?

kripken avatar Oct 17 '24 18:10 kripken

#7017 resolves one issue here, but I also see an error on SignaturePruning.

@hondogo Separately from fixing these bugs, if you can avoid mixing the types of exported functions in the big rec group, then you will get much better results (as it will keep the big rec group private, and modifiable).

kripken avatar Oct 17 '24 18:10 kripken

@kripken Thanks for advise! This code was generated by Kotlin compiler. I've forwarded your advise to Kotlin team.

hondogo avatar Oct 17 '24 19:10 hondogo

--minimize-rec-groups does not modify public types, so it can't do anything here.

tlively avatar Oct 17 '24 21:10 tlively

@tlively Oh, right, thanks. That can't help here then.

I guess in the long term if we add a way to mark a rec group as private then we could optimize using that (so a private rec group might contain public types, which we would not modify, but other ones we could).

kripken avatar Oct 17 '24 21:10 kripken

Maybe, but even then modifying the types would change the module's external types, so it still wouldn't be safe. The real solution is for the producer to avoid this situation when originally producing the module.

Note, however, that Kotlin is currently forced into this situation where all the types are in the same rec group as the public types because it is a hack to work around failing our closed world validation. The real real solution is for us to make the changes described in #6965.

tlively avatar Oct 17 '24 21:10 tlively

Wait, how does putting all the types in a single rec group work as a hack for a closed world validation issue?

kripken avatar Oct 17 '24 21:10 kripken

The validation allows all types in the same rec group as a public function type to be public without complaint. This might be a bug, but I would say the existence of the validation is a bigger bug. Is a bug in a feature that is a bug a feature?

tlively avatar Oct 17 '24 21:10 tlively

Is a bug in a feature that is a bug a feature?

:rofl:

Yeah, I guess that validation is kind of weird. Removing it as part of #6965 sgtm. In fact, if it is blocking people right now, removing the validation by itself seems fine?

kripken avatar Oct 17 '24 22:10 kripken

I'd be worried about getting an avalanche of fuzz bugs, but other than that, I agree that removing the validation as a first step would be nice.

tlively avatar Oct 17 '24 22:10 tlively

Sounds good, see #7019 for removing that validation.

kripken avatar Oct 17 '24 22:10 kripken

@kripken Could you please hint to me which declarations in the original input.wasm led to the bug? Could you please make a release with a fix?

bashor avatar Oct 17 '24 23:10 bashor

@bashor One fix landed so far, another is open in https://github.com/WebAssembly/binaryen/pull/7018, and there is at least one more error after that lands that I haven't diagnosed yet. I can make a release when they all land.

I don't think there is anything to do on your side here. This code is just hitting some Binaryen bugs because we didn't have much real-world content that mixed public and private types in closed-world mode (e.g. Java uses closed-world but doesn't mix, and some other users mix but don't use closed-world). We'll just fix those bugs in Binaryen.

However, if you can avoid mixing public and private types in a single rec group, that will help. If closed-world validation errors were what forced you to do that, then https://github.com/WebAssembly/binaryen/pull/7019 should help once it lands.

By "mixed public and private types in a single rec group" I mean things like the example in the second comment. There the function type could have been in its own singleton rec group outside, which would have left all the private types together, where they could be optimized.

kripken avatar Oct 17 '24 23:10 kripken

With #7022, #7018, #7019 (all not yet landed), the testcase here finally passed in closed world (even including the extra validation checks of pass-debug mode).

To get an idea of the benefit of closed world here, I ran -O3 --gufa -O3 --gufa -O3 -tnh with and without it. Code size is 6% smaller with closed world. (There may also be speed benefits.)

kripken avatar Oct 18 '24 16:10 kripken

I also did an experiment where I

  • Moved the types of _initialize and startUnitTests out of the main rec group, and
  • Removed the exports of main and __callFunction_(()->Unit) (just moving their types wouldn't be enough, as their signatures contain other types).

As a result, there are practically no public types remaining.

When I then optimize with vs without --closed-world, that flag makes the binary 48% smaller - almost half the size.

@bashor Based on that I think there can be large benefits to emitting as few public types as possible in the Kotlin compiler.

kripken avatar Oct 18 '24 20:10 kripken

@bashor Based on that I think there can be large benefits to emitting as few public types as possible in the Kotlin compiler.

In particular, this can be accomplished by using anyref instead of concrete reference types in the signatures of exported functions.

tlively avatar Oct 18 '24 21:10 tlively

@kripken @tlively thanks, folks for the hints, we will investigate that.

bashor avatar Oct 24 '24 20:10 bashor

@kripken, could you please publish a new release, since the PRs mentioned above are merged?

bashor avatar Oct 24 '24 20:10 bashor

It looks like it is better to wait for the merging of #7031, right?

bashor avatar Oct 24 '24 20:10 bashor

That's merged now 👍

tlively avatar Oct 24 '24 20:10 tlively

Great, thanks! Could we please have a release now?

bashor avatar Oct 24 '24 20:10 bashor

In particular, this can be accomplished by using anyref instead of concrete reference types in the signatures of exported functions.

@tlively using structref is also ok, right?

bashor avatar Oct 24 '24 21:10 bashor

Moved the types of _initialize and startUnitTests out of the main rec group, and

@kripken aren't they already defined outside of the big rec group?

bashor avatar Oct 24 '24 21:10 bashor

__callFunction_(()->Unit) couldn't be removed in this case since it's used to call kotlin FunctionX from JS function wrapper created by __convertKotlinClosureToJsClosure_(()->Unit).

bashor avatar Oct 24 '24 21:10 bashor

@kripken, am I right that the original problem was caused by the exported main function and its type?

bashor avatar Oct 24 '24 21:10 bashor

In particular, this can be accomplished by using anyref instead of concrete reference types in the signatures of exported functions.

@tlively using structref is also ok, right?

Yes, that would be fine as well.

(For the rest, @kripken is out of office today but will be back tomorrow)

tlively avatar Oct 24 '24 22:10 tlively

@bashor Yes, it looks like $main was the actual problem. I removed all the type annotations from all exported functions earlier, but now I see that was more than was needed.

I added a release now, 120: https://github.com/WebAssembly/binaryen/releases/tag/version_120

kripken avatar Oct 25 '24 19:10 kripken

Looks like all the issues here have been resolved. If there are any new issues found with public types, please open new issues.

tlively avatar Jan 13 '25 20:01 tlively