binaryen icon indicating copy to clipboard operation
binaryen copied to clipboard

wasm-split fuzzing and reusing an existing Table

Open kripken opened this issue 3 weeks ago • 4 comments

I seem to recall we discussed using a new Table in wasm-split, when reference types is set? Atm it looks like we reuse the table if one exists:

https://github.com/WebAssembly/binaryen/blob/28e849b91fab4fb697c27206b7d6f2c090e3519c/src/ir/module-splitting.cpp#L186-L191

The fuzzer errored on this, with exports like these:

 (func $0
 )
 (func $1
  (table.set $0
   (i32.const 1)
   (ref.null nofunc)
  )
  (unreachable)
 )
 (func $2
 )

After splitting these three out, we end up with call_indirects in all three in the primary module. The secondary module's elem writes the proper function pointers, but when we call $1 we trample some of that data, leading to the third export trapping. That is, it is unsafe to use the table for normal stuff and also wasm-split stuff, without the two being aware of each other.

cc @tlively @aheejin

kripken avatar Dec 08 '25 16:12 kripken

This logic is responsible for skipping the reuse of an existing table and will cause a new table to be created on demand:

https://github.com/WebAssembly/binaryen/blob/main/src/ir/module-splitting.cpp#L175-L178

The fuzzer should make sure that condition is always satisfied to ensure it always gets a fresh table.

tlively avatar Dec 09 '25 19:12 tlively

@tlively Ah, I see 😄 Looks like that checks for an emscripten-style table import, and if it exists, reuses the table. And we do have that in our test suite... so we would need to skip those files, but (1) it is several files, and (2) they are emscripten testcases, which we may add more of, so this seems like a long-term issue.

Why do we need that special logic? Is there a problem with emscripten using a second table?

kripken avatar Dec 09 '25 20:12 kripken

@dschuff added the special casing of emscripten tables in https://github.com/WebAssembly/binaryen/pull/7050. It sounds like Emscripten's loading code couldn't handle the second table at the time. @aheejin, is this something we could fix on the Emscripten side so we can remove the special case here?

tlively avatar Dec 09 '25 23:12 tlively

I'm not familiar with why that was done that way wrt dynamic loading. @dschuff Can you point to more context?

aheejin avatar Dec 10 '25 07:12 aheejin

When loading a split module, the loader code around here patches the indirect function table, overwriting the pointers to the proxy with the real functions. When wasm-split creates a second table for this purpose the loader code just needs to figure out which table to patch.

dschuff avatar Dec 13 '25 00:12 dschuff