binaryen [WIP] Lower memory.copy

Oct 18 '24 00:10 dschuff

@kripken I haven't actually tested this for real yet, but can you give it a quick look-over for general sanity? I've not got a lot of experience using these APIs.

Oct 18 '24 22:10 dschuff

Hmph. I don't know if I agree with clang-format on the best way to format this, but I guess it's best to keep everything conforming :D

Oct 18 '24 23:10 dschuff

Ah, so here is where we have to decide how general-purpose we want the pass to be. If this is just for lowering away LLVM-produced memcpy and supporting older browsers, then we don't need to support multi-memory, memory64, or threads (because browsers that support those always support bulk memory). Not supporting threads means that the order of byte copies doesn't have to match the spec, because it's not observable. If we further assume that clang is the source of the memory ops, then we don't have to worry about pointer overflow because it's UB. For now I'm pretty happy with the former assumption because supporting more feature interactions would add a fair amount of complexity for no known use case. The latter I'm less sure about. I suppose it's possible we'd be running this lowering over mixed C+Rust code?

Oct 18 '24 23:10 dschuff

Those sound like reasonable assumptions to me.

But IIANM the order of copying matters when the ranges overlap, too:

[----------]
   [----------]

Copying the former over the latter in the forward direction is fine, but in reverse will smear.

But maybe LLVM never emits a memory.copy where this might matter? Does it lower llvm.memcpy and/or llvm.memmove?

Btw, an issue I realized with limiting these passes to LLVM's output is that we can't run the spec tests on them or fuzz them, not normally at least.

Oct 18 '24 23:10 kripken

Ah, good point about the order. LLVM does lower both to memory.copy (see WebAssemblySelectionDAGInfo.cpp). Yeah, my way of testing was just to run the passes unconditionally and send them through all of the emscripten tests. But that's probably not as good as including spec tests or fuzzing.

Oct 18 '24 23:10 dschuff

Btw, an issue I realized with limiting these passes to LLVM's output is that we can't run the spec tests on them or fuzz them, not normally at least.

I realized that another way to implement the trapping behavior inside wasm32 would be to extend the addresses to i64 and do the trap check in i64. Obviously it would add more complexity; do you think that would be worth it?

Nov 08 '24 18:11 dschuff

I don't follow, how would extending to i64 help here? Would we be able to check things we don't currently (what)?

Nov 08 '24 18:11 kripken

Sorry, I was thinking of the comments above about possible overflow when checking for out of bounds trapping. Currently we don't handle that because UB, but if we wanted to run spec tests or fuzz, we'd presumably need to handle traps that also overflow the i32 range.

Nov 08 '24 18:11 dschuff

I see. We can also test for that overflow in 32-bit (x + y < x implies overflow if both are unsigned), but I don't know what's simpler.

I'm also not sure if full fuzzing and spec tests are worth it. We can get some fuzzing through emcc (using csmith to fuzz from C code).

Nov 08 '24 18:11 kripken

Yeah my thinking was that extending to i64 would be simpler than adding the extra condition you mentioned for both src and dst. But I'm also leaning toward not bothering with the spec tests. I'm running all the emscripten tests with the pass enabled (and I'm considering adding another test mode that builds for MVP wasm which would have them enabled continuously? WDYT about that?) To get extra testing beyond the emscripten tests, I'm also going to run them with the LLVM testsuite tests.

Nov 08 '24 18:11 dschuff

OK, assuming we keep the current trapping behavior, and the current structure (i.e. of generating rather than merging the polyfill functions) I think this is ready.

Nov 08 '24 18:11 dschuff

I'm running all the emscripten tests with the pass enabled (and I'm considering adding another test mode that builds for MVP wasm which would have them enabled continuously? WDYT about that?) To get extra testing beyond the emscripten tests, I'm also going to run them with the LLVM testsuite tests.

I like the idea to add an MVP test mode. Off on CircleCI, just running on the testsuite bot?

Nov 08 '24 19:11 kripken

I like the idea to add an MVP test mode. Off on CircleCI, just running on the testsuite bot?

Yeah, along with the ASan and similar modes.

Nov 08 '24 19:11 dschuff

I think we could add some execution tests for this, under test/lit/exec/. The tests can contain some copy/fill operations, and print memory contents after the operation. By running them with --lower --fuzz-exec we would verify that the output is unchanged after the lowering.

I'm trying this out, but I'm not sure how to print the memory. The execution engine in wasm-opt doesn't seem to have the same magic spectest.print imports that wasm-shell has.

Nov 12 '24 22:11 dschuff

You can call the logging imports, see

https://github.com/WebAssembly/binaryen/blob/52cae1142bb8c06c98c4887ef9ffefff70dbeefc/test/lit/exec/fuzzing-api.wast#L25-L35

and their definitions earlier in that file.

Nov 12 '24 22:11 kripken

Thanks, WDYT of the memory copy test? It's derived from the wast spectest but turned into a fuzz-exec test.

Nov 12 '24 23:11 dschuff

Test looks great! lgtm with the same for memory.fill.

Nov 13 '24 00:11 kripken