wabt
wabt copied to clipboard
wasm2c: uvwasi support
Hi friends, @keithw @sbc100
Here is a first cut at providing uvwasi bindings for wasm2c.
Inspiration and code started as part of wasm2native.
wasm2c/examples/hello-wasi - shows how we can use this with a simple application that exercises wasi. wasm2c/examples/build-standalone - provides a script that takes an arbitrary wasm file and builds it with wasi bindings so it can run as a normal stand-alone binary.
There are a few things that still need to be done, security checks on arguments to wasi calls, some documentation, but I think it's at the point where feedback would be helpful.
This looks quite interesting! Big-picture questions:
- Who is this for / what are some intended use-cases?
- Is this helpful for RLBox or Firefox?
- Does WASI have any tests? How should we make sure this is working and keeps working?
- If the goal is similar to wasm2native, I wonder if there's a simplification where instead of using preprocessor macros to parameterize on the module name, what if instead there's just a
wasm2native
shell script that runswasm2c --module-name=wasi-command
or something, and then the main.c program can have that hardcoded...? - Does the wasm2native author have a view on us effectively upstreaming this?
Smaller stuff:
- Instead of directly accessing the w2c_memory internal member of the instance struct, probably this should access it via the export. WASI commands are required to export their memory as "memory," so calling
Z_modnameZ_memory(instance)
will get you the exported memory. - I'm not sure we want to pollute the global namespace with all the Wasm type names (u8, s8, etc.) for anybody who includes wasm-rt.h. It might be better for uvwasi-rt.c to use standard C type names.
- Where does "Z_wasi_unstable" come from -- are there WASI modules in the wild that import from this modname?
- I'm a little unsure if we want to include the hello-wasi Makefile and C source here (i.e. a build system that compiles a C program into a WASI command) -- that feels like a bigger enlargement in scope and leads me to start wanting to check, e.g., whether these CFLAGS are really needed by clang "to play nice with wasm2c" (they haven't been in my experience) but that doesn't even really feel like a WABT issue. Maybe this belongs more in a wasi-sdk tutorial, or somewhere that produces WASI commands? What are your thoughts?
Hi Keith,
Thanks for the quick feedback and thoughtful comments!
This looks quite interesting! Big-picture questions:
- Who is this for / what are some intended use-cases?
People who like running code that makes system calls =)
On several current projects students needed to be able to run existing libraries and applications, but couldn't, because of the lack of a WASI support in wasm2c (e.g. because said libraries/applications needed to access the filesystem). I went looking for a WASI implementation and came across wasm2native, so I forked an older version that still supported (an older version of) wasm2c, made some small changes, and students have been making use of that.
Being able to just compile and run existing code with wasi-libc and wasi makes life so much easier, and most other wasm toolchains (e.g. WAMR, Wasmtime) already supports this.
uvwasi seems to be a reasonable wasi implementation, node.js uses it, WAMR uses it, and its a third party dependency in wasm2c already.
- Is this helpful for RLBox or Firefox?
I think there is plenty of stuff one might want to sandbox today with RLBox where you need some WASI support, and this is a reasonable solution for that.
- Does WASI have any tests? How should we make sure this is working and keeps working?
Just started looking at that yesterday. There are a couple of wasi test suites out there in various stages of usefulness. Some of that discussion is here:
https://github.com/WebAssembly/WASI/issues/9
There doesn't seem to be one answer that folks have converged on.
node.js seems to have a decent little collection of tests but the driver seems a little more complicated then I what I want, since I'm just testing a binding, and I don't think wasm2c wants to depend on node. My first thought was to just replace their driver with a simple python test runner, and use the tests I need to give me adequate coverage.
So, not 100% clear on the path forward (I have a couple other ideas), but plan to have that worked out soon.
- If the goal is similar to wasm2native, I wonder if there's a simplification where instead of using preprocessor macros to parameterize on the module name, what if instead there's just a
wasm2native
shell script that runswasm2c --module-name=wasi-command
or something, and then the main.c program can have that hardcoded...?
Yup, that sounds good. Removing pre-processor stuff and hard coding names 👍 .
Moving wasm2native into /scripts and out of examples probably makes sense, then it can just be a thing that people can make use of, the additional parameter to name the resulting application sounds good.
- Does the wasm2native author have a view on us effectively upstreaming this?
I would be glad to know. I reached out to @vshymanskyy a couple days ago by email but haven't heard back yet. wasm2native abandoned wasm2c support sometime back in favor of w2c2, so I'm guessing its not a priority.
Smaller stuff:
- Instead of directly accessing the w2c_memory internal member of the instance struct, probably this should access it via the export. WASI commands are required to export their memory as "memory," so calling
Z_modnameZ_memory(instance)
will get you the exported memory.
👍
- I'm not sure we want to pollute the global namespace with all the Wasm type names (u8, s8, etc.) for anybody who includes wasm-rt.h. It might be better for uvwasi-rt.c to use standard C type names.
Sure, that seems like a good suggestion.
- Where does "Z_wasi_unstable" come from -- are there WASI modules in the wild that import from this modname?
Honestly, it was one of the few things that was there when I grabbed this code where I was like, huh, that's gross, I should probably look into that at some point, and moved on. There is probably some backstory there that I'm not aware of.
I just grepped through wasi-libc and it looks like its only using the preview1 symbols. I would be happy to rip out that extra pre-processor cruft.
- I'm a little unsure if we want to include the hello-wasi Makefile and C source here (i.e. a build system that compiles a C >program into a WASI command) -- that feels like a bigger enlargement in scope and leads me to start wanting to check, >e.g., whether these CFLAGS are really needed by clang "to play nice with wasm2c" (they haven't been in my experience) >but that doesn't even really feel like a WABT issue.
None of those flags are needed for this simple example, they just got carried along with the boilerplate from a previous makefile I used for this =/
Maybe this belongs more in a wasi-sdk tutorial, or somewhere that produces WASI commands? What are your thoughts?
I think taking out the stuff that builds .c => wasm makes sense for the example as it eliminates that dependency on the wasi-sdk. I think making this simpler and keeping it as the wasi example makes sense, and then turning build-standalone into wasm2native as a script also sounds good.
Can you rebase now that the bulk memory change has landed?
Can you rebase now that the bulk memory change has landed?
Sure.
@sbc100 sorry about the delayed response, I was filtering my git notifications into a lower priority folder because of a time in the past when there was a lot of traffic.
I have responded to all of @keithw's comments, except for the need for a test suite. I'm a little unsure of where to put this, since its a little bit out of step with the rest of the wabt tests. The wasm-c-api test script looks good, but those tests live in a third_part repo. If you have thoughts on where you would be happy with a directory of .c tests, that would be helpful, or I can put them in a submodule, whatever seems right.
I think aside from this, adding some security checks to uvwasi-rt.c, and updating the README is all I have left on my list.
On the testsuite, I think anything that gives confidence that the code works as intended (and will be robust to well-meaning but unintentionally buggy refactors) sounds fine to me. Maybe some options are:
-
a bunch of end-to-end tests where each one contains
- A "starting contents" of a directory
- A .wat file for a Wasm module that's a WASI command
- An intended "ending contents" of the directory
and the test jig verifies that if you run the WASI command in the starting directory, you end up with the intended ending contents. I guess this only really works for testing calls that manipulate the filesystem.
-
tests against a mock/stub version of uvwasi, where the test contains:
- A .wat file for a Wasm module that's a WASI command
- Some sort of test script that defines what uvwasi calls should be getting called, and what their return values are supposed to be
and the test jig verifies that if you run the WASI command, you get the correct sequence of calls to the mock uvwasi (the real uvwasi wouldn't be used here).
Probably the tests should include some "evil" WASI commands that try to put naughty parameters in (to make sure these are correctly caught).
On the testsuite, I think anything that gives confidence that the code works as intended (and will be robust to well-meaning but unintentionally buggy refactors) sounds fine to me. Maybe some options are:
a bunch of end-to-end tests where each one contains
- A "starting contents" of a directory
- A .wat file for a Wasm module that's a WASI command
- An intended "ending contents" of the directory
and the test jig verifies that if you run the WASI command in the starting directory, you end up with the intended ending contents. I guess this only really works for testing calls that manipulate the filesystem.
tests against a mock/stub version of uvwasi, where the test contains:
- A .wat file for a Wasm module that's a WASI command
- Some sort of test script that defines what uvwasi calls should be getting called, and what their return values are supposed to be
and the test jig verifies that if you run the WASI command, you get the correct sequence of calls to the mock uvwasi (the real uvwasi wouldn't be used here).
Probably the tests should include some "evil" WASI commands that try to put naughty parameters in (to make sure these are correctly caught).
If we could avoid using wat, and instead just have a C test suite that would be much less work for me to write, and for others to maintain. My thought was just to adapt tests from the node test suite (to remove node depedencies).
https://github.com/nodejs/node/tree/main/test/wasi/c
Each test could be stand-alone, consisting of a set of calls and some assertions to test that things are correct. The test runner could be a small variation on what wabt/test/run-c-api-examples.py does.
For the purposes of testing with malicous parameters, we could certain add tests that call wasi api's directly.
That sounds fine to me -- I guess the plan would be that the testsuite first compiles all these .c files to .wasm by using emscripten? If there's a pre-existing collection of WASI tests in C, I agree that's way better than writing them anew.
I don't think it's a problem to put these tests in this repo in some subdir (maybe test/wasm2c/wasi
?). Then I guess it's a question of making the appropriate edits to run-tests.py to add a new "tool" (or maybe just a new argument to run-spec-wasm2c.py
?) that compiles the C to Wasm, runs wasm2c, links against uvwasi-rt.o, runs the program, and verifies that the stdout is as expected and the program exits successfully. Which is already basically what run-spec-wasm2c.py does today, except for the C->Wasm compilation and linking against unwasi-rt.o...
(Separately, somehow we do have to deal with the build failures in CI to get this merged...)
That sounds fine to me -- I guess the plan would be that the testsuite first compiles all these .c files to .wasm by using emscripten? If there's a pre-existing collection of WASI tests in C, I agree that's way better than writing them anew.
Cool, sounds good. I think just using the clang/llvm compiler that comes as part of the wasi-sdk will be simpler. It seems like adding it as a dependency to the README would be the simplest way to include it.
I don't think it's a problem to put these tests in this repo in some subdir (maybe
test/wasm2c/wasi
?).
Cool, will do.
Then I guess it's a question of making the appropriate edits to run-tests.py to add a new "tool" (or maybe just a new >argument to
run-spec-wasm2c.py
?) that compiles the C to Wasm, runs wasm2c, links against uvwasi-rt.o, runs the program, and verifies that the stdout is as expected and the program exits successfully. Which is already basically what run-spec-wasm2c.py does today, except for the C->Wasm compilation and linking against unwasi-rt.o...
I was thinking to just copy and lightly modify wabt/test/run-c-api-examples.py e.g. run-uvwasi-tests.py, and adding another custom build target to CMakeLists.txt just like the run-c-api-tests. The only difference being that we can have it run a make file to build the tests before it runs them.
(Separately, somehow we do have to deal with the build failures in CI to get this merged...)
For sure.
wasi-sdk is great too, but, where do we pull that in from for the CI tests? (With emscripten I know there's an emsdk:latest Docker repo that wabt already uses...)
That sounds fine to me -- I guess the plan would be that the testsuite first compiles all these .c files to .wasm by using >emscripten? If there's a pre-existing collection of WASI tests in C, I agree that's way better than writing them anew.
Cool, I think I would just stick with the clang that comes with the wasi-sdk. No need to get involved with emscripten.
I don't think it's a problem to put these tests in this repo in some subdir (maybe
test/wasm2c/wasi
?). Then I guess it's a question of making the appropriate edits to run-tests.py to add a new "tool" (and maybe a new argument torun-spec-wasm2c.py
) that compiles the C to Wasm, runs wasm2c, links against uvwasi-rt.o, runs the program, and verifies that the output is as expected. (Which is already quite similar to what run-spec-wasm2c.py does except for the C->Wasm compilation and linking against unwasi-rt.o.)(Separately, somehow we do have to deal with the Windows build failures in CI to get this merged...)
wasi-sdk is great too, but, where do we pull that in from for the CI tests? (With emscripten I know there's an emsdk:latest Docker repo that wabt already uses...)
https://github.com/WebAssembly/wasi-sdk/releases
I'd rather not depend on emscripten or wasi-sdk here.
I think having them checked in here as wat files makes more sense, perhaps with script that can be used to update them using wasi-sdk?
Also, I don't think its important to run the entire wasi test suite, at least not initially. Just one or two basic tests to start with I think should be fine. Once we are happy with the approach we can consider expanding to a larger wasi test suite. When we do this we should co-ordinate the wasm-interp which can also run wasi and it would be good to test that method using the same test suite.
Does the wasm2native author have a view on us effectively upstreaming this?
Hey! I'm not opposed to this at all. Of course a proper attribution should be included.
Let me know if you have any questions!
I'd rather not depend on emscripten or wasi-sdk here.
I think having them checked in here as wat files makes more sense, perhaps with script that can be used to update them using wasi-sdk?
Also, I don't think its important to run the entire wasi test suite, at least not initially. Just one or two basic tests to start with I think should be fine. Once we are happy with the approach we can consider expanding to a larger wasi test suite. When we do this we should co-ordinate the wasm-interp which can also run wasi and it would be good to test that method using the same test suite.
This all sounds good to me.