mps
MPS does not support WebAssembly
This is a bug to track the addition of WebAssembly support.
There are at least 3 potential targets here. I think that the MPS should support 2 in the end:
- emscripten
- wasi
In both cases, the CPU is 'WebAssembly' and the compiler is most likely to be clang (llvm).
What should the platform name be for each?
Right now, I have:
+#elif defined(__EMSCRIPTEN__)
+#if defined(CONFIG_PF_STRING) && ! defined(CONFIG_PF_EMI3LL)
+#error "specified CONFIG_PF_... inconsistent with detected emi3ll"
+#endif
+#define MPS_PF_EMI3LL
+#define MPS_PF_STRING "emi3ll"
+#define MPS_OS_EM
+#define MPS_ARCH_I3
+#define MPS_BUILD_LL
+#define MPS_T_WORD unsigned long
+#define MPS_T_ULONGEST unsigned long
+#define MPS_WORD_WIDTH 32
+#define MPS_WORD_SHIFT 5
+#define MPS_PF_ALIGN 4
(similar to what I had before)
Clearly the i3 part is wrong ...
Tell me what you want here and I'll carry on with my changes (which so far work at least minimally to run the scheme example).
The architecture code selects architecture-specific code for things like stack scanning and the mutator context. See for example prmclii6.c. So if there is likely to be more than one similar "WebAssembly" architecture (e.g. a 64-bit variant), then pick a letter for the family and '3' for the 32-bit variant. Sadly 'W' and 'A' are already taken! J for JavaScript?
You'll need an OS code to select modules with memory mapping, locking, threading primitives, etc. (see Functional Modules). Again, are they all the same or more than one? What is the run-time environment called?
Please let us know if anything in Porting the MPS is unclear. We could fix it up as we go on your branch.
mps.c spells out how source code is selected based on platform code. You'll be adding sets to that.
I note that Porting the MPS doesn't mention CI and we definitely want CI coverage.
mps.c spells out how source code is selected based on platform code. You'll be adding sets to that.
Already doing that. That's how I got the scheme code running. :)
Great! If you have a working branch in a fork you could make a draft pull request to show stuff if you like.
This is to record some research and experimentation, not a request for feedback or further action at this point.
There are 2 main platforms for using WebAssembly at this point in time.
Emscripten allows compiling a wide set of C / C++ to run within the web browser and standalone VMs.
There's also WASI (http://wasi.dev) which is standardizing a number of system interfaces for use from within WebAssembly, particularly for use outside of the browser.
It is useful to just consider these as 2 separate "OSes" running atop the WebAssembly "CPU".
I've been doing my work with Emscripten and compiling there seems to work with some patches which I will submit as a draft PR in due course.
Looking at WASI, there's a separate SDK for this and it has different restrictions than Emscripten. The current restrictions are described at https://github.com/WebAssembly/wasi-sdk/blob/7ef7e948fa8c12e61c78b9584b29b7a41c611740/README.md#notable-limitations
The main one here that impacts this now is that setjmp is not supported yet. This is used as part of stack scanning in ss.h (mps.design.stack-scan).
The WASI SDK is available as a docker image, so I used that to get an idea of how things work and what might happen. Using this to just try a quick build of code/mps.c results in this:
❯ docker run -v `pwd`:/src -w /src ghcr.io/webassembly/wasi-sdk clang-15 --sysroot=/wasi-sysroot -c code/mps.c -o mps-wasi.o
In file included from code/mps.c:33:
In file included from code/mpsi.c:47:
In file included from code/mpm.h:31:
code/ss.h:24:10: fatal error: 'setjmp.h' file not found
#include <setjmp.h>
^~~~~~~~~~
1 error generated.
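For context on why the missing header matters, here is a minimal sketch of the usual setjmp trick in conservative stack scanning. Calling setjmp spills the saved register state into a jmp_buf on the stack, where a scanner can read it as candidate pointers. This is an illustration only, not the code in ss.h. The names visit_saved_context, word_visitor and count_word are made up for this example.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: receives each word found in the saved context. */
typedef void (*word_visitor)(uintptr_t word, void *closure);

/* Spill the register state into a jmp_buf with setjmp, then pass every
 * word-sized chunk of that buffer to the visitor. Returns the number of
 * words visited. A real scanner would go on to scan the rest of the stack. */
static size_t visit_saved_context(word_visitor visit, void *closure)
{
    jmp_buf context;
    size_t n = sizeof(context) / sizeof(uintptr_t);
    size_t i;

    assert(visit != NULL);
    memset(&context, 0, sizeof(context)); /* avoid reading indeterminate bytes */
    (void)setjmp(context);                /* only saves state; never longjmp'd to */

    for (i = 0; i < n; ++i) {
        uintptr_t word;
        memcpy(&word, (const char *)&context + i * sizeof(word), sizeof(word));
        visit(word, closure);
    }
    return n;
}

/* Example visitor: counts the words it is shown. */
static void count_word(uintptr_t word, void *closure)
{
    (void)word;
    ++*(size_t *)closure;
}
```

Without setjmp.h, a port needs some other way to get at register contents, or an argument that the target keeps no pointers in places the scanner cannot see.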
For now, I will stick with my investigation of things on the Emscripten side, where I have already seen things work.
The currently shipping versions of WebAssembly use 32-bit pointers and assume a (less than) 32-bit address space.
There's a 64-bit version of WebAssembly being worked on which uses 64-bit pointers and an address space larger than 32 bits.
For Emscripten, this can be targeted with the command line parameter: -s MEMORY64=1
Target detection is via preprocessor checks for these:
- __EMSCRIPTEN__: Emscripten is being used.
- __wasm__: This is a WebAssembly target.
- __wasm32__: This is a 32-bit WebAssembly target.
- __wasm64__: This is a 64-bit WebAssembly target.
The latter 3 are implemented in clang by the calls to defineCPUMacros in https://github.com/llvm/llvm-project/blob/df07a35912d78781ed6a62a7c032bfef5085a4f5/clang/lib/Basic/Targets/WebAssembly.cpp
The use of __EMSCRIPTEN__ is documented: https://emscripten.org/docs/compiling/Building-Projects.html
Also, if pthreads support is enabled in Emscripten, then __EMSCRIPTEN_PTHREADS__ will be defined. Otherwise, threading is not available.
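As a rough sketch of how these macros could be combined, the following uses only the predefined macros listed above. The function names (target_description, target_pointer_bits, target_has_pthreads, check_target) are hypothetical helpers for this example and are not part of the MPS.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Describe the target using only the compiler's predefined macros. */
static const char *target_description(void)
{
#if defined(__EMSCRIPTEN__) && defined(__wasm64__)
    return "Emscripten, 64-bit WebAssembly";
#elif defined(__EMSCRIPTEN__) && defined(__wasm32__)
    return "Emscripten, 32-bit WebAssembly";
#elif defined(__wasm__)
    return "WebAssembly, not Emscripten";
#else
    return "not WebAssembly";
#endif
}

/* Pointer width implied by the target macros, falling back to sizeof. */
static int target_pointer_bits(void)
{
#if defined(__wasm64__)
    return 64;
#elif defined(__wasm32__)
    return 32;
#else
    return (int)(sizeof(void *) * CHAR_BIT);
#endif
}

/* 1 if Emscripten was asked for pthreads support, 0 otherwise. */
static int target_has_pthreads(void)
{
#if defined(__EMSCRIPTEN_PTHREADS__)
    return 1;
#else
    return 0;
#endif
}

/* Sanity check: the macro-derived width should agree with the compiler. */
static void check_target(void)
{
    assert(target_pointer_bits() == (int)(sizeof(void *) * CHAR_BIT));
}
```

Testing __EMSCRIPTEN__ before the generic __wasm__ case matters, because an Emscripten build defines both.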
The main one here that impacts this now is that setjmp is not supported yet. This is used as part of stack scanning in ss.h (mps.design.stack-scan).
See #38 and job004158 although it's likely that setjmp will still make sense for some platforms.
Emscripten may or may not support threads, based on whether or not the right compiler flags are given:
This compiles without threads by default:
emcc -o mps.o -c code/mps.c
But this will have threads:
emcc -s USE_PTHREADS=1 -o mps.o -c code/mps.c
I don't want to use a separate target entirely for threads vs not-threads, so I only have a single block in code/mps.c to include the right files, and I can build with it including lockix.c.
lockix.c will fall back to not supporting threads if CONFIG_THREAD_SINGLE is defined, which changes things so that LOCK is not defined and LOCK_NONE is. In that case, it switches to the lockan.c implementation automatically.
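To illustrate the kind of compile-time selection being described, here is a minimal sketch of a lock interface that becomes a bookkeeping-only lock in a single-threaded build. It is not the actual lockix.c or lockan.c code. Every SKETCH_ and sketch_ name is hypothetical, and tying the choice directly to __EMSCRIPTEN_PTHREADS__ is an assumption made so the example stands alone.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: no Emscripten pthreads means single-threaded. */
#if !defined(__EMSCRIPTEN_PTHREADS__)
#define SKETCH_THREAD_SINGLE
#endif

#if defined(SKETCH_THREAD_SINGLE)

/* Single-threaded build: the "lock" only records whether it is held,
 * so misuse (double claim, release without claim) is still caught. */
typedef struct { int held; } sketch_lock;

static void sketch_lock_init(sketch_lock *l)    { l->held = 0; }
static void sketch_lock_claim(sketch_lock *l)   { assert(!l->held); l->held = 1; }
static void sketch_lock_release(sketch_lock *l) { assert(l->held); l->held = 0; }

#else

/* Threaded build: the same interface backed by a pthread mutex. */
#include <pthread.h>
typedef struct { pthread_mutex_t mutex; int held; } sketch_lock;

static void sketch_lock_init(sketch_lock *l)
{
    int res = pthread_mutex_init(&l->mutex, NULL);
    assert(res == 0); (void)res;
    l->held = 0;
}
static void sketch_lock_claim(sketch_lock *l)
{
    int res = pthread_mutex_lock(&l->mutex);
    assert(res == 0); (void)res;
    l->held = 1;
}
static void sketch_lock_release(sketch_lock *l)
{
    int res;
    l->held = 0;
    res = pthread_mutex_unlock(&l->mutex);
    assert(res == 0); (void)res;
}

#endif

/* Only meaningful when called by the thread that owns the lock. */
static int sketch_lock_is_held(const sketch_lock *l) { return l->held; }
```

Callers see the same three functions either way, which is what allows a single platform block to cover both the threaded and non-threaded Emscripten builds.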
Is this an acceptable use / abuse of CONFIG_THREAD_SINGLE?
#elif defined(__EMSCRIPTEN__) && defined(__wasm32__)
#if defined(CONFIG_PF_STRING) && ! defined(CONFIG_PF_EMJ3LL)
#error "specified CONFIG_PF_... inconsistent with detected emj3ll"
#endif
#define MPS_PF_EMJ3LL
#define MPS_PF_STRING "emj3ll"
#define MPS_OS_EM
#define MPS_ARCH_J3
#define MPS_BUILD_LL
#define MPS_T_WORD unsigned long
#define MPS_T_ULONGEST unsigned long
#define MPS_WORD_WIDTH 32
#define MPS_WORD_SHIFT 5
#define MPS_PF_ALIGN 4
#if !defined(__EMSCRIPTEN_PTHREADS__)
#define CONFIG_THREAD_SINGLE
#endif
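The numeric fields in this block are related: MPS_WORD_SHIFT is the base-2 logarithm of MPS_WORD_WIDTH, which is why a 32-bit word has a shift of 5. A small sketch of checking that relationship follows. The helper names word_shift_for_width and word_config_is_consistent are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Return the shift s such that (1u << s) == width_bits,
 * or -1 if width_bits is not a power of two. */
static int word_shift_for_width(unsigned width_bits)
{
    int shift = 0;

    if (width_bits == 0 || (width_bits & (width_bits - 1)) != 0)
        return -1;
    while ((1u << shift) < width_bits)
        ++shift;
    return shift;
}

/* Check a width/shift pair such as MPS_WORD_WIDTH / MPS_WORD_SHIFT. */
static int word_config_is_consistent(unsigned width_bits, int shift)
{
    return word_shift_for_width(width_bits) == shift;
}
```

For example, assert(word_config_is_consistent(32, 5)) holds for the values above, and a 64-bit variant built with MEMORY64 would need a width of 64 and a shift of 6.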
WebAssembly doesn't yet have the concept of mprotect or signals. This will make threading more difficult, although perhaps still possible via Emscripten's ASYNCIFY mode (but that will require research).
As such, the initial port to WebAssembly will be single-threaded and non-incremental.
Following up on the above, it is worth referencing https://github.com/WebAssembly/design/issues/1459 which has some relevant discussion about GC in WebAssembly and some discussion of threads.
A client is interested in exploring the possibility of building the MPS for WebAssembly in order to support a software product on many platforms. The purpose of this work is currently to work out what's feasible.
The feasibility work is essential. Full support is not essential at present.
Good thing I already have it running then! In boring single-threaded, non-incremental mode.
Indeed! I will talk to you about what it would take to go further and then I can write that up for the client.
Note for further discussion: it might be possible to simulate shields within a WebAssembly framework. This idea came up in discussion with a client earlier this week.