
Porting to webassembly

Open saolsen opened this issue 2 years ago • 87 comments

Hey,

I'm thinking about trying to get blink running in the browser (via webassembly). My goal isn't just to run C code in the browser (I could just use emscripten for that) but to actually run APE x64 executables in an interpreter where I can build a debugger and some visualization tools to see what's happening (similar to what blink already does). I'm wondering if anybody has tried to compile blink for webassembly, or what you think some of the challenges would be.

saolsen avatar Dec 04 '22 16:12 saolsen

Sounds awesome. I haven't tried it myself. I imagine the only major obstacle might be 32-bit. JavaScript only supports 32-bit integers. Blink runs on i386 and other 32-bit CPUs, and is regularly tested on them. However Blink was created in a 64-bit world, and as such, Blink has a 64-bit bias, therefore, you might not get optimal performance out of Blink in your 32-bit environments.
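
To make the 32-bit point concrete: in JavaScript the bitwise operators coerce their operands to 32-bit integers, plain Numbers are doubles that stay exact only up to 2^53 - 1, and BigInt is needed to model a full 64-bit register. (WebAssembly itself does have a 64-bit integer type; it is wasm32 pointers that are 32 bits wide.) The snippet below is only an illustrative sketch; `add64` is a hypothetical helper, not anything from Blink.

```javascript
// Bitwise operators coerce their operands to 32-bit integers,
// so bit 32 of this value is silently dropped.
const truncated = 0x100000001 | 0;

// Plain Numbers are IEEE 754 doubles: integers are exact only up to 2^53 - 1.
const maxSafe = Number.MAX_SAFE_INTEGER;

// Hypothetical helper: 64-bit wrapping addition modelled with BigInt.
function add64(a, b) {
  return BigInt.asUintN(64, a + b);
}

// Adding 1 to the largest 64-bit value wraps around to zero.
const wrapped = add64(0xffffffffffffffffn, 1n);

console.log(truncated, maxSafe, wrapped);
```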

jart avatar Dec 04 '22 21:12 jart

This would be awesome!!!

unicomp21 avatar Jan 04 '23 09:01 unicomp21

https://github.com/WebAssembly/tail-call/issues/15#issuecomment-1357820841

We're still waiting on tail calls for webassembly, lol. Wish there was another performant option.

unicomp21 avatar Jan 04 '23 10:01 unicomp21

Someday we'll be able to use co_await in webassembly, hope I'm still around to see it happen, lol.

https://github.com/WebAssembly/tail-call/issues/14#issue-1058389687

unicomp21 avatar Jan 04 '23 10:01 unicomp21

https://github.com/emscripten-core/emscripten/issues/10991#issuecomment-974226917

unicomp21 avatar Jan 04 '23 10:01 unicomp21

Blink doesn't need tail calls.

Also, good news! Blink is now stable on 32-bit platforms. I don't know if webassembly is multi-core, but the threading issues Blink was having earlier on 32-bit have been resolved.

Your biggest obstacle is most likely going to be completely replacing everything in blink/syscall.c so it interfaces with WASI or something similar instead.

jart avatar Jan 04 '23 10:01 jart

https://github.com/copy/v86

unicomp21 avatar Jan 04 '23 10:01 unicomp21

@gornishanov @tlively @madmongo1 @tomoliv30 any ideas on easiest way to implement linux syscall.c for webassembly?

https://github.com/jart/blink/issues/8#issuecomment-1370735621

https://github.com/WebKit/WebKit/pull/2065

unicomp21 avatar Jan 04 '23 10:01 unicomp21

Dumb/crazy question, what sort of perf hit if blink runs nested within itself? I'm wondering if the outer vm could implement syscall.c for the inner vm? And handle threading etc. using co_await? Then outer vm could provide stupid simple interfaces for tunneling packets, etc.?

unicomp21 avatar Jan 04 '23 11:01 unicomp21

It's possible to run Blink within Blink. There's noticeable slowdown, but it's not a showstopper. There are caveats: only one of the nested Blink instances can make use of the linear memory optimization. The other nestings need to pass blink -m to turn it off, otherwise the memory allocations will collide.

jart avatar Jan 04 '23 14:01 jart

any ideas on easiest way to implement linux syscall.c for webassembly?

emscripten provides a surprisingly wide set of runtime APIs which might even be enough to "just work" already.

Vogtinator avatar Jan 04 '23 14:01 Vogtinator

Also, good news! Blink is now stable on 32-bit platforms. I don't know if webassembly is multi-core, but the threading issues Blink was having earlier on 32-bit have been resolved.

With web workers and SharedArrayBuffers it's possible to effectively have threads, and emscripten even exposes those through pthreads as much as possible.

any ideas on easiest way to implement linux syscall.c for webassembly?

emscripten provides a surprisingly wide set of runtime APIs which might even be enough to "just work" already.

I gave it a quick try and it does almost build out-of-the-box with just emmake make -j8. It just complains about sa_len and wait4 missing. With those worked around I get a blink.wasm of unknown quality, not tested.

Vogtinator avatar Jan 04 '23 15:01 Vogtinator

I got it to compile too. When I tried to run Blink in Node, this happened, and I have no idea what it means.

master jart@turfwar:~/blink$ node o//blink/blink
/home/jart/blink/o/blink/blink:2148
  function ___invoke_$struct_Machine*_$struct_System*_$struct_Machine*(
                                    ^

SyntaxError: Unexpected token '*'
    at wrapSafe (internal/modules/cjs/loader.js:915:16)
    at Module._compile (internal/modules/cjs/loader.js:963:27)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
    at Module.load (internal/modules/cjs/loader.js:863:32)
    at Function.Module._load (internal/modules/cjs/loader.js:708:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:60:12)
    at internal/main/run_main_module.js:17:47

I configured the Makefile to build an HTML file and ran it in the browser. I got the same thing.

image

I'm going to push the fixes I made right now. Could you please help me figure out what's wrong?

jart avatar Jan 04 '23 17:01 jart

You might need newer LLVM or emscripten: https://github.com/emscripten-core/emscripten/issues/12551

I'm using emscripten main with clang 15.0.6 (should be 16 meanwhile, but it works :shrug:)

Vogtinator avatar Jan 04 '23 17:01 Vogtinator

Works!

/tmp/blink> node o/blink/blink -m /cwd/third_party/cosmo/tinyhello.elf
hello world

The WASM page size is 64KiB, which means that https://github.com/jart/blink/issues/14 happens, but by doing the awful hack of just pretending that it's actually 4096 it can even run busybox-static from the host here:

/tmp/blink> (cd /usr/bin; node /tmp/blink/o/blink/blink -m /cwd/busybox-static sh -c "echo Hello world!")
I2023-01-04T19:25:47.197000:blink/syscall.c:2610:42 missing syscall 0x111
I2023-01-04T19:25:47.198000:blink/syscall.c:2610:42 missing syscall 0x14e
warning: unsupported syscall: __syscall_prlimit64

I2023-01-04T19:25:47.200000:blink/syscall.c:1857:42 getrandom() flags not supported yet
Hello world!
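
The page-size mismatch can be illustrated with a little arithmetic. This is a sketch under the assumption that the problem is the general one of a guest expecting 4096-byte granularity on a host whose pages are larger; it does not restate the linked issue. The address `0x401000` and the helper names are made up for the example.

```javascript
const GUEST_PAGE = 4096;      // page size an x86-64 Linux guest expects
const WASM_PAGE = 64 * 1024;  // WebAssembly linear-memory page size

// True if addr sits exactly on a page boundary.
function isAligned(addr, pageSize) {
  return addr % pageSize === 0;
}

// Round an address down to the start of the page containing it.
function alignDown(addr, pageSize) {
  return addr - (addr % pageSize);
}

const guestAddr = 0x401000; // hypothetical address the guest wants mapped

console.log(isAligned(guestAddr, GUEST_PAGE)); // true
console.log(isAligned(guestAddr, WASM_PAGE));  // false
console.log(alignDown(guestAddr, WASM_PAGE).toString(16)); // 400000
```

An address that is perfectly valid at 4096-byte granularity falls in the middle of a 64KiB page, which is why pretending the page size is 4096 gets things running at all.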

It did require some hacks and workarounds though:

  • Expose the working directory in the virtual filesystem below /cwd
  • emscripten does not pass envp to main, use environ instead (crashes in LoadArgv otherwise)
diff --git a/blink/blink.c b/blink/blink.c
index f6c0506..a241fd0 100644
--- a/blink/blink.c
+++ b/blink/blink.c
@@ -155,8 +155,10 @@ static void HandleSigs(void) {
   unassert(!sigaction(SIGSEGV, &sa, 0));
 #endif
 }
-
+#include <emscripten.h>
 int main(int argc, char *argv[], char **envp) {
+  EM_ASM({FS.mkdir('/cwd'); FS.mount(NODEFS, {root : '.'}, '/cwd');});
+  if(!envp) envp = environ;
   g_blink_path = argc > 0 ? argv[0] : 0;
   GetOpts(argc, argv);
   if (optind_ == argc) PrintUsage(argc, argv, 48, 2);
  • Function pointers work differently, they don't really contain an address so comparing > 4096 fails:
diff --git a/blink/debug.h b/blink/debug.h
index a996413..5e64aa9 100644
--- a/blink/debug.h
+++ b/blink/debug.h
@@ -10,7 +10,7 @@
 #define IB(x)                      \
   __extension__({                  \
     __typeof__(x) x_ = (x);        \
-    unassert((intptr_t)x_ > 4096); \
+    unassert((intptr_t)x_ > 0); \
     x_;                            \
   })
 #else
  • emscripten's mmap does not support address hints:
diff --git a/blink/memorymalloc.c b/blink/memorymalloc.c
index 3d0113b..b0b9266 100644
--- a/blink/memorymalloc.c
+++ b/blink/memorymalloc.c
@@ -64,6 +64,10 @@ void FreeBig(void *p, size_t n) {
 
 void *AllocateBig(size_t n) {
   void *p;
+#ifdef __EMSCRIPTEN__
+  p = Mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0, "big");
+  return p != MAP_FAILED ? p : 0;
+#endif
   u8 *brk;
   if (!(brk = atomic_load_explicit(&g_allocator.brk, memory_order_relaxed))) {
     // we're going to politely ask the kernel for addresses starting

I built it with emmake make -j8 && emcc -g o//blink/blink.o o//blink/blink.a -lm -pthread -lrt -o o//blink/blink -s INITIAL_MEMORY=1073741824 -s EXIT_RUNTIME=1 -lnodefs.js

Vogtinator avatar Jan 04 '23 18:01 Vogtinator

Wow. I'm still catching up. Quick question. Did you have any problems with wait4? My build is complaining about that being undefined.

jart avatar Jan 04 '23 18:01 jart

I made a bunch more changes and I'm now blocked on this error.

master jart@turfwar:~/blink$ node o//blink/blink
requested a shared WebAssembly.Memory but the returned buffer is not a SharedArrayBuffer, indicating that while the browser has SharedArrayBuffer it does not have WebAssembly threads support - you may need to set a flag
(on node you may need: --experimental-wasm-threads --experimental-wasm-bulk-memory and/or recent version)
/home/jart/blink/o/blink/blink:163
      throw ex;
      ^

Error: bad memory
    at Object.<anonymous> (/home/jart/blink/o/blink/blink:820:13)
    at Module._compile (internal/modules/cjs/loader.js:1085:14)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
    at Module.load (internal/modules/cjs/loader.js:950:32)
    at Function.Module._load (internal/modules/cjs/loader.js:790:12)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:76:12)
    at internal/main/run_main_module.js:17:47
$?=7 master jart@turfwar:~/blink$ type node
node is hashed (/home/jart/vendor/emsdk/node/14.18.2_64bit/bin/node)

I got a little further in the browser. Not sure what to do next.

image

jart avatar Jan 04 '23 19:01 jart

Wow. I'm still catching up. Quick question. Did you have any problems with wait4? My build is complaining about that being undefined.

Yeah, I had to stub that out. FWICT that's a bug in emscripten in some way, as it's available in the system headers and there's also a stub for __syscall_wait4

I made a bunch more changes and I'm now blocked on this error.

Might just be the version of node, it might not support SharedArrayBuffer backed memory for WASM. I'm using v19.3.0 here.
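
The error text above more or less describes the check being made: a shared WebAssembly.Memory is requested and its buffer is expected to be a SharedArrayBuffer. Here is a small probe you could run under a given node binary to see whether it passes. It is written from that error message, not copied from emscripten's generated code, and `hasWasmThreads` is a hypothetical name.

```javascript
// Returns true if this runtime hands back a SharedArrayBuffer for a
// shared WebAssembly.Memory, false if it does not or if the request throws.
function hasWasmThreads() {
  try {
    const memory = new WebAssembly.Memory({
      initial: 1,   // one 64KiB page is enough for a probe
      maximum: 1,   // shared memories must declare a maximum
      shared: true,
    });
    return memory.buffer instanceof SharedArrayBuffer;
  } catch (err) {
    // Covers runtimes that reject shared memory or lack SharedArrayBuffer.
    return false;
  }
}

console.log("wasm threads supported:", hasWasmThreads());
```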

I got a little further in the browser. Not sure what to do next.

That's where the "fun" part starts: Somehow making use of blink.js in the web page. Here's a minimal PoC:

pre.js:

// Copy the file chosen in the <input> into emscripten's in-memory filesystem.
function fileLoad(event, filename) {
    var file = event.target.files[0];
    var reader = new FileReader();
    reader.onloadend = function(event) {
      if(event.target.readyState == FileReader.DONE)
        FS.writeFile(filename, new Uint8Array(event.target.result), {encoding: 'binary'});
    };
    reader.readAsArrayBuffer(file);
}

let fileInput = document.createElement("input");
fileInput.setAttribute("type", "file");
// Take the event as a parameter instead of relying on the global window.event.
fileInput.onchange = (event) => { fileLoad(event, "executable"); };
document.body.appendChild(fileInput);

// Run blink's main() on the uploaded file once the user clicks Start.
let button = document.createElement("button");
button.innerText = "Start";
button.onclick = () => { Module.callMain(["executable"]); };
document.body.appendChild(button);

Build with emcc o//blink/blink.o o//blink/blink.a -lm -pthread -lrt -o o//blink/blink.html -s INVOKE_RUN=0 -s EXPORTED_RUNTIME_METHODS=callMain -s INITIAL_MEMORY=1073741824 -s EXIT_RUNTIME=1 --emrun --pre-js pre.js

At some point the best option is probably to expose some kind of API to JS, depending on what the actual use cases are.

Vogtinator avatar Jan 04 '23 20:01 Vogtinator

@Vogtinator @jart you guys are amazing. This is great!

unicomp21 avatar Jan 05 '23 11:01 unicomp21

Well done guys! Very Cool!

ghost avatar Jan 05 '23 12:01 ghost

This is awesome, WILL hack with this :D

Rucadi avatar Jan 05 '23 16:01 Rucadi

please someone host their compiled blink.wasm !

pannous avatar Jan 06 '23 13:01 pannous

please someone host their compiled blink.wasm !

Hmmmmm........

ghost avatar Jan 06 '23 14:01 ghost

I pushed a github workflow for emscripten HTML builds: https://github.com/Vogtinator/blink/actions/workflows/emscripten.yml

To build for node instead, uncomment the NODEFS mounting in blink/web.h and build with emcc -O2 o//blink/blink.o o//blink/blink.a -lm -pthread -lrt -o o//blink/blink -s INITIAL_MEMORY=1073741824 -s EXIT_RUNTIME=1 -lnodefs.js.

Vogtinator avatar Jan 06 '23 15:01 Vogtinator

@derekcollison I'm wondering if there could be implications/synergy here for nats? ie tunneling tcp syscalls, etc.?

unicomp21 avatar Jan 06 '23 20:01 unicomp21

@derekcollison I'm wondering if there could be implications/synergy here for nats? ie tunneling tcp syscalls, etc.?

How so? Maybe running a nats-server in the browser?

derekcollison avatar Jan 06 '23 20:01 derekcollison

Yes, or the networking layer for many vm's running in browsers?

unicomp21 avatar Jan 06 '23 20:01 unicomp21

please someone host their compiled blink.wasm !

You might want to try this, based on Vogtinator's workflow.

trungnt2910 avatar Jan 13 '23 11:01 trungnt2910

Tweeted https://mobile.twitter.com/JustineTunney/status/1613895681038770182

@trungnt2910 @Vogtinator Would you both be interested in upstreaming your work? I've just added support for GitHub Actions today. We could add a WASM workflow for example.

One thing I'd especially like to see, is some kind of ANSI code support, so we can render the Blinkenlights TUI in the browser so that people don't need to run it locally to use it.

jart avatar Jan 13 '23 13:01 jart

One thing I'd especially like to see, is some kind of ANSI code support, so we can render the Blinkenlights TUI in the browser so that people don't need to run it locally to use it.

Getting that to work might not be trivial. emscripten can't do any long-running work on the main thread as it blocks the browser, so waiting for input in native code just does not work. Either the TUI would have to be rewritten to work async or it has to run in a web worker and somehow communicate with the main thread for IO.
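
For the web worker route, one common pattern is a small mailbox in a SharedArrayBuffer: the page's main thread stores a key and notifies, while the worker running the emulator blocks in Atomics.wait until something arrives. The following is a rough sketch of that idea only, not a design for Blink; all names are hypothetical, the mailbox holds a single byte (a second post before the read would overwrite the first), and browsers generally require the page to be cross-origin isolated before SharedArrayBuffer is available.

```javascript
// Shared Int32Array layout: [0] = flag (0 empty, 1 full), [1] = byte value.
function createMailbox() {
  return new Int32Array(new SharedArrayBuffer(8));
}

// Main-thread side: publish one input byte and wake a waiting reader.
function postByte(mailbox, byte) {
  Atomics.store(mailbox, 1, byte);
  Atomics.store(mailbox, 0, 1);
  Atomics.notify(mailbox, 0, 1);
}

// Worker side: block until a byte is available, then consume it.
function readByteBlocking(mailbox) {
  while (Atomics.load(mailbox, 0) === 0) {
    // Sleeps only while the flag is still 0; returns at once otherwise.
    Atomics.wait(mailbox, 0, 0);
  }
  const byte = Atomics.load(mailbox, 1);
  Atomics.store(mailbox, 0, 0); // mark the slot empty again
  return byte;
}

// Single-threaded demonstration: the byte is already posted, so the
// read returns without ever sleeping.
const mailbox = createMailbox();
postByte(mailbox, 65);
console.log(readByteBlocking(mailbox)); // 65
```

In a real page the Int32Array would be sent to the worker with postMessage, and only the worker would call `readByteBlocking`, since browsers do not allow Atomics.wait on the main thread.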

Vogtinator avatar Jan 13 '23 13:01 Vogtinator