wasmex icon indicating copy to clipboard operation
wasmex copied to clipboard

Underlying Wasm runtime, wasmer -> wasmtime

Open brooksmtownsend opened this issue 2 years ago • 11 comments

Slightly related to #282

So wasmer is the underlying runtime at the moment for wasmex, would you consider switching the underlying runtime of this project to wasmtime like https://github.com/viniarck/wasmtime-ex has?

We've loved using this project and are looking to take advantage of the features that come first to wasmtime, and in a perfect world this would be done through swappable runtimes like #282. In the shorter term, is there any reservation to changing the runtime?

Thanks @tessi 🚀

brooksmtownsend avatar Jun 09 '22 20:06 brooksmtownsend

Thanks for asking! 💜 I would love to have wasmtime as the first other engine. I even have a local (currently not working) fork porting wasmex to wasmtime - I think having a fork is the first step to figure out the differences we need to be aware of when implementing #282

The only problem is my limited free time. Do you have the time to take over and/or help here?

tessi avatar Jun 10 '22 06:06 tessi

@tessi would love to help out here, especially if you're close! Would you be willing to publish that fork and give a quick TL;DR on what you've tried and what's not working?

brooksmtownsend avatar Jun 13 '22 13:06 brooksmtownsend

Hey @brooksmtownsend 👋 sorry for letting you hang so long. But I have mildly good news for a wasmtime fork of wasmex. I used some of my vacation time to get my fork working enough to make it public. Please find it here.

It turned out to be a way bigger refactoring than anticipated and, unfortunately, it is not fully working at the moment. This is where I could really use some of your help :)

  • the biggest issue right now is a deadlock where I am not sure how to solve it best. The reason is that wasmtime needs a store (a place that hold memory for WASM instances/modules) for almost all API calls. I added elixir APIs to create new stores so that we can pass down the store to all wasm calls. The problem is that we need to place the store (on the rust side) behind a mutex to make it thread safe. Now, when calling a WASM function we hold a mutex-lock on the store. When that WASM function call requires a callback (an elixir func being executed), that elixir function callback needs the same store for many API calls - e.g. manipulating WASM memory. But these API calls fail since we cannot unlock the mutex again. It's hard to explain, but there is a test that runs this scenario and results in the described deadlock. If you could solve that (and I don't know how exactly yet) that would be huge
  • I haven't tested (or fixed tests for) WASI support yet. The same probably applies for other areas (std in/out/err, pipes).
  • docs are blindly copied from wasmex and are half-wrong after the refactoring

tessi avatar Jul 27 '22 21:07 tessi

Regarding the store-deadlock problem, I just realized that our wrapped functions receive a Caller which happens to implement AsContextMut/AsContext and can act like a Store when called within the function scope.

I think we need to find a way to extract this caller into the elixir fn callback_context that we create here.If we have that done, we could pass the caller (acting as a store) instead of the original deadlocked store. I stumbled across lifetime issues when attempting that, but maybe you can figure out a way to forward the Caller to Elixir somehow?

tessi avatar Aug 04 '22 07:08 tessi

Maybe we can get inspiration from wasmtime-py, they interface the C API, but they do exactly what I proposed above. They take the caller and pass them as a reference to the python function. I guess they can get around lifetime issues since C doesn't have lifetimes :D but maybe there is a (unsafe?) Rust way to do similar.

tessi avatar Aug 04 '22 07:08 tessi

Thank you for the great updates @tessi, I'm following along here but haven't had much free time to look into it yet

brooksmtownsend avatar Aug 11 '22 15:08 brooksmtownsend

no worries. I'm in the same boat :)

tessi avatar Aug 12 '22 13:08 tessi

I think i can report progress. I just pushed another (still very broken) version of wasmex-wasmtime which introduces a new concept

  • during function execution we get the caller struct passed from wasmtime, that caller is only valid during function execution and can not safely be passed into elixir land because of lifetimes
  • I introduced a "caller registry" where a caller reference (using an unsafe pointer hack) can be stored in exchange for a "token". That token can be passed up into elixir land. We wrap that token into a StoreOrCaller struct because stores and callers can be passed interchangeably into many functions (e.g. Memory.read).
  • from elixir you can now use that StoreOrCaller within a function call to read/write memory (or do other things where you'd normally need a store) without running into deadlocks.

So far for the theory :D In practice, when doing the above, wasmtime panics because it thinks we're using a different store somehow and I don't know why yet.

I added some debug code here which illustrates the problem and pointer-hack, but simplifies it as much as possible.

Within a wrapped function, I can read memory using the originally provided caller:

let mut buffer = [0];
memory.read(&caller, 0, &mut buffer).unwrap();
println!("buffer 1: {:?}", buffer);

however, after doing that pointer hack - which should give us the same caller because we read from the same memory address - we get a panic:

let raw_ptr: *mut Caller<StoreData> = &mut caller as *mut Caller<StoreData>;
let the_other_caller: &mut Caller<StoreData> = unsafe { &mut *raw_ptr as &mut Caller<StoreData> };

memory.read(the_other_caller, 0, &mut buffer).unwrap();
println!("buffer 2: {:?}", buffer);

// thread '<unnamed>' panicked at 'object used with the wrong store',
// /Users/tessi/.cargo/registry/src/github.com-1ecc6299db9ec823/wasmtime-0.39.1/src/store/data.rs:258:5

you can play with that scenario, when checking out main and running RUST_BACKTRACE=1 mix test test/wasmex_wasmtime_test.exs:125

I hope someone reading this can help me find out why wasmtime thinks this is a new/different store and maybe even fix the problem 🙏

tessi avatar Aug 31 '22 08:08 tessi

Good news! together with @hamptokr we could fix the memory problem and reading/writing memory from within function calls now fully works. In fact, all tests - with the exception of WASI tests as I didn't look at WASI yet - work \o/

Next steps: Fixing WASI and cleaning things up

tessi avatar Sep 01 '22 19:09 tessi

Keeping you updated here: WASI seems to work roughly - I still have to fix file access permissions, but 5/10 wasi tests work. This is getting closer :)

tessi avatar Sep 05 '22 19:09 tessi

Perfect timing here, Wasmtime just announced 1.0! Very exciting 😄

brooksmtownsend avatar Sep 21 '22 18:09 brooksmtownsend

super exciting indeed! :)

I worked some more on the wasmtime fork and all test are green now 🎉

Screenshot 2022-10-06 at 18 55 13

However, it works only on my wasmtime fork. Next task is to contribute my wasmtime changes back so we can have proper access permissions for preopened directories. I'm currently asking the wasmtime folks how to contribute my patch back here.

tessi avatar Oct 06 '22 16:10 tessi

However, @brooksmtownsend it would be great if you could give that fork a try and see if it works for you.

Some things (especially the WASI API) changed, but I hope the changes are easy to spot reading the source. But if not, please, ask :) Writing docs and a migration guide is another open TODO.

tessi avatar Oct 06 '22 16:10 tessi

Next task is to contribute my wasmtime changes back so we can have proper access permissions for preopened directories.

Turns out that the wasmtime folks recommended to not have access permissions for WASI preopens yet but rather wait it to be implemented "proper" in the future (ticket).

So I removed use of my custom wasmtime fork and instead upgraded to wasmtime 4.0. Any preopened directory will come with full access permissions until wasmtime allows us to restrict access.

In addition, I moved all WASI opts (but especially the directory preopen options) into Elixir structs. This makes it easier to upgrade them in the future because we will get proper compile-time errors whith missing or changed keys (e.g. for those future access permissions). This probably even improves usability.

tessi avatar Dec 30 '22 00:12 tessi

@brooksmtownsend did you have the time to try things out yet? No worries if not :)

I'll soon merge my work back into this repo (in some dev-branch, I guess) to prepare docs, polish, and release the new version

tessi avatar Dec 30 '22 00:12 tessi

The MR that brings the changes back is here and looks good. Docs are updated, changelog is written.

tessi avatar Jan 02 '23 09:01 tessi

Alright, merged the MR 🚢

tessi avatar Jan 03 '23 09:01 tessi