
Host calls exposed via wasm function indirection tables are not re-entrant

Open shravanrn opened this issue 4 years ago • 6 comments

Context

This is part of a series of bugs that I spoke to @tyler and @pchickey about. We are currently using Lucet to sandbox libraries in C++ applications. The idea is that using a wasm-sandboxed version of a library ensures that a memory-safety issue in the library does not automatically become a memory-safety vulnerability in the full application. One of the consumers of this work is the Firefox web browser.

Problem

Currently, the hostcalls exposed via wasm function tables are not re-entrant. By re-entrancy, I mean the following scenario: a Wasm function calls into the host, which in turn calls another wasm function before returning.

For the library sandboxing use case, we use the function indirection table to make this work. There are some ABI translations involved, but this works well. However, if the host call is re-entrant (i.e., it tries to invoke another WASM function via the Lucet APIs), the Lucet runtime raises an error, because it believes we are already executing WASM code.
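The failure mode can be modeled in a few lines of plain Rust. This is a hypothetical sketch, not Lucet's actual implementation: the runtime tracks whether guest code is on the stack, and a hostcall that tries to re-enter the guest trips that check.

```rust
// Hypothetical model of a non-re-entrant runtime (not Lucet's real code).
#[derive(Debug, PartialEq)]
enum State {
    Ready,
    RunningGuest,
}

struct Runtime {
    state: State,
}

impl Runtime {
    fn new() -> Self {
        Runtime { state: State::Ready }
    }

    /// Enter guest code; fails if guest code is already on the stack,
    /// which is exactly what happens to a re-entrant hostcall.
    fn run_func(&mut self, guest_body: impl FnOnce(&mut Runtime)) -> Result<(), String> {
        if self.state != State::Ready {
            return Err("already executing WASM code".to_string());
        }
        self.state = State::RunningGuest;
        guest_body(self); // the guest may invoke hostcalls, which see `self` again
        self.state = State::Ready;
        Ok(())
    }
}
```

In this model, an outer `run_func` succeeds, but a nested `run_func` made from inside the guest body returns an error, which is the behavior described above.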

The reason we are using the function indirection table for host calls is that we cannot use the same approach as the Lucet code base itself. I believe Lucet specifies all host calls at compile time in a "bindings.json" file. However, in our case, the set of allowed host calls:

  1. can change dynamically, and
  2. are not static calls, but indirect calls (invoked via a function pointer).

One of the workarounds suggested was to have the re-entrant call use Vmctx::get_func_from_idx, which returns a FunctionHandle that we can then convert to a function pointer. I'm still not sure this can work... IIUC, this only exposes the indexes of functions that are part of the WASM function indirection table. That means the re-entrant call cannot invoke an arbitrary public function exposed by the WASM module; it can only invoke the subset of functions that happen to be in the function indirection table.

Actions

I think the best approach here is to support re-entrancy as part of run_func. @pchickey mentioned that there are some plans for this, which is fantastic. If it helps, I have implemented a patch to make the Lucet APIs re-entrant in my fork of Lucet (See here). Also, if there is something I can help with regarding this bug, please let me know.

shravanrn avatar Sep 05 '19 02:09 shravanrn

If we added a method to Vmctx such as:

```rust
fn get_export_func(&self, sym: &str) -> Result<FunctionHandle, Error>;
```

Would that solve the issue? There's no fundamental reason it's not there, we just haven't had the need for it ourselves.
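To make the proposal concrete, here is a hypothetical usage sketch reduced to a self-contained model. None of this is Lucet's real API: the `FunctionHandle` here is just a raw code pointer, and the cast back to a typed function pointer mirrors the get_func_from_idx/transmute pattern already used in the library-sandboxing work.

```rust
use std::mem;

// Invented stand-in for Lucet's FunctionHandle: just a code address.
struct FunctionHandle {
    ptr: *const u8,
}

// Stand-in for a compiled guest export with a known signature.
extern "C" fn guest_add(a: i32, b: i32) -> i32 {
    a + b
}

// Model of the proposed Vmctx method: resolve an export name to a handle.
fn get_export_func(sym: &str) -> Result<FunctionHandle, String> {
    match sym {
        "add" => {
            let f: extern "C" fn(i32, i32) -> i32 = guest_add;
            Ok(FunctionHandle { ptr: f as *const u8 })
        }
        _ => Err(format!("no export named `{}`", sym)),
    }
}

// The embedder side: the cast is unchecked, so the caller must know the
// export's true signature (the "convert to a function pointer" step).
fn call_add(handle: &FunctionHandle, a: i32, b: i32) -> i32 {
    let f: extern "C" fn(i32, i32) -> i32 = unsafe { mem::transmute(handle.ptr) };
    f(a, b)
}
```

The key property this models is that lookup is by export name rather than by indirection-table index, which is what the library-sandboxing use case needs.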

While we have been thinking about adding a run_func variant for Vmctx, we are not planning to make run_func reentrant. Most of the work done there is simply not necessary for calling back into guest code from a hostcall; we don't need to set up a new stack, install signal handlers, etc.

acfoltzer avatar Sep 05 '19 20:09 acfoltzer

Short term, something like that would get us close to handling this. It seems we would need both a public version of get_export_func and a way to call the returned handle: maybe something like a run_export_func that could do the wasm faulted-state checks, argument passing, etc., but elide the signal handler setup.

However, I guess the longer-term answer depends on what must happen in the host calls:

  • Signal handlers - if the signal handler must be disabled during hostcall invocation, then even the re-entrant version would have to reset signal handlers. It is not yet fully clear to me what the correct decision is about keeping or removing signal handlers during hostcall invocation, either for the general or the library sandboxing use case. I will be looking into this (in the context of library sandboxing) and can post my findings in case they are helpful.
  • Stack switch - if hostcalls have to do a stack switch, either to the embedding process's stack or, per the discussion in #118, onto a new stack altogether, then stack setup probably needs to happen on re-entrant calls as well?
  • Argument passing - I guess whether a re-entrant version of run_func (and, for that matter, run_func itself) handles argument passing depends on whether it is OK for the embedding application to rely on the ABI of the wasm function (i.e., the caller can assume the heap is the first parameter, followed by the WASM arguments, passed per the platform's calling convention).
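To make the ABI assumption in the last bullet concrete, here is a self-contained sketch of the convention a raw caller would have to mirror: the heap/vmctx pointer passed first, followed by the WASM arguments in the platform's C calling convention. The function and the two-byte "heap" are invented for illustration.

```rust
// Invented guest export following the assumed convention: the heap/vmctx
// pointer comes first, then the WASM arguments.
extern "C" fn guest_sum_first_two(heap: *const u8, extra: i32) -> i32 {
    // Read two bytes from the guest "heap" and add the WASM argument.
    unsafe { *heap as i32 + *heap.add(1) as i32 + extra }
}

// A raw caller that relies on that ABI directly, with no Val wrapping.
fn call_raw(heap: &[u8], extra: i32) -> i32 {
    guest_sum_first_two(heap.as_ptr(), extra)
}
```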

Any thoughts on how to proceed welcome!

shravanrn avatar Sep 06 '19 16:09 shravanrn

maybe something like a run_export_func that could do the wasm faulted-state checks, argument passing, etc., but elide the signal handler setup

If it's a method on Vmctx, we could elide the state checks as well, since we should always be in a Running state at that point.

  • Signal handlers - if the signal handler must be disabled during hostcall invocation, then even the re-entrant version would have to reset signal handlers. It is not yet fully clear to me what the correct decision is about keeping or removing signal handlers during hostcall invocation, either for the general or the library sandboxing use case. I will be looking into this (in the context of library sandboxing) and can post my findings in case they are helpful.

We would definitely appreciate any feedback you have on the signal handling mechanism, but here's an overview of how it works currently:

The Lucet signal handlers are installed whenever any Lucet instance is running in the process, because, at least with POSIX, we can only install one handler per signal per process.

The handler implementation is what we use to distinguish signals from guest code trapping vs. signals from something else going wrong in a hostcall, by checking whether the program counter is within the loaded .so.

The upshot for this issue is that we just shouldn't have to worry about installing or uninstalling handlers when we call from the guest into a hostcall, or from a hostcall into a guest; they should only be modified when entering or leaving the closure passed to Instance::with_signals_on().
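The program-counter check described above can be sketched as a small, self-contained model. The addresses and types are invented; this is not Lucet's actual handler code, just the classification logic it implements.

```rust
// Invented types modeling the dispatch: is the faulting PC inside the
// guest module's loaded code range?
struct LoadedModule {
    base: usize,
    len: usize,
}

#[derive(Debug, PartialEq)]
enum FaultOrigin {
    GuestTrap,   // PC inside the loaded .so: a guest trap, handled by the runtime
    HostOrOther, // PC elsewhere: a hostcall or unrelated code went wrong
}

fn classify_fault(module: &LoadedModule, pc: usize) -> FaultOrigin {
    if pc >= module.base && pc < module.base + module.len {
        FaultOrigin::GuestTrap
    } else {
        FaultOrigin::HostOrOther
    }
}
```

Because classification depends only on where the PC lands, not on how deep the guest/hostcall nesting is, re-entrant calls would not require reinstalling handlers under this scheme.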

  • Stack switch - If hostcalls have to do a stack switch either to the embedding process stack or per discussions on #118 , on to a new stack altogether, stack setup is probably something that needs to happen on re-entant calls as well?

For the cases where we switch back to the embedding/host stack, we don't have to worry about reentrancy because the previous call to run_func will have returned. Regarding #118, we're currently leaning toward a different solution that would allow guest code and hostcalls to continue coexisting on the same stack, so this will probably be a moot concern.

  • Argument passing - I guess whether a re-entrant version of run_func (and, for that matter, run_func itself) handles argument passing depends on whether it is OK for the embedding application to rely on the ABI of the wasm function (i.e., the caller can assume the heap is the first parameter, followed by the WASM arguments, passed per the platform's calling convention).

I think the way to handle this is to add a Vmctx::run_func() that does the right things ABI-wise. It would reintroduce the need to wrap arguments in Val, though, so we could continue to expose the raw function pointer to call unsafely if you know how to satisfy the ABI. Right now that's not too complicated, because it's just the heap pointer as the first argument, but we're considering adding other pinned registers, for example for counting instructions.
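The trade-off being described might look roughly like this. The `Val` variants and the add function are invented for illustration: the safe path tags arguments and checks them at runtime, while the raw function-pointer path skips that cost but requires the caller to match the ABI exactly.

```rust
// Invented Val-style argument wrapper and a checked entry point.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Val {
    I32(i32),
    F64(f64),
}

// Safe path: arguments are tagged and validated against the export's
// signature before the call, at some runtime cost.
fn run_guest_add(args: &[Val]) -> Result<Val, String> {
    match args {
        [Val::I32(a), Val::I32(b)] => Ok(Val::I32(a + b)),
        _ => Err("arguments do not match the export's signature".to_string()),
    }
}

// Unsafe path: a raw function following the assumed ABI (heap pointer
// first, then the WASM arguments), callable only if the embedder
// already knows that ABI.
extern "C" fn guest_add_raw(_heap: *const u8, a: i32, b: i32) -> i32 {
    a + b
}
```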

acfoltzer avatar Sep 06 '19 22:09 acfoltzer

Ah gotcha, I think I understand the context now. With the above considerations, it sounds like get_export_func and run_export_func (Vmctx::run_func) would be the right fit. In the interest of next steps, could you let me know if this is something I should contribute a patch for or if this is something that is already on your radar?

shravanrn avatar Sep 08 '19 00:09 shravanrn

I believe I'm hitting a similar wall here: Invalid state: Instance marked as in a hostcall while entering a hostcall.

Full panic:

```
thread 'main' panicked at 'Invalid state: Instance marked as in a hostcall while entering a hostcall.', lucet/lucet-runtime/lucet-runtime-internals/src/instance/execution.rs:394:17
stack backtrace:
   0: std::panicking::begin_panic
             at $HOME/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:519:12
   1: lucet_runtime_internals::instance::execution::KillState::begin_hostcall
             at lucet/lucet-runtime/lucet-runtime-internals/src/instance/execution.rs:394:17
   2: lucet_runtime_internals::instance::Instance::uninterruptable
             at lucet/lucet-runtime/lucet-runtime-internals/src/instance.rs:910:9
   3: lucet_vmctx_grow_memory
             at lucet/lucet-runtime/src/c_api.rs:441:1
   4: sbrk_1419
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fatal runtime error: failed to initiate panic, error 5
```

The hostcall invokes wasm code via vmctx.get_func_from_idx/transmute just fine, but it panics if the wasm code subsequently invokes another hostcall (in this case sbrk, to grow the heap).

Is there another way of doing this?

juancampa avatar Feb 27 '21 16:02 juancampa

There is no way around this restriction at the moment, and we are not planning to work on one. Lucet is in the process of transitioning all of its unique features into the Wasmtime project, and we recommend users transition to Wasmtime.

pchickey avatar Mar 02 '21 17:03 pchickey