lucet icon indicating copy to clipboard operation
lucet copied to clipboard

Stack unwinding improvements

Open acfoltzer opened this issue 4 years ago • 8 comments

The main problem: resource leaks

If a guest terminates due to a signal or an explicit termination call rather than returning normally via lucet_context_backstop, the stack still needs to be unwound, otherwise resources such as RAII memory or locks can leak. Specifically, the "exception handling personality" functions associated with each stack frame must be called, which can only be done (I think) by the system unwinding runtime, such as libgcc (see section 6.2 of the AMD64 ABI).

This is currently only an issue for hostcall code in RAII style languages like Rust or C++, but will also be an issue for guest code once we support wasm exceptions. Handling resource leaks in hostcalls implemented in C is beyond the scope of this PR, as that will likely require in-band error signaling rather than relying on out-of-band exceptions.

A subproblem: debugging info

Runtime support for stack unwinding overlaps quite a lot with support for backtraces and debuggers, so a solution to the main unwinding problem should entail better backtraces and debugger navigation.

Undefined behavior

The approaches described below involve unwinding between guest code emitted by Cranelift, and hostcall code which is primarily (for now) written in Rust. Sayeth the Nomicon:

Rust's unwinding strategy is not specified to be fundamentally compatible with any other language's unwinding. As such, unwinding into Rust from another language, or unwinding into another language from Rust is Undefined Behavior. You must absolutely catch any panics at the FFI boundary! What you do at that point is up to you, but something must be done. If you fail to do this, at best, your application will crash and burn. At worst, your application won't crash and burn, and will proceed with completely clobbered state.

We need to study why this is undefined behavior, because I haven't yet found a fundamental reason for that to be the case, at least on x86-64 Linux.

Leading approach: unwinding smoothly from guest to host context

(As of Sept 6, 2019, at least)

Between the current contents of this branch and https://github.com/CraneStation/cranelift/pull/902, we now have the capability to unwind from a guest stack all the way back to the host stack that called the initial entrypoint.

The remaining bit of implementation is to induce unwinding when resetting or dropping an instance, so that we can clean up the stack left behind by a signal handler or a suspended instance that never resumes. The main complication for this is if the guest stack has overflowed (triggering a signal) or has almost overflowed; we'll need to be able to unprotect the guard page(s) in this case so that the unwind invocation doesn't fault.

Other approaches

We're still investigating which approach is actually the right one to use, but it's probably not the current approach.

Pervasive catching in hostcalls (current approach, possibly doomed)

Implemented in #157, this largely punts on the issue of unwinding and makes each hostcall responsible for catching exceptions and switching contexts to the host before the exception bubbles up through guest segments of the stack. The native unwinding mechanism of the hostcall implementation language is responsible for calling the personality functions.

The nested hostcall problem

This works reasonably well for the case where there's at most one hostcall in the stack, but as that PR notes:

...this only works for the case where we are in at most one hostcall context. If we have a stack with guest -> host -> guest -> host, the first host segment will not be properly unwound.

Unfortunately, this pattern is more common than it might seem at first, due to how we typically interact with guest memory managers in order to reserve space for host data to be copied into linear memory.

To use Terrarium as an example, when a guest asks for the headers of a request, the req_get_headers hostcall invokes the guest's malloc to reserve space in linear memory for copying in the header strings. The guest malloc in turn could invoke the host's grow_memory implementation, leaving us with a stack that has two guest segments and two hostcall segments: run (guest) -> req_get_headers (host) -> malloc (guest) -> grow_memory (host). If grow_memory were to try terminating the instance, any resources acquired by req_get_headers would leak.

We could probably get around this by keeping track of how many hostcalls we're in, and reraising the exception until it unwinds all of the hostcall segments, however this does not help at all with...

The signal handling problem

This approach only addresses panics that arise from hostcall code (specifically the panics we use to implement guest termination). It does not address at all what happens to the stack after running a signal handler, which can happen completely in guest code.

Suppose we have a guest function div0 that divides an integer by zero, and a stack like run (guest) -> hostcall_foo (host) -> div0 (guest). The SIGFPE raised by div0 will be handled by the Lucet signal handler, which will then use Context::set_from_signal() to clear the signal mask and swap back to the host context. Since no panic/exception is ever raised, there's not even an opportunity for the hostcall_foo segment of the stack to be unwound.

We definitely don't want to panic directly from a signal handler. Setting up the return from the signal handler to a landing pad that promptly panics might work, but seems fragile.

Unwinding smoothly from guest to host context (nice but not sufficient)

Edit: this has been promoted to the leading approach.

Why don't we put the catch_unwind in Instance::run(), rather than at the hostcall entrypoints? Because right now, the unwinding runtime bottoms out at lucet_context_backstop rather than following the call chain back to the host context. It looks like it should be possible using CFI directives to convince the runtime to follow the call chain back to the host context's saved stack pointer, but I haven't had any luck so far.

Getting this working would make it easier to deal with the panics we use to implement instance termination, and would improve debugger backtraces. It wouldn't address the signal handler case.

Manual unwinding from the host context (probably doomed)

libunwind lets us capture a context, and later step through its stack frames. We can even get the address of the personality function from unw_get_proc_info()! The hope was that with this information, we could manually unwind guest stacks after switching back to host code by stepping through the frames and invoking each personality function directly.

However, personality functions for x86-64 Linux are implemented in terms of functions provided by the unwinding runtime and described in section 6.2.5 of the AMD64 ABI. These functions all take an opaque pointer to an _Unwind_Context struct that's provided to the personality function by the unwinding runtime when it is unwinding in response to an exception. This appears to be the only way to get a valid _Unwind_Context pointer, and so we cannot get our hands on valid arguments for the personality function; they can only be provided by the system unwinding runtime.

Resuming a terminated guest at a function that panics

So, if we can't invoke the personality functions ourselves, we need to figure out a way to invoke the unwinding runtime while in the context of the guest we wish to unwind. We might be able to retain the guest Context after termination, and then swap into it with bootstrapping code that would land us in a function that panics or calls _Unwind_RaiseException() directly.

This would probably require adding some new interfaces to the Context API. Specifically, we do not currently have an interface for swapping to an existing guest context, but at a new function entrypoint. Instead, the entrypoint is set once during Context::init(), and the empty stack is carefully arranged to return into the entrypoint from lucet_context_bootstrap.

We'll need to do something like pushing an additional frame that would return to the panicking function, or pushing a synthetic frame and setting the instruction pointer to the panicking function directly. Since the panicking function won't need to take any arguments, we can probably avoid a lot of the complications that the init() interface has. This interface could also be useful for implementing a yield operator in the future.

We also would need to figure out how the unwinding would bottom out. Preferably, we could figure out "Unwinding smoothly from guest to host context", but an alternative would be to add a custom exception handler personality function to lucet_context_backstop. This function would use the _Unwind_* APIs to specify a landing pad back in the host.

Use panic/catch_unwind in hostcalls, and error propagation in guest code

Given the specter of undefined behavior mentioned above, we should also be considering solutions that do not rely on exceptions ever crossing the boundary between host and guest code. The current solution does exactly this, albeit with the limitations of a single host segment, and not addressing signal handlers.

Strawman: make :clap: guest :clap: funcs :clap: optional

Perhaps we could add the moral equivalent of Option<T> to the return type of each guest function. Or, for the Haskellers in the crowd, we could lift all guest code into the Maybe monad.

If a panic arises in a hostcall, we would have the existing catch_unwind, but rather than swapping back to the host context upon termination, it would return the moral equivalent of None to the caller.

Then in guest code, any time a call results in a None, we promptly return another None. The None would propagate until we reach host context, which would be doing the moral equivalent of .unwrap() on all calls to guest funcs.

This would propagate the panic between segments of host stack without actually relying on the unwinding runtime to cross language boundaries.

I'm not sure what the signal handling case would look like here, but perhaps we could arrange to return from the handler into the previous stack frame with a None?

acfoltzer avatar Jul 18 '19 01:07 acfoltzer

I noticed the commit adding eh_frame support. FYI, gimli supports writing eh_frame. Might be a bit of a larger dependency if you only need something minimal, just something to consider, but I think cranelift will eventually add this as a dependency for debuginfo too.

philipc avatar Aug 07 '19 04:08 philipc

oh that's super cool! last I'd looked gimli only knew how to read dwarf, not write :D the bulk of that commit is bumping cranelift to https://github.com/CraneStation/cranelift/commit/3f74458a4520b41e12823a373c743b9023d31cc3 which actually does nontrivial eh_frame stuff. It looks like gimli does all the things there, but better.

I only saw https://github.com/CraneStation/cranelift/issues/426 when looking for eh_frame-related conversation in cranelift - is there any reason I shouldn't rewrite my changes in terms of gimli and put up a cranelift PR soonish?

awortman-fastly avatar Aug 07 '19 04:08 awortman-fastly

There's some frame layout work in https://github.com/CraneStation/cranelift/pull/679, and https://github.com/CraneStation/wasmtime/pull/53. I'm not sure how eh_frame support in cranelift-faerie will tie in with that.

philipc avatar Aug 07 '19 04:08 philipc

:broom: :sparkles: I've rebased this branch and fixed a few instances of bitrot between then and now - the new unwinding tests are now run under MmapRegion and UffdRegion like any other tests.

The tests here will not pass as things stand: the only diff for this branch between "broken' and "working" is adding a features = ["unwind"] declaration to our cranelift-object dependency. That feature doesn't exist yet, but :crossed_fingers: the PR I'm putting together to add such a feature will go smoothly and it'll exist soonly. With the prototyping I've done, everything seems to be working, except some UffdRegion-based tests, which likely fail due to the very naively implemented *_redzone_stack functions there.

A few more details and updates from the OP:

Leading approach: unwinding smoothly from guest to host context

(As of Sept 6, 2019, at least)

This is almost the approach taken in this branch, though not as directly. We actually do...

Resuming a terminated guest at a function that panics

to initiate an unwind, and allow that to go up the guest stack to the host.

On the topic of mixed guest/host code:

Unfortunately, this pattern is more common than it might seem at first, due to how we typically interact with guest memory managers

We have, so far, successfully designed around this problem. As noted above,

We could probably get around this by keeping track of how many hostcalls we're in, and reraising the exception until it unwinds all of the hostcall segments,

which would be necessary in some form for timeouts to work correctly given guest callbacks, so this continues to be a plausible approach.

The signal handling problem

I don't think this is a problem anymore!

Lucet signal handler, which will then use Context::set_from_signal() to clear the signal mask and swap back to the host context.

We don't quite do this anymore, but the behavior now is equivalent.

Since no panic/exception is ever raised, there's not even an opportunity for the hostcall_foo segment of the stack to be unwound.

This circumstance should result in a State::Faulted of the guest, where the runtime can then figure out it should unwind from the inner guest, through hostcall frames, and up the rest of the stack.

Because right now, the unwinding runtime bottoms out at lucet_context_backstop rather than following the call chain back to the host context. It looks like it should be possible using CFI directives to convince the runtime to follow the call chain back to the host context's saved stack pointer, but I haven't had any luck so far.

@acfoltzer I think you have in fact had success doing this? moving the catch_unwind all the way out to Instance::run doesn't seem unreasonable now.

We'll need to do something like pushing an additional frame that would return to the panicking function, or pushing a synthetic frame and setting the instruction pointer to the panicking function directly.

And this is exactly how force_unwind works. :)

iximeow avatar Jun 19 '20 22:06 iximeow

Assigned myself just so it's somewhat harder for me to lose track of this. As the originator of the PR I can't mark myself as a reviewer :disappointed:

acfoltzer avatar Jun 25 '20 22:06 acfoltzer

This PR was closed as a byproduct of deleting the branch named master. If this is still an active PR, re-open as a new PR against main.

pchickey avatar Jun 26 '20 00:06 pchickey

🎂! totally forgot to celebrate this one 💔

iximeow avatar Aug 25 '20 22:08 iximeow

:birthday: :birthday: :tada:

iximeow avatar Jul 19 '21 06:07 iximeow