Allowing compilation of a static library in lucet

Open shravanrn opened this issue 4 years ago • 6 comments

Context: This is part of a series of bugs that I spoke to @tyler @pchickey about. We are currently using Lucet to sandbox libraries in C++ applications. The idea behind this is that using a wasm sandboxed version of the library allows ensuring that a memory safety issue in the library does not automatically result in a memory safety vulnerability in the full application. One of the consumers of this work is the Firefox web browser.

Problem: Currently lucet applications are compiled as shared object files that must have a main function. However, these restrictions make static linking of libraries very challenging. I thus wanted to start a discussion about how to allow static linking of binaries produced by lucet.

The most obvious changes that come to mind are

  • Removing the requirement of having a main function
  • Symbol name collisions - when linking multiple binaries generated via lucet, both modules would have a lucet_module_data symbol which would collide.
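To make the collision concrete, here is a minimal sketch (not lucetc or linker code) that reports which global symbol names are defined by more than one module. The function name `colliding_symbols` and the module and symbol lists passed to it are hypothetical; only `lucet_module_data` comes from the discussion above.

```rust
use std::collections::HashMap;

/// Returns, in sorted order, every symbol name that appears more than once
/// across the given modules. Each entry in `modules` pairs a module name
/// with the global symbols it defines. This only models the duplicate
/// check a static linker performs on global definitions; it is an
/// illustration, not part of Lucet.
pub fn colliding_symbols(modules: &[(&str, Vec<&str>)]) -> Vec<String> {
    // Count how many times each symbol name is defined.
    let mut definitions: HashMap<&str, usize> = HashMap::new();
    for (_module_name, symbols) in modules {
        for sym in symbols {
            *definitions.entry(*sym).or_insert(0) += 1;
        }
    }

    // Keep only the names defined more than once.
    let mut duplicates: Vec<String> = definitions
        .into_iter()
        .filter(|(_, count)| *count > 1)
        .map(|(sym, _)| sym.to_string())
        .collect();
    duplicates.sort();
    duplicates
}
```

With two modules that each define `lucet_module_data` plus one uniquely named function, the only name reported is `lucet_module_data`.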

Actions: Any thoughts on the feasibility of this request would be fantastic. Additionally, if we can land on a design, I would be happy to contribute a patch implementing the same.

shravanrn avatar Sep 05 '19 02:09 shravanrn

  • Removing the requirement of having a main function

I don't believe this is currently a requirement, except that lucet-wasi defaults to running _start. You can use lucet-wasi --entrypoint foo to pick a different entrypoint.

  • Symbol name collisions - when linking multiple binaries generated via lucet, both modules would have a lucet_module_data symbol which would collide.

Yeah, we definitely would need to configure lucetc to emit different symbol names for these. There also will probably be some issues with the relocations for initial table values; @awortman-fastly knows much more about that than I do, though.

acfoltzer avatar Sep 05 '19 21:09 acfoltzer

Symbol name collisions - when linking multiple binaries generated via lucet, both modules would have a lucet_module_data symbol which would collide.

Ooh, statically linking multiple lucet modules! That's a cool problem :)

Yeah, we definitely would need to configure lucetc to emit different symbol names for these.

I think something like a user-supplied prefix (or use the name of the input .wasm) to namespace the symbols we add would work well. That should be pretty easy to teach lucetc. I think the harder part would be indicating a non-default symbol for the module's metadata to anything runtime-side - which table to look into to resolve exports, things like that.
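The namespacing idea could look roughly like the sketch below. It is an assumption about one possible shape, not how lucetc works: `sanitize_prefix` and `namespaced_symbol` are hypothetical helpers, and the `<prefix>_<symbol>` layout is just one choice of convention.

```rust
/// Turns an arbitrary string (for example the stem of the input .wasm
/// file name) into something usable as the start of a C identifier:
/// every character that is not an ASCII letter, digit, or underscore
/// becomes an underscore, and a leading underscore is added if the
/// result is empty or starts with a digit.
pub fn sanitize_prefix(raw: &str) -> String {
    let mut out: String = raw
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect();
    if out.chars().next().map_or(true, |c| c.is_ascii_digit()) {
        out.insert(0, '_');
    }
    out
}

/// Builds the namespaced form of a generated symbol, e.g. the prefix
/// "libpng" and the base name "lucet_module_data" give
/// "libpng_lucet_module_data".
pub fn namespaced_symbol(prefix: &str, base: &str) -> String {
    format!("{}_{}", sanitize_prefix(prefix), base)
}
```

Deriving the prefix from the .wasm file name needs the sanitizing step because file names may contain characters such as `-` or `.` that are not valid in symbol names on every toolchain.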

awortman-fastly avatar Sep 05 '19 21:09 awortman-fastly

I don't believe this is currently a requirement, except that lucet-wasi defaults to running _start. You can use lucet-wasi --entrypoint foo to pick a different entrypoint.

Oh good to know. It feels a bit weird to pick an entry point for a library, but I guess I can just pick one of the existing library functions at random without repercussions?

I think something like a user-supplied prefix (or use the name of the input .wasm) to namespace the symbols we add would work well. That should be pretty easy to teach lucetc.

yay! :smiley:

I think the harder part would be indicating a non-default symbol for the module's metadata to anything runtime-side - which table to look into to resolve exports, things like that.

Umm, sorry, I think I didn't follow this completely... I followed that the runtime has to be aware of the modified symbols, i.e. user_prefix_lucet_module_data, user_prefix_guest_table_0, user_prefix_guest_table_0_len (I've been playing with the version released on crates.io, so these names may be dated).

This means there probably needs to be an extra API/some API modifications to allow the embedder to specify this prefix during module load.

Was this what you meant or are there additional concerns/considerations apart from this?
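As a rough sketch of what "specify this prefix during module load" might mean, the code below resolves a module's metadata symbol by prefix. Everything here is hypothetical: `SymbolTable` is a stand-in for whatever the real loader consults (a dlsym-style lookup or statically linked symbols), and `module_data_address` is not an existing lucet-runtime API.

```rust
use std::collections::HashMap;

/// Stand-in for the set of symbols visible to the loader. A real
/// embedder would query the dynamic or static linker instead.
pub struct SymbolTable {
    symbols: HashMap<String, usize>,
}

impl SymbolTable {
    pub fn new() -> Self {
        SymbolTable { symbols: HashMap::new() }
    }

    /// Registers a symbol name with a (fake) address.
    pub fn define(&mut self, name: &str, address: usize) {
        self.symbols.insert(name.to_string(), address);
    }

    /// Hypothetical load-time lookup: with a prefix it resolves
    /// `<prefix>_lucet_module_data`; without one it falls back to the
    /// default, unprefixed `lucet_module_data`.
    pub fn module_data_address(&self, prefix: Option<&str>) -> Option<usize> {
        let name = match prefix {
            Some(p) => format!("{}_lucet_module_data", p),
            None => "lucet_module_data".to_string(),
        };
        self.symbols.get(&name).copied()
    }
}
```

Keeping the prefix optional would let existing single-module embedders continue to load the default symbol unchanged, while an embedder linking several modules passes a distinct prefix for each.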

shravanrn avatar Sep 06 '19 17:09 shravanrn

Oh good to know. It feels a bit weird to pick an entry point for a library, but I guess I can just pick one of the existing library functions at random without repercussions?

Well, the lucet-wasi binary (as opposed to library) is meant for running executables. It just happens to also work for any library functions that take no arguments, if you wanted to specify a different entrypoint.

acfoltzer avatar Sep 06 '19 23:09 acfoltzer

@awortman-fastly @acfoltzer - Just following up on this. Is the ability to link multiple lucet modules something you could reasonably consider? If so, is this something that would be handled internally or is this something I should try to contribute a patch for? In the interest of next steps, should I try to implement the user supplied prefix for symbols?

@awortman-fastly - Also just a gentle reminder about the clarification question above? :smiley:

shravanrn avatar Sep 15 '19 20:09 shravanrn

Oops, yes. Regarding user-supplied prefixes, you understood me entirely. The main symbols you should have to change are where we use LUCET_MODULE_SYM in lucetc and in lucet-runtime. Externally visible function names would need the user-supplied prefix prepended too; local functions can stay as they are, since local symbols do not collide at link time.

I'm not too sure how symbols other than those will pan out, because I think our other symbols at this point are all local and shouldn't need to be changed. So if that all works, great! If not, hopefully that still gets you pretty close. The symbol tweaking will conflict a bit with #295, but shouldn't be hard to resolve - just a question of who gets which changes in first.

I can't think of a reason we would want to not support this, but I also don't think I'll be able to get around to this soon, so I'd be happy to review a patch :D

awortman-fastly avatar Sep 16 '19 16:09 awortman-fastly