Support extracting metadata from WASM files

Open paxbun opened this issue 3 months ago • 8 comments

We're trying to add UniFFI WASM support without wasm-bindgen in the next version of Gobley. This PR adds support for extracting metadata from WASM files using the walrus crate of wasm-bindgen, so the next version of UniFFI can generate bindings without building other kinds of cdylib files.

Tested with the following procedure:

  1. Add "wasm-unstable-single-threaded" to the UniFFI dependency of fixtures/coverall/Cargo.toml.
diff --git a/fixtures/coverall/Cargo.toml b/fixtures/coverall/Cargo.toml
index 22fd4beea..cf1719107 100644
--- a/fixtures/coverall/Cargo.toml
+++ b/fixtures/coverall/Cargo.toml
@@ -11,7 +11,10 @@ name = "uniffi_coverall"
 
 [dependencies]
 # Add the "scaffolding-ffi-buffer-fns" feature to make sure things can build correctly
-uniffi = { workspace = true, features=["scaffolding-ffi-buffer-fns"]}
+uniffi = { workspace = true, features = [
+    "scaffolding-ffi-buffer-fns",
+    "wasm-unstable-single-threaded",
+] }
 once_cell = "1.12"
 thiserror = "1.0"

  2. Run the following commands:
# Build a cdylib for the host target and generate bindings in ./out
# Replace aarch64-apple-darwin with the host target
cargo build --target aarch64-apple-darwin -p uniffi-fixture-coverall 
cargo run -p uniffi --features=cli -- generate -l kotlin --out-dir out --library --no-format ./target/aarch64-apple-darwin/debug/libuniffi_coverall.dylib

# Build a WASM module and generate bindings in ./out-wasm
cargo build --target wasm32-unknown-unknown -p uniffi-fixture-coverall 
cargo run -p uniffi --features=cli -- generate -l kotlin --out-dir out-wasm --library --no-format ./target/wasm32-unknown-unknown/debug/uniffi_coverall.wasm
  3. Compare the contents.
diff -r out out-wasm
# exit code: 0

paxbun avatar Sep 06 '25 07:09 paxbun

We have a WASM example here that demonstrates WASM transformations required to implement the FFI logic of UniFFI: https://github.com/gobley/gobley/blob/536fd52d1209c1ab09da54187db2484297e071b4/tests/gradle/js-only/src/jsMain/kotlin/RustLibrary.kt

paxbun avatar Sep 06 '25 07:09 paxbun

I support the idea, but this new dependency is likely to be a problem for us - even without the new crate duplicates this introduces. I wonder if we can work out how to sanely push this out to the consumer (ie, so crates can reasonably have their own uniffi_bindgen with this capability)?

mhammond avatar Sep 07 '25 22:09 mhammond

This is somewhat similar to https://github.com/jrmuizel/uniffi-rust-parse (unpolished, helped by Cursor via an unrelated project), which is an experiment using Rust Analyzer to create metadata and avoid using a .so. It's a similarly shaped problem. Maybe all metadata collection should be shifted? Not sure how or really what I mean though :)

mhammond avatar Sep 08 '25 00:09 mhammond

> I support the idea, but this new dependency is likely to be a problem for us - even without the new crate duplicates this introduces. I wonder if we can work out how to sanely push this out to the consumer (ie, so crates can reasonably have their own uniffi_bindgen with this capability)?

My general feeling is that external bindings might want to move away from functions like generate_bindings() that take a BindingGenerator impl and drive the entire process with it. Instead, they can roll their own top-level function that does the driving. See uniffi-bindgen-gecko-js for an example. That one uses the new pipeline system, but I think it should also work with a bindings generator based on ComponentInterface. Maybe we'd need to define and/or expose some utility functions to help out, which I'd be happy to help with.

One advantage of that is that it's easy to alter the process. You could replace the call to macro_metadata::extract_from_library() with your own call that extracts the data from WASM modules. I also think that it's simpler to think about.
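A minimal sketch of what such a generator-owned driver could look like, assuming the two extractors are passed in as closures. `Metadata`, `is_wasm_module`, and `collect_metadata` are hypothetical names used only for illustration and are not part of uniffi_bindgen; the `String` error type stands in for whatever error type the real code uses. The only fixed fact relied on here is that a WebAssembly binary module starts with the four bytes `\0asm`.

```rust
/// Hypothetical stand-in for UniFFI's metadata items; not the real type.
#[derive(Debug, Clone, PartialEq)]
pub struct Metadata {
    pub name: String,
}

/// The four-byte magic number that opens every WebAssembly binary module: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6d];

/// Returns true when `bytes` begins with the WebAssembly magic number.
pub fn is_wasm_module(bytes: &[u8]) -> bool {
    bytes.starts_with(&WASM_MAGIC)
}

/// Driver owned by the external bindings generator (hypothetical).
/// It decides which extractor to call, instead of handing control to a
/// library function that calls back into the generator.
pub fn collect_metadata<W, N>(
    bytes: &[u8],
    extract_wasm: W,
    extract_native: N,
) -> Result<Vec<Metadata>, String>
where
    W: Fn(&[u8]) -> Result<Vec<Metadata>, String>,
    N: Fn(&[u8]) -> Result<Vec<Metadata>, String>,
{
    if is_wasm_module(bytes) {
        // The generator's own WASM path, for example one built on walrus.
        extract_wasm(bytes)
    } else {
        // The existing path for native libraries.
        extract_native(bytes)
    }
}
```

In this shape the generator would pass its WASM extractor as `extract_wasm` and a thin wrapper around the existing library extraction as `extract_native`, so swapping either one out is a local change.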

Would something like that work for Gobley?

bendk avatar Sep 08 '25 14:09 bendk

@bendk Thanks for the comment! For now, we only need to alter the metadata parsing logic, so a helper function similar to uniffi_bindgen::library_mode::generate_bindings with an option to customize that would be helpful. Since we have our own cdylib name setting logic and don't check the file name of the library file as in uniffi_bindgen::gen_library_mode, customizing calc_cdylib_name might not be needed.

Speaking of customizing the entire process that extracts metadata and builds CIs, I'm investigating name obfuscation features (for Android/WASM, using llvm-objcopy/walrus), so I expect we'll eventually end up maintaining our own CI-building logic once we're ready to implement those features.

paxbun avatar Sep 08 '25 17:09 paxbun

> @bendk Thanks for the comment! For now, we only need to alter the metadata parsing logic, so a helper function similar to uniffi_bindgen::library_mode::generate_bindings with an option to customize that would be helpful.

I think that would be possible. We could add something like specialized_extract_metadata(file_data: &[u8]) -> anyhow::Result<Option<Vec<Metadata>>>. If that returns Ok(None), it means the bindings weren't able to do specialized metadata extraction and we should fall back to using extract_from_bytes() as usual. However, my general feeling is that this isn't so simple to understand and that the whole inversion-of-control pattern was a mistake there.
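The fallback behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: `Metadata` is a stand-in type, `ExtractResult` replaces `anyhow::Result` with a plain `String` error so the block has no dependencies, and `extract_with_fallback` is a hypothetical name for the library-side code that would call the hook.

```rust
/// Hypothetical stand-in for UniFFI's metadata items; not the real type.
#[derive(Debug, Clone, PartialEq)]
pub struct Metadata {
    pub name: String,
}

/// Stand-in for `anyhow::Result`, kept dependency-free for this sketch.
pub type ExtractResult<T> = Result<T, String>;

/// Calls the bindings-provided hook first and falls back to the default
/// extractor when the hook reports `Ok(None)`.
///
/// * `Ok(Some(items))` from the hook: use the specialized result.
/// * `Ok(None)` from the hook: the hook does not handle this file, so fall back.
/// * `Err(e)` from the hook: propagate the error without falling back.
pub fn extract_with_fallback<S, D>(
    file_data: &[u8],
    specialized: S,
    default_extract: D,
) -> ExtractResult<Vec<Metadata>>
where
    S: Fn(&[u8]) -> ExtractResult<Option<Vec<Metadata>>>,
    D: Fn(&[u8]) -> ExtractResult<Vec<Metadata>>,
{
    match specialized(file_data)? {
        Some(items) => Ok(items),
        None => default_extract(file_data),
    }
}
```

Note that the library still owns the control flow here and the generator only fills in a callback, which is the inversion-of-control shape this comment goes on to question.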

If you had to write and maintain your own generate_bindings() function, how hard would it be? We'd probably add some new utility functions so you wouldn't need to maintain blocks of code like this. What's left would be a function that you own that makes calls into our functions, rather than the other way around. I feel like it would be simpler that way, but I'd love to hear your perspective on that.

bendk avatar Sep 08 '25 19:09 bendk

It would be trivial if the code that needs rewriting and maintaining is confined to uniffi_bindgen/src/library_mode.rs. Would I need to maintain our own uniffi_bindgen/src/lib.rs (for uniffi_bindgen::generate_external_bindings for "non-library" mode)? Even in that case, I expect it will take 3-4 hours at most to make external bindgens reflect the changes in this repo. I've just had a quick check of the visibility of the entities used in these two files, and it seems we can start maintaining our own even now.

paxbun avatar Sep 09 '25 07:09 paxbun

> It would be trivial if the code that needs rewriting and maintaining is confined to uniffi_bindgen/src/library_mode.rs. Would I need to maintain our own uniffi_bindgen/src/lib.rs (for uniffi_bindgen::generate_external_bindings for "non-library" mode)? Even in that case, I expect it will take 3-4 hours at most to make external bindgens reflect the changes in this repo. I've just had a quick check of the visibility of the entities used in these two files, and it seems we can start maintaining our own even now.

I don't think it should be hard and I also remembered that we have a uniffi-bindgen-swift that uses this strategy. I think we could update it to handle non-library mode by changing this to an if with a second branch that uses parse_udl. I'm going to try to update that code to support non-library mode and see how hard it is.
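As an illustration of the two-branch idea, here is one hypothetical way a generator-owned driver could decide which branch to take. Choosing by file extension is an assumption made for this sketch, and `InputKind` and `classify_input` are made-up names; the thread does not say how the real code decides between library mode and UDL parsing.

```rust
use std::path::Path;

/// Which loading branch a driver would take (hypothetical).
#[derive(Debug, PartialEq)]
pub enum InputKind {
    /// A UDL file: the branch that would call into UDL parsing.
    Udl,
    /// Anything else is treated as a compiled library: the library-mode branch.
    Library,
}

/// Classifies an input path by its extension. Only a lowercase `.udl`
/// extension selects the UDL branch; every other path, including one with
/// no extension, is treated as a library.
pub fn classify_input(path: &Path) -> InputKind {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some("udl") => InputKind::Udl,
        _ => InputKind::Library,
    }
}
```

A driver would then match on the returned `InputKind` and run either its UDL-parsing code or its library-mode code.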

bendk avatar Sep 09 '25 13:09 bendk