Any chance watt can be upstreamed to cargo?

Open NobodyXu opened this issue 3 years ago • 8 comments

NobodyXu avatar Oct 19 '22 05:10 NobodyXu

P.S. I think compiling proc-macro crates to wasm also needs interoperable_api for the following reasons:

I think this would be useful for compiling proc-macro crates to wasm.

Proc-macro crates usually depend on proc-macro2, syn and other crates.

Currently, with watt, a proc-macro crate is statically linked with its dependencies, compiled down to wasm, and uploaded to crates.io.

This could be a serious problem, since:

  • users of the proc-macro crate cannot upgrade the transitive dependencies even if there is a new patch release
  • users cannot use [patch] to replace the dependencies
  • since the whole statically linked wasm binary is uploaded to crates.io, it will take significantly more space than proc-macro crates without wasm on crates.io and on the users' computers

As such, I think this interop ABI is needed for compiling proc-macro crates to wasm to work.

NobodyXu avatar Jan 21 '23 12:01 NobodyXu

I don't think that will help. If watt changes its internal API, this will break the ABI with or without interoperable_api. If watt doesn't change its internal API, extern "C" + #[repr(C)] is enough to keep the ABI stable.
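As a minimal sketch of the extern "C" + #[repr(C)] point, assuming hypothetical names (`ByteSpan`, `span_checksum`, `checksum`) that are not part of watt's actual interface: the struct layout and calling convention below are fixed by the C ABI rather than by the Rust compiler version.

```rust
// Hypothetical illustration only; none of these names come from watt.

/// A byte span with a C-compatible layout: pointer first, then length.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct ByteSpan {
    pub ptr: *const u8,
    pub len: usize,
}

/// Uses the C calling convention, so callers built by another compiler
/// version agree on how `ByteSpan` is passed, as long as this declaration
/// itself does not change.
pub extern "C" fn span_checksum(span: ByteSpan) -> u32 {
    // SAFETY: the caller must pass a pointer that is valid for `len` bytes.
    let bytes = unsafe { std::slice::from_raw_parts(span.ptr, span.len) };
    bytes.iter().fold(0u32, |acc, b| acc.wrapping_add(*b as u32))
}

/// Safe wrapper that builds the span from a slice.
pub fn checksum(data: &[u8]) -> u32 {
    span_checksum(ByteSpan { ptr: data.as_ptr(), len: data.len() })
}
```

The stability only holds while the declarations stay the same: adding or reordering a field in `ByteSpan` would still be an ABI break.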

bjorn3 avatar Jan 21 '23 14:01 bjorn3

There have been discussions about supporting running proc macros in wasm natively. Offering precompiled wasm binaries is technically a separate problem from supporting a wasm proc_macro bridge.

For the wasm driver, there's already a "correct" solution. We can use the (unstable implementation detail) RPC currently used directly, or we can use externref for handles to server resources for a "proper" wasm interface, at the cost of some overhead. (Rust code on wasm can't manipulate externref directly, since it can't be put into linear memory; it has to be lifted into a table index by wasm_bindgen style glue).
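To illustrate the "lifted into a table index" idea, here is a rough sketch of a handle table, assuming a hypothetical `HandleTable` type: the server-side resource stays in the table and only a plain `u32` index crosses into guest code. This is not how rustc's bridge or wasm_bindgen is actually implemented.

```rust
/// Hypothetical table of server-side resources. Guest code never sees `T`,
/// only the `u32` index returned by `insert`.
pub struct HandleTable<T> {
    slots: Vec<Option<T>>,
    free: Vec<u32>,
}

impl<T> HandleTable<T> {
    pub fn new() -> Self {
        Self { slots: Vec::new(), free: Vec::new() }
    }

    /// Stores a resource and returns its handle, reusing a freed slot if any.
    pub fn insert(&mut self, value: T) -> u32 {
        if let Some(idx) = self.free.pop() {
            self.slots[idx as usize] = Some(value);
            idx
        } else {
            self.slots.push(Some(value));
            (self.slots.len() - 1) as u32
        }
    }

    /// Looks up a resource by handle; stale or unknown handles yield `None`.
    pub fn get(&self, handle: u32) -> Option<&T> {
        self.slots.get(handle as usize).and_then(|slot| slot.as_ref())
    }

    /// Drops the resource behind a handle and marks the slot as reusable.
    pub fn remove(&mut self, handle: u32) -> Option<T> {
        let slot = self.slots.get_mut(handle as usize)?;
        let value = slot.take()?;
        self.free.push(handle);
        Some(value)
    }
}
```

The extra lookup on every access is one source of the overhead mentioned above.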

If/when the wasm component model and canonical ABI (née interface types and module linking) get to at least stage 3 (prestandard implementation period), then both problems get a bit easier.

CAD97 avatar Jan 22 '23 00:01 CAD97

Thanks for the explanation!

Offering precompiled wasm binaries is technically a separate problem from supporting a wasm proc_macro bridge. We can use the (unstable implementation detail) RPC currently used directly,

Is that https://github.com/dtolnay/watt/issues/42 ?

I don't think that will help. If watt changes its internal API, this will break the ABI with or without interoperable_api. If watt doesn't change its internal API, extern "C" + #[repr(C)] is enough to keep the ABI stable.

But what about the dependencies, such as proc-macro2, syn, proc-macro-error or synstructure?

Statically linking them would make the resulting wasm quite large, make it harder to upgrade dependencies, and make it impossible to use [patch] to override a version.

That's why I think interoperable_api might be able to help here to make it easier to provide a stable ABI for dynamic linking.

P.S. I'm not very familiar with wasm, but it seems that doing dynamic linking in wasm now is also quite hard.

There have been discussions about supporting running proc macros in wasm natively.

BTW, is there any RFC for this?

NobodyXu avatar Jan 22 '23 02:01 NobodyXu

There are two asks you're making:

  • Native support for running proc macros in wasm. Higher level FFI has absolutely no impact on this; the RPC between the proc macro server and implementation (both rustc's and watt's) is already by necessity defined with today's tools.
  • Support for using partially precompiled proc macros, providing the capability to upgrade transitive proc macro dependencies without recompiling downstream crates. Higher level FFI doesn't help here, either.

The former makes the latter significantly more tractable, since you don't have to worry about multiple host targets, but conflating the two is of no help to anyone and only serves to make things more confusing.

The latter ask is fundamentally impossible. The upstream crate must have been compiled against the specific downstream crate version, unless the library crate was specifically designed for version agnostic FFI. Any API compatible but ABI incompatible change will break your dynamic link.

interoperable_abi doesn't help. Even if you just force the Rust repr and extern ABI to be the interoperable one on the wasm32-rust-procmacro target, disable all inlining, and polymorphize all generics, even after the massive performance penalties[^1] you still have to deal with const fn, whose behavior it's considered a nonbreaking change to alter, despite them feeding into the type system and impacting name resolution and WF.

[^1]: The performance penalty of watt running the proc macro in wasm is offset by the wasm bundle being compiled in release mode. Outlining and polymorphizing everything will at best make release mode perform no better than debug mode. Optimizations are essentially born from inlining and can't happen without it; keep in mind you need to keep std dynamic as well under this model, because everyone needs to be using the same std.
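A small sketch of the const fn point, assuming a hypothetical upstream function `header_len` and downstream type `Header`: the function's return value becomes part of a type, so changing its body changes the layout of everything compiled against it.

```rust
/// Hypothetical upstream function. Changing the body to return, say, 8 is
/// conventionally treated as a nonbreaking change.
pub const fn header_len() -> usize {
    4
}

/// Hypothetical downstream type. Its field type is `[u8; 4]` only because
/// `header_len()` evaluated to 4 when this crate was compiled.
pub struct Header {
    pub bytes: [u8; header_len()],
}

/// Size baked in at compile time from the upstream const fn.
pub fn header_size() -> usize {
    std::mem::size_of::<Header>()
}
```

If the upstream crate were swapped at link time for a version where `header_len()` returns a different value, already-compiled users of `Header` would disagree about its size, regardless of which ABI is used for function calls.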

This also completely ignores the problem of feature flags. They're conventionally additive, but this is not enforced and in practice they're often not. A big example are proc_macro feature flags, which are perhaps irrelevant for this specific use case, but an easy and relevant example.

There're only two ways to support upgrading a transitive dependency:

  • Recompile everything downstream with the new dependency, or
  • The dependency has been designed to have a stable ABI.

Version agnostic dynamic linking is a completely separate problem space from wasm proc macros.

BTW, is there any RFC for this?

All of the talk so far around running proc macros in wasm has been informal. There's essentially nothing really to RFC, either; it's almost entirely an implementation problem, just like the existing implementation. (The only parts to realistically RFC are restrictions on what proc macros can do to work on wasm, or tangential changes taking advantage of wasm proc macros such as precompiled proc macros.)

CAD97 avatar Jan 22 '23 08:01 CAD97

@CAD97 I see that supporting upgrading a transitive dependency is quite hard and actually does not need interoperable_api.

I guess it might be ok size-wise with -Oz, lto, etc, but losing the ability to control the transitive dependencies of proc-macro crates breaks expectations, and that needs to be explained before rolling out.

I'm ok with statically linking everything, but just a bit concerned that proc-macro crates have to re-compile and release a new version every time one of their dependencies changes.

Well, maybe not every time, but only on new versions that fix bugs.

Though that still sounds like asking the author of the proc-macro crate to do a lot more than an ordinary lib crate.

All of the talk so far around running proc macros in wasm has been informal. There's essentially nothing really to RFC, either; it's almost entirely an implementation problem, just like the existing implementation. (The only parts to realistically RFC are restrictions on what proc macros can do to work on wasm, or tangential changes taking advantage of wasm proc macros such as precompiled proc macros.)

Is there anywhere to track the progress?

I think it will still need rustc support to make this easier, so that proc-macro authors don't have to maintain a separate workspace and compile it down to wasm, as well as support for checking the source code on docs.rs.

It also needs cargo to ship a wasm interpreter and probably a new crate type so that the wasm can be loaded without the shim crate, which also raises the question of whether to use interpreter or JIT mode.

IMO using interpreter mode or a single pass compilation into native dynlib is good enough for proc-macro unless it is used really frequently or somehow it is compiled with -O0 or in debugging mode.

The performance penalty of watt running the proc macro in wasm is offset by the wasm bundle being compiled in release mode. Outlining and polymorphizing everything will at best make release mode perform no better than debug mode. Optimizations are essentially born from inlining and can't happen without it;

I agree and I don't think performance is going to be a problem here. Even with -Oz, it probably still runs fast enough to not be the bottleneck of the build process.

And it is even less of a problem when both wasmtime and wasmer support single-pass or cranelift JIT to speed up execution; the only question is whether it should be used.

keep in mind you need to keep std dynamic as well under this model, because everyone needs to be using the same std.

Ehh, I forgot this. Considering how many traits/functions are in libstd, dynamic linking would really slow things down a lot.

NobodyXu avatar Jan 22 '23 10:01 NobodyXu

To be explicit: running proc macros on wasm has been discussed, but that's the extent of what's been practically discussed. Nothing about publishing wasm blobs to crates.io; just compiling and running proc macros like normal, except for the wasm target instead of the host triple. I.e. solely for the sandboxing/determinism benefits; the compilation model would be identical to today's. (Except that sccache-like solutions would work better.)

Rustc would embed a simple wasm interpreter, or use an existing one via the standard C interface.

There's no one place to track progress because there's no real initiative for it yet.

CAD97 avatar Jan 22 '23 10:01 CAD97

@CAD97 Thanks for the information, that clears it up. Support for compiling proc-macro crates to wasm would be great, even if they are still compiled locally and cannot enjoy the compilation-time speedup provided by watt.

NobodyXu avatar Jan 22 '23 10:01 NobodyXu