wasm-tools

Making wast parse results serializable

Open bakkot opened this issue 1 year ago • 5 comments

I'd love to use the parser and encoder from wast in JavaScript. The easiest way for that to work would be for the parsed module to be serializable and deserializable as JSON, so that the string representations can be passed to JS via wasm-bindgen. Could the various AST types be made serializable?

I know adding serde adds some heft, but I think it could be done conditionally with an off-by-default feature.

I tried doing this myself by sprinkling #[derive(Serialize, Deserialize)] everywhere, but got stuck making the lifetimes work inside the instructions macro. I don't have enough experience with Rust to figure out how to get that working.

bakkot avatar Feb 28 '24 08:02 bakkot
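
For readers following along, the off-by-default feature suggested above could look roughly like the sketch below. It assumes a Cargo feature named `serde` that pulls in an optional `serde` dependency (for example `serde = { version = "1", features = ["derive"], optional = true }` plus `serde = ["dep:serde"]` under `[features]`). The `Export` type is a hypothetical stand-in, not an actual type from the wast crate.

```rust
// Hypothetical stand-in for an AST node; NOT a real `wast` type.
// When the assumed `serde` feature is off, the `cfg_attr` line expands to
// nothing, so the crate builds without serde as a dependency.
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[derive(Debug, Clone, PartialEq)]
pub struct Export {
    pub name: String,
    pub index: u32,
}

/// Reports whether this build was compiled with the optional serde support.
pub fn serde_enabled() -> bool {
    cfg!(feature = "serde")
}

fn main() {
    let e = Export { name: "memory".to_string(), index: 0 };
    println!("{:?} (serde enabled: {})", e, serde_enabled());
}
```

Users who never enable the feature pay no compile-time or binary-size cost; the lifetime problem inside the instructions macro mentioned above is a separate issue that this sketch does not address.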

Thanks for the report! I think this is related to https://github.com/bytecodealliance/wasm-tools/issues/1395 in that the end result may be similar. I think this would be good to add, but if we could match the preexisting structure that wabt emits I think that'd be neat to avoid having multiple versions floating around.

alexcrichton avatar Feb 28 '24 15:02 alexcrichton

Oh and I should mention that a PR would be most welcome!

Do you need library support for this feature? If not I think it'd be best to just do it in the CLI by transforming the AST to a custom structure with #[derive]. I would hope that you won't need to apply #[derive] to all types in the wast crate but only the top-level ones dealing with *.wast directives, and that'll be easier when it's modeled as a different structure rather than directly #[derive]'d on all types.

alexcrichton avatar Feb 28 '24 15:02 alexcrichton
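
The "custom structure" approach described above can be sketched as follows: keep the borrowed AST untouched and convert it into a separate owned type that carries the derives. All names here (`AstDirective`, `DirectiveRecord`, and their fields) are hypothetical and only illustrate the shape; they are not types from the wast crate, and the serde derives are only mentioned in a comment so the snippet compiles with the standard library alone.

```rust
// Hypothetical stand-in for a borrowed AST node, similar in spirit to what a
// parser hands back: it borrows string slices from the source text.
#[derive(Debug)]
pub struct AstDirective<'a> {
    pub kind: &'a str,
    pub line: usize,
    pub text: &'a str,
}

// Owned mirror type (also hypothetical). In a real crate this is where
// `#[derive(Serialize, Deserialize)]` would go; because it owns its strings,
// no lifetime parameters are involved.
#[derive(Debug, Clone, PartialEq)]
pub struct DirectiveRecord {
    pub kind: String,
    pub line: usize,
    pub text: String,
}

impl From<&AstDirective<'_>> for DirectiveRecord {
    fn from(d: &AstDirective<'_>) -> Self {
        DirectiveRecord {
            kind: d.kind.to_string(),
            line: d.line,
            text: d.text.to_string(),
        }
    }
}

fn main() {
    let source = String::from("(assert_return (invoke \"f\"))");
    let ast = AstDirective { kind: "assert_return", line: 1, text: &source };
    let record = DirectiveRecord::from(&ast);
    println!("{:?}", record);
}
```

The trade-off is an extra copy and a second set of types to maintain, in exchange for not touching every type in the library.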

Do you need library support for this feature?

That was my hope - my goal is to use this in a JS package, by making a crate which exposes two functions:

  • one which takes a .wat string and gives back a serialized representation of the parse result from the wast crate, and
  • one which takes such a serialized representation and gives me the compiled wasm binary

and then compiling that crate to wasm and calling it from JS. That would allow me to easily write transforms of wasm programs in a JS library. Doing it in the CLI wouldn't get there as easily. (I guess I could probably compile the CLI to WASI and then use it as a library by using an appropriate virtual filesystem in WASI, but that's a lot more painful.)

I would hope that you won't need to apply #[derive] to all types in the wast crate but only the top-level ones

Not everything, certainly, but you do need to represent instructions, and there's a lot of those.

bakkot avatar Feb 28 '24 16:02 bakkot

Re: using wast2json (assuming that's the preexisting structure that wabt emits you're referring to), is there a way to go back from the JSON to .wat? If not, it would only accomplish one of the directions, though I guess I could write the other one.

bakkot avatar Feb 28 '24 16:02 bakkot

Ah ok, I'm erroneously conflating these two use cases then. Your use case is quite different, in that you want a JSON representation of the entire AST of a module. The wast2json command only returns a JSON representation of the testing directives in a *.wast file; the actual wasm files themselves are all represented as binary wasm.

I'd be a little hesitant taking an entire module through a JSON AST, but if it works for your use case it's something we can support. I suspect though that it's going to be a lot of #[derive] and, yes, we'll want them to be conditionally turned on via a feature.

For your use case I don't think the CLI would work well, yeah; you would want #[derive] on all types for sure.

alexcrichton avatar Feb 28 '24 17:02 alexcrichton