wasm-bindgen icon indicating copy to clipboard operation
wasm-bindgen copied to clipboard

Support enum variants with values

Open joeriexelmans opened this issue 3 years ago • 27 comments

Hi folks! This feature was discussed in the past in other issues, but it's been quiet for a while and there hasn't been an issue dedicated specifically to this feature, hence this:

Motivation

Right now, only C-style enums are supported. A C-style enum like this

#[wasm_bindgen]
enum CStyleEnum { A, B = 42, C }

is now converted to a mapping from tag to integer values:

const CStyleEnum = Object.freeze({ A: 0, B: 42, C: 43 });

In your code you can just use CStyleEnum.A, like you would in Rust.

Rust's support for Algebraic Data Types (ADTs) is an awesome feature that I love and use intensely. It would be nice if JavaScript bindings could be generated for an enum like this:

#[wasm_bindgen]
enum MyEnum {
    A,
    B(u32),
    C(u32, u32),
    D { x: u32, y: u32 },
}

Proposed Solutions

JavaScript does not have builtin support for "tagged unions" like Rust does, so a decision has to be made on how we "model" this. There was discussion on this in the earlier C-style enum issue, with 2 proposed solutions. I'll summarize them here.

Solution 1: Tag + array of values

This is the most straightforward solution. It reflects Rust's memory layout for enums.

Typescript definition:

type MyEnum =
  | { tag: 'A' }
  | { tag: 'B', value: number }
  | { tag: 'C', value: [number, number] }
  | { tag: 'D', value: { x: number, y: number }
  ;

I like how currently, for C-style enums, a mapping from enum identifier to an integer is created. We could further generalize this by creating a mapping from enum identifier to enum variant constructor:

const MyEnum_Tags = Object.freeze({ A: 0, B: 1, C: 2, D: 3 });
const MyEnum = Object.freeze({
  A: Object.freeze({ tag: MyEnum_Tags.A }), // just a value, like currently for C-style enums
  B: function() { this.tag = MyEnum_Tags.B; this.val = arguments }, // a constructor
  C: function() { this.tag = MyEnum_Tags.C; this.val = arguments },
  D: function() { this.tag = MyEnum_Tags.D; this.val = arguments },
});

and use it as follows:

 // C-style variant. Reflects usage in Rust, and current usage in JS
const a = MyEnum.A;

// Variants with values. Reflects how in Rust a variant with values is a function:
const b = new MyEnum.B(42); 
const c = new MyEnum.C(42, 43);
const d = new MyEnum.D({x: 42, y: 43});

JavaScript allows any function to be called with any list of parameters, so there's no way to prevent e.g. forgetting the number parameter for variant B, doing something like new MyEnum.B(), but that's just the way it is, in a dynamically typed language.

To "match" an enum variant, one can use JS switch-statements:

switch (variant.tag) {
case MyEnum_Tags.A:
  ...
case MyEnum_Tags.B:
  ...
}

if-else:

if (variant.tag === MyEnum_Tags.C) {
   const [x, y] = variant.val; // this is not too pretty, but values must be put under a field ("val"), in order to prevent naming collisions with the "tag" field.
   ...
}

or use a mapping:

const something = {
  [MyEnum_Tags.A]: ... ,
  [MyEnum_Tags.B]: ... ,
}[variant.tag];

Alternatives

Solution 2: Keep the tag as a key in the object

The following was also proposed:

type MyEnum =
  | { A: any }
  | { B: { '0': number } }
  | { C: { '0': number, '1': number } }
  | { D: { x: number, y: number } }
  ;

I'm personally not in favor, because one is forced to add a dummy value for variant 'A', which is inefficient:

const variant = { A: {} };

And this dummy value has to be truthy, in order for

if (variant.A) { ... }

to match.

Also, I think checking if an object contains a key is less efficient in (non-optimized?) JavaScript, compared to just checking the value of an integer field ("tag").

To "match" a variant, to my knowledge, one can only use if-statements.

if (variant.C) {
  const [x, y] = variant.C;
  ...
}

Additional info

I'm willing to put some work into this. Currently looking at the source of the wasm_bindgen macro to see if I can implement it myself.

joeriexelmans avatar Dec 29 '20 09:12 joeriexelmans

It seems reasonable to me to try to support this, thanks for writing this up! This is likely a pretty significant feature to implement, however, and I don't have a ton of time to help with design details myself. I'd be happy to help review though!

alexcrichton avatar Jan 04 '21 15:01 alexcrichton

Nice to hear. I will also be a bit busy the coming weeks, but I'll get at it, eventually. Need this feature for a problem I'm working on.

joeriexelmans avatar Jan 04 '21 16:01 joeriexelmans

Awesome to see someone else had the idea before me! I do like Approach 1, though I'm not sure if one would get the proper hints by the IDE when calling the constructors. Still, perfectly reasonable and would be a huge improvement!

iFreilicht avatar Jan 28 '21 13:01 iFreilicht

That's a great feature that will surely help a lot in one of my projects.

You're proposals are awesome, @joeriexelmans, but what do you think about following what the genType ReScript library does with their variants? This seems to work really well, and could help in building a ReScript <-> wasm-bindgen interop.

I'd be more than happy to work on this with you, and am also currently reading the wasm_bindgen macro to get a grisp of how it works.

celsobonutti avatar Feb 22 '21 14:02 celsobonutti

Hi @celsobonutti

From quickly reading the ReScript documentation, it seems to me that a downside is that matching a specific variant in JS is somewhat more complex, as a variant value may be either just a "string", or an object { tag, value }, for the same enum. Did I understand this correctly? Is there an upside I'm missing?

I'm still busy, until mid-March, so feel free to "take over" from me. I'll let you guys know soon as I've got time to work on this again.

joeriexelmans avatar Feb 22 '21 14:02 joeriexelmans

Hey @joeriexelmans

Yes, the bindings are indeed more complex. I got the wrong understanding about how C-like enums were compiled in wasm_bindgen and thought they were - like ReScript's - strings. Now that I saw how they're currently compiled, this change does not make sense for me anymore.

Sure, I'm pretty busy lately too, but I'll keep trying to understand the macro when I have free time. Thank you a lot for your quick response! 😃

celsobonutti avatar Feb 22 '21 20:02 celsobonutti

Hey @alexcrichton, how are you doing? I finally got some time to work on this (#2631), but looks like I had a misconception about WasmSlices (I thought I could have a slice with various ABI types) and now it looks to me that this feature would be way easier to implement using wasm multi-value returns. Is there any work on using multi-value in wasm-bindgen? I've searched for it but couldn't find anything 😞

celsobonutti avatar Jul 24 '21 15:07 celsobonutti

Hi everyone, what would be needed to unlock this feature?

jkomyno avatar Jun 20 '22 15:06 jkomyno

@jkomyno basically, the PR #2631 has to be rebased and made ready to merge.

iFreilicht avatar Jun 27 '22 15:06 iFreilicht

Is #2631 still relevant? I'm interested in this and would like to play around with it but starting from scratch might be too much. Getting a couple of pointers would also work.

voxelstack avatar Sep 01 '22 02:09 voxelstack

A common error handling pattern is to wrap errors from other libraries in an enum variant. Would be great to be able to return a JsValue error inside of our library's error enum, rather than try to map every single one to a unique variant.

If there's a better way to do this please let me know.

wilfredjonathanjames avatar Sep 12 '22 20:09 wilfredjonathanjames

@voxelstack Yes, kind of. It does work in that revision as far as I remember, so definitely your best starting point, but you'd have to rebase it onto master. Could be possible, could be a lot of work.

iFreilicht avatar Sep 29 '22 15:09 iFreilicht

Another option would be to treat non-c-style enums like structs, so they're wrapped in a class on the JavaScript side

bouk avatar Nov 04 '22 10:11 bouk

I think at a minimum, wasm-bindgen could copy what Serde currently lets you do. See this link: https://serde.rs/enum-representations.html, there are four possible representations:

// Externally tagged: enum variant is the key, enum content is the value
// Default representation
{"Request": {"id": "...", "method": "...", "params": {...}}}

// Internally tagged: the enum variant gets a key with the rest of the enum content
// `#[wasm_bindgen(tag = "type")]` on the Rust side
{"type": "Request", "id": "...", "method": "...", "params": {...}}

// Adjacently tagged: the enum content is represented within a separate key
// `#[wasm_bindgen(tag = "t", content = "c")]`
{"type": "Request", "c": {"id": "...", "method": "...", "params": {...}}}

// Untagged: no variant indicator, figure it out based on the content
// `#[wasm_bindgen(untagged)]`
{"id": "...", "method": "...", "params": {...}}

I think that there needs to be a couple possible representations, and mimicking the serde style gives a familiar (and proven) way to do that. In any case, exporting the enum variant constructors as js functions (e.g. MyEnum.C(42, 43)) also sounds like a good idea.

For anyone who needs a fast workaround, using serde_wasm_bindgen lets you pass enums to and from js like the examples above.

tgross35 avatar Mar 18 '23 06:03 tgross35

The last one (untagged) would not work for enum where some variants have values and others not and the first one might lead. The first one (internally tagged) may lead to problems with field name conflicts and as per serde documentation does not work with enums containing tuple variants.

So I think the second one (adjacently tagged) or the default which you did not mention (externally tagged) is best suited.

septatrix avatar Mar 18 '23 12:03 septatrix

I did miss externally tagged, thanks for pointing that out.

The point of my comment was more that I don't think wasm_bindgen does need to pick the "best" representation - instead it needs options, since no single representation is good in every case.

tgross35 avatar Mar 19 '23 04:03 tgross35

Although I agree that multiple options would be the best, I think we need to focus on getting the feature working first. TS/JS representation is something we can do solely on the JS side, which I believe is easier to modify than actually passing the necessary data back and forth.

I, unfortunately, still have no time to continue working on my PR, but would love to be involved if someone can.

celsobonutti avatar Mar 19 '23 14:03 celsobonutti

Is there a way to see if a JsValue is of an enum member declared in the rust code as of now or is everything like that to-do?

I have an enum that is used in the js code and then passed to the rust code on several other occasions, but I can't use the enum in the functions argument.

Anatoly03 avatar May 29 '23 16:05 Anatoly03

What’s the status on this? I think I need this implemented, unless there’s other alternatives/workarounds

ThatXliner avatar Sep 01 '23 20:09 ThatXliner

@ThatXliner as a general rule, if there are no other comments on an issue, the status hasn't changed.

A comment like "What's the status?" is rarely productive and notifies everyone in the conversation, causing noise and distraction, please consider not posting comments like that in the future in highly active repositories.

If you think someone should see this issue, send it to them or mention them here, ideally with an explanation.

If you want to express your support or desire for this feature, upvote the original post.

If you want to show how this feature is useful in a way that hasn't been considered in this thread yet, feel free to post a minimal example. Something like this can be useful for generating test cases.

Thank you.

iFreilicht avatar Sep 02 '23 12:09 iFreilicht

Correct me if I am wrong, but I believe this is just waiting on somebody to implement it (reads this as: it could be you!). It doesn’t seem like it would all that much work, especially since SerDe provides a good reference.

tgross35 avatar Sep 02 '23 20:09 tgross35

Probably needs a reviewer, I'm afraid this will be too big for me to find time to review. E.g. I didn't find the time to review https://github.com/rustwasm/wasm-bindgen/pull/3554 yet.

daxpedda avatar Sep 03 '23 17:09 daxpedda

Correct me if I am wrong, but I believe this is just waiting on somebody to implement it.

True, but so far nobody has done it. There was an attempt in https://github.com/rustwasm/wasm-bindgen/pull/2631, but the original author had no time to work on it further and it was never merged.

iFreilicht avatar Oct 02 '23 13:10 iFreilicht

Hey all, I've recently come across a need for this feature and I'm also between jobs and so have some spare time to work on something like this. However I'm not sure I understand what benefits this approach would provide over the one outlined in the documentation for communicating arbitrary data via serde_wasm_bindgen https://rustwasm.github.io/wasm-bindgen/reference/arbitrary-data-with-serde.html.

Especially since Rust structs exported to JS must be Copy https://rustwasm.github.io/wasm-bindgen/reference/types/exported-rust-types.html so any enum with String is out of the question and the serde_wasm_bindgen approach does not have this problem.

The following example appears to work just fine, and since the implementations of WasmDescribe, IntoWasmAbi, and FromWasmAbi are all mechanical delegations to serde_wasm_bindgen they could be easily produced with a simple macro

#[derive(Serialize, Deserialize)]
#[serde(tag = "type", content = "data")]
pub enum TestEnum {
    Foo { number: i32 },
    Bar { string: String },
}

impl From<TestEnum> for JsValue {
    fn from(val: TestEnum) -> Self {
        serde_wasm_bindgen::to_value(&val).unwrap()
    }
}

impl wasm_bindgen::describe::WasmDescribe for TestEnum {
    fn describe() {
        JsValue::describe()
    }
}

impl wasm_bindgen::convert::IntoWasmAbi for TestEnum {
    type Abi = <JsValue as IntoWasmAbi>::Abi;

    fn into_abi(self) -> Self::Abi {
        serde_wasm_bindgen::to_value(&self).unwrap().into_abi()
    }
}

impl wasm_bindgen::convert::FromWasmAbi for TestEnum {
    type Abi = <JsValue as FromWasmAbi>::Abi;

    unsafe fn from_abi(js: Self::Abi) -> Self {
        serde_wasm_bindgen::from_value(JsValue::from_abi(js)).unwrap()
    }
}

#[wasm_bindgen(js_name = "newTestEnum")]
pub fn new_test_enum() -> TestEnum {
    TestEnum::Foo { number: 42 }
}

#[wasm_bindgen(js_name = "printTestEnum")]
pub fn print_test_enum(test_enum: TestEnum) {
    match test_enum {
        TestEnum::Foo { number } => console_log!("Foo encountered with value: {}", number),
        TestEnum::Bar { string } => console_log!("Bar encountered with value: {}", string),
    }
}

The main limitation I can think of is that it won't produce Typescript type definitions for the Enum, you can define them manually https://rustwasm.github.io/wasm-bindgen/reference/attributes/on-js-imports/typescript_type.html, but keeping them in sync would be a pain. However I'm not sure this limitation warrants the effort. Maybe it'd be enough to outline this approach for enums in the documentation?

DylanRJohnston avatar Jan 15 '24 02:01 DylanRJohnston

Here's a procedural macro for it, the macro attributes get passed directly to serde so you can customize the the tagging convention.

// lib.rs
use proc_macro::TokenStream;

mod serde_wasm_bindgen;

#[proc_macro_attribute]
pub fn serde_wasm_bindgen(attr: TokenStream, item: TokenStream) -> TokenStream {
    serde_wasm_bindgen::expand_macro(attr.into(), item.into())
        .unwrap_or_else(syn::Error::into_compile_error)
        .into()
}
// serde_wasm_bindgen.rs
use proc_macro2::TokenStream;
use quote::{quote, ToTokens};
use syn::{spanned::Spanned, Error, Item};

pub fn expand_macro(serde_attr: TokenStream, tokens: TokenStream) -> syn::Result<TokenStream> {
    let item = syn::parse2::<Item>(tokens)?;
    match item {
        Item::Struct(it) => inner_macro(it.ident.clone(), serde_attr, it.to_token_stream()),
        Item::Enum(it) => inner_macro(it.ident.clone(), serde_attr, it.to_token_stream()),
        item => Err(Error::new(
            item.span(),
            "serde_wasm_bindgen macro can only be applied to structs or enums",
        )),
    }
}

fn inner_macro(
    ident: proc_macro2::Ident,
    serde_attr: TokenStream,
    tokens: TokenStream,
) -> Result<TokenStream, Error> {
    let pound = syn::Token![#](tokens.span()).to_token_stream();

    Ok(quote! {
      #pound[derive(Serialize, Deserialize)]
      #pound[serde(#serde_attr)]
      #tokens


      impl From<#ident> for JsValue {
        fn from(val: #ident) -> Self {
            serde_wasm_bindgen::to_value(&val).unwrap()
        }
      }

      impl TryFrom<JsValue> for #ident {
        type Error = serde_wasm_bindgen::Error;

        fn try_from(value: JsValue) -> Result<Self, Self::Error> {
            serde_wasm_bindgen::from_value(value)
        }
      }

      impl wasm_bindgen::describe::WasmDescribe for #ident {
        fn describe() {
            JsValue::describe()
        }
      }

      impl wasm_bindgen::convert::IntoWasmAbi for #ident {
        type Abi = <JsValue as IntoWasmAbi>::Abi;

        fn into_abi(self) -> Self::Abi {
            Into::<JsValue>::into(self).into_abi()
        }
      }

      impl wasm_bindgen::convert::FromWasmAbi for #ident {
        type Abi = <JsValue as FromWasmAbi>::Abi;

        unsafe fn from_abi(js: Self::Abi) -> Self {
            JsValue::from_abi(js).try_into().unwrap()
        }
      }
    })
}
#[serde_wasm_bindgen(tag = "type", content = "data")]
pub enum TestEnum {
    Foo { number: i32 },
    Bar { string: String },
}

#[wasm_bindgen(js_name = "printTestEnum")]
pub fn print_test_enum(test_enum: TestEnum) {
    match test_enum {
        TestEnum::Foo { number } => console_log!("Foo encountered with value: {}", number),
        TestEnum::Bar { string } => console_log!("Bar encountered with value: {}", string),
    }
}

I made enum -> JsValue infallible as I don't think this can panic, however JsValue -> enum definitely can fail, so I've implemented TryFrom. However for the FromWasmAbi if someone has called this function with the wrong type I don't think there's anything we can sensibly do other than panic (hence the unwrap).

DylanRJohnston avatar Jan 15 '24 03:01 DylanRJohnston

@DylanRJohnston I've created a macro that does this for me. It generates TypeScript based on serde and uses serde_wasm_bindgen for ABI conversions. Here' a blog post, the crate (/bomboni_wasm), and an example.

tinrab avatar Jan 19 '24 21:01 tinrab

It generates TypeScript based on serde and uses serde_wasm_bindgen for ABI conversions.

FWIW there's also tsify, which does this as well.

RReverser avatar Jan 30 '24 16:01 RReverser