wit-bindgen icon indicating copy to clipboard operation
wit-bindgen copied to clipboard

wit records defined with reserved rust keywords cannot be used with serde additional derives

Open kwhitehouse opened this issue 4 months ago • 4 comments

Summary of Issue

When defining a wit record with a reserved rust keyword, and making use of serde as an additional derives, the generated rust bindings.rs uses an _ suffix on field names which prevents serde from being able to correctly deserialize valid JSON blobs.

Steps to Reproduce

There's probably a simpler way to reproduce this, but I'm most familiar with components, so that's what I've documented here 😅

Create two new components

The following commands will create a new lib component and a new command component. Each of these will commands will generate a new directory structure in your current working directory.

% cargo component new --lib lib
% cargo component new main

Create a file bug-report.wit

Create this file within your current working directory. We will reference it in the following Cargo.toml files.

Note that the cargo component command will actually create a default wit file beneath the lib directory. You can delete this. In this simple POC, we'll use a single wit for both the lib component and the command component.

⚠️ Important: The key piece here is that we've a field within a record called type, which happens to be a reserved keyword in both wit and rust.

% cat bug-report.wit 
package component:bug-report;

interface example-interface {
    record record-with-keyword {
        %type: string,
    }

    hello-world: func() -> record-with-keyword;
}

world example-world-export {
  export example-interface;
}

world app {
  import example-interface;
}

Update lib

Update lib/Cargo.toml

% cat lib/Cargo.toml 
[package]
name = "lib"
version = "0.1.0"
edition = "2021"

[package.metadata.component.target]
path = "../bug-report.wit"
world = "example-world-export"

[dependencies]
bitflags = "2.5.0"
wit-bindgen-rt = "0.24.0"
serde_json = "1.0.115"
serde = { version = "1.0.197", features = ["derive"] }

[lib]
crate-type = ["cdylib"]

[package.metadata.component]
package = "component:bug-report"

[package.metadata.component.dependencies]

[package.metadata.component.bindings]
derives = ["serde::Serialize", "serde::Deserialize"]

Update lib/src/lib.rs

⚠️ Important: The key piece here is that we've a JSON blob that contains a field type. As we'll see, this fails to deserialize because the generated rust code expects a field named type_.

% cat lib/src/lib.rs 
#[allow(warnings)]
mod bindings;

use crate::bindings::exports::component::bug_report::example_interface::Guest;
use crate::bindings::exports::component::bug_report::example_interface::RecordWithKeyword;

struct Component;

impl Guest for Component {
    fn hello_world() -> RecordWithKeyword {
         let json = "
         {
             \"type\": \"expected field type is not present\"
         }";
         serde_json::from_str(json).unwrap()
    }
}

Update main

Update main/Cargo.toml

% cat main/Cargo.toml 
[package]
name = "main"
version = "0.1.0"
edition = "2021"

[package.metadata.component]
package = "component:bug-report-main"

[package.metadata.component.target]
path = "../bug-report.wit"
world = "app"

[package.metadata.component.dependencies]

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
bitflags = "2.5.0"
wit-bindgen-rt = "0.24.0"

Update main/src/main.rs

% cat main/src/main.rs 
#[allow(warnings)]
mod bindings;

use bindings::component::bug_report::example_interface::hello_world;

fn main() {
    hello_world();
    println!("Hello, world!");
}

Build

% (cd lib && cargo component build --release)
% (cd main && cargo component build --release)
% wasm-tools compose main/target/wasm32-wasi/release/main.wasm -d lib/target/wasm32-wasi/release/lib.wasm -o command.wasm

Run

% wasmtime run command.wasm

Result

You'll see a runtime failure complaining of a "missing field type_".

thread '<unnamed>' panicked at src/lib.rs:16:37:
called `Result::unwrap()` on an `Err` value: Error("missing field `type_`", line: 5, column: 10)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error: failed to run main module `command.wasm`

This failure is a result of trying to deserialize a json blob, which contains a field named type, into a rust struct which expects a field named type_.

Root Cause

You'll notice that in lib/src/bindings.rs, the struct RecordWithKey contains a field named type_.

pub struct RecordWithKeyword {
      pub type_: _rt::String,
}

The type word is a reserved word in both wit and rust. In wit, we needed to prefix the field with the % character. In rust, adding the _ suffix obviously compiles, but it causes an issue with our serde deserialization, since serde now expects the json blob to also contain a key the exactly matches the field name, including the _ suffix.

Proposed Fix

For rust keywords, would it be possible to have the generated bindings.rs file use raw identifiers instead of the _ suffix? I believe this would resolve our issue, since serde should correctly manage rust raw identifiers.

I'm not sure how else to work around this problem - if there's a simple fix I can use on my end would love to learn about that for the short-term 😄

kwhitehouse avatar Apr 11 '24 07:04 kwhitehouse

Thanks for the report! Using raw identifiers would be one solution but I'd personally prefer to go the route of adding #[serde(rename)] annotations to fields that were renamed from their origin WIT. That being said that also raises a question as to whether we should #[serde(rename)] to the kebab-cased variants in WIT itself as opposed to using the snake-cased names in Rust...

alexcrichton avatar Apr 12 '24 17:04 alexcrichton

Would prefer rename instead of raw identifiers. We are also encountering same issue. It would be great if someway we can control mapping of individual fields as well

sehz avatar Apr 12 '24 18:04 sehz

I'm going to throw my hat in the ring for using raw identifiers. It's not the most pretty solution, but I do think it strikes the right balance of reflecting the wit identifiers while being idiomatic. The true idiomatic approach in Rust is to avoid using keywords as identifiers or to use alternate spellings (e.g., krate instead of crate, typ instead of type), and such an approach is impossible to generalize.

Allowing for #[serde(rename)] is definitely something we'll want in addition, but I think it's a separate feature, and I think in order to truly implement such a feature, we're likely to want annotation support in wit for specifying extra optional properties about bindings.

rylev avatar Apr 16 '24 08:04 rylev

That's a good point that a trailing underscore isn't idiomatic, so given that I think you're right that r#foo may be the way to go here instead.

alexcrichton avatar Apr 16 '24 20:04 alexcrichton