Implement `serde::Serialize` on AST types via `#[generate_derive]`
We currently use `serde`'s derive macros to implement `Serialize` on AST types.
We could use `#[generate_derive]` to generate these impls instead.
Why is that a good thing?
1. Reduce compile time
`serde`'s macro is pretty expensive at compile time for the NAPI build. We can remove it.
2. Reduce boilerplate
`serde`'s derive macro is less powerful than `ast_tools`. Because `Serialize` is a macro, all it knows about is the type that `#[derive(Serialize)]` is on. Whereas `ast_tools` builds a schema of the entire AST, so it knows not just about the type it's deriving the impl for, but also about all the other types, and how they link to each other.
Currently we have to put `#[serde]` attributes everywhere:
```rust
#[ast]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
#[serde(tag = "type")]
pub struct ClassBody<'a> {
    #[serde(flatten)]
    pub span: Span,
    pub body: Vec<'a, ClassElement<'a>>,
}

#[ast]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
#[serde(tag = "type")]
pub struct PrivateIdentifier<'a> {
    #[serde(flatten)]
    pub span: Span,
    pub name: Atom<'a>,
}

#[ast]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
pub struct Span {
    start: u32,
    end: u32,
}
```
Instead, we can use `ast_tools` in 2 ways to remove this boilerplate:
- Make things that we implement on every type the defaults, so they don't need to be stated over and over.
- Use `ast_tools`'s knowledge of the whole AST to move the instruction to flatten `Span` onto the `Span` type itself. The "flatten this" instruction does not need to be repeated on every type that contains `Span`.
```rust
#[ast]
#[generate_derive(ESTree)]
pub struct ClassBody<'a> { // <-- no `#[serde(tag = "type")]` attr
    pub span: Span, // <-- no `#[serde(flatten)]` attr
    pub body: Vec<'a, ClassElement<'a>>,
}

#[ast]
#[generate_derive(ESTree)]
pub struct PrivateIdentifier<'a> { // <-- no `#[serde(tag = "type")]` attr
    pub span: Span, // <-- no `#[serde(flatten)]` attr
    pub name: Atom<'a>,
}

#[ast]
#[generate_derive(ESTree)]
#[estree(flatten)] // <-- `flatten` is here now
pub struct Span {
    start: u32,
    end: u32,
}
```
I think this is an improvement. How types are serialized is not core to the function of the AST. I don't see moving the serialization logic elsewhere as "hiding it away", but rather a nice separation of concerns.
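To make the idea concrete, here is a rough sketch of the kind of code the generator might emit for `PrivateIdentifier`. It is not the actual `ast_tools` output: the `ESTree` trait, the `serialize_flattened` helper and the `String`-based JSON writing are all assumptions made up for illustration (a real version would more likely implement `serde::Serialize`), and `name` is a plain `String` instead of `Atom<'a>` to keep the example self-contained.

```rust
// Hypothetical stand-in for whatever trait `#[generate_derive(ESTree)]` would implement.
trait ESTree {
    /// Append this node's JSON representation to `out`.
    fn serialize(&self, out: &mut String);
}

struct Span {
    start: u32,
    end: u32,
}

impl Span {
    // Imagined output for a type marked `#[estree(flatten)]`: it writes bare
    // key/value pairs into whichever object contains the span.
    fn serialize_flattened(&self, out: &mut String) {
        out.push_str(&format!("\"start\":{},\"end\":{}", self.start, self.end));
    }
}

struct PrivateIdentifier {
    span: Span,
    name: String, // `Atom<'a>` in the real AST
}

// Imagined generated impl. Nothing on `PrivateIdentifier` itself asks for the
// `type` tag or for flattening; both come from defaults and from the schema.
impl ESTree for PrivateIdentifier {
    fn serialize(&self, out: &mut String) {
        // The `type` tag is the default for every struct.
        out.push_str("{\"type\":\"PrivateIdentifier\",");
        // The schema knows `Span` is flattened, so its fields are inlined here.
        self.span.serialize_flattened(out);
        // `{:?}` adds quotes and Rust-style escapes. Good enough for a sketch,
        // but not a complete JSON string escaper.
        out.push_str(&format!(",\"name\":{:?}}}", self.name));
    }
}
```

Serializing an identifier named `foo` with span 0..4 this way gives `{"type":"PrivateIdentifier","start":0,"end":4,"name":"foo"}`.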
3. Open the door to different serializations
In the example above, `Serialize` has been replaced by `ESTree`. This is to allow for different serialization methods in future. For example:
Different serializers for plain JS AST and TS AST
When serializing a plain JS file, we could produce JSON which skips all the TS fields, to make an AST which exactly aligns with canonical ESTree. We'd add a `#[ts]` attribute to all TS-related fields, and the `ESTreeJS` serializer would skip those fields. This would make the AST faster to deserialize on the JS side.
The other advantage is the TS-less AST should perfectly match classic ESTree, so we can test it in full using Acorn's test suite.
Users who are not interested in type info can also request the cheaper JS-only AST, even when parsing TS code.
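A rough sketch of how the two serializers could differ, assuming a simplified `Identifier` node. The struct layout, the method names `serialize_js` / `serialize_ts`, and the use of a plain string for the annotation are all hypothetical. In the real AST the annotation would be a proper node, and the two methods would be generated from the `#[ts]` attribute rather than written by hand.

```rust
// Hypothetical, simplified node for illustration only.
struct Identifier {
    start: u32,
    end: u32,
    name: String,
    // Imagined to carry the proposed `#[ts]` attribute.
    type_annotation: Option<String>,
}

impl Identifier {
    // Shared by both serializers: every field that is NOT marked `#[ts]`.
    fn write_js_fields(&self, out: &mut String) {
        out.push_str(&format!(
            "\"type\":\"Identifier\",\"start\":{},\"end\":{},\"name\":{:?}",
            self.start, self.end, self.name
        ));
    }

    /// What an `ESTreeJS` serializer might generate: `#[ts]` fields are skipped.
    fn serialize_js(&self) -> String {
        let mut out = String::from("{");
        self.write_js_fields(&mut out);
        out.push('}');
        out
    }

    /// What a TS-aware serializer might generate: `#[ts]` fields are included.
    fn serialize_ts(&self) -> String {
        let mut out = String::from("{");
        self.write_js_fields(&mut out);
        match &self.type_annotation {
            Some(ty) => out.push_str(&format!(",\"typeAnnotation\":{:?}", ty)),
            None => out.push_str(",\"typeAnnotation\":null"),
        }
        out.push('}');
        out
    }
}
```

The JS-only output has no `typeAnnotation` key at all, which is what makes it smaller and lets it line up with classic ESTree.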
Serialize to other AST variants
e.g. `#[generate_derive(Babel)]` to serialize to a Babel-compatible JSON AST.

```js
const {program} = parse(code, {flavor: 'babel'});
```
Not sure if this is useful, but this change makes it a possibility if we want to.
4. Simplify implementation of custom serialization
Currently we have pretty complex custom `Serialize` impls for massaging Oxc's AST into an ESTree-compatible shape in `oxc_ast/src/serialize.rs`.
We can remove most of them if we use `ast_tools` to generate `Serialize` impls for us, guiding it with attributes on the AST types themselves:
```rust
#[ast]
#[generate_derive(ESTree)]
pub struct ObjectPattern<'a> {
    pub span: Span,
    pub properties: Vec<'a, BindingProperty<'a>>,
    #[estree(append_to_previous)]
    pub rest: Option<Box<'a, BindingRestElement<'a>>>,
}
```
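As an illustration of what `#[estree(append_to_previous)]` could expand to, here is a hand-written sketch. It assumes the attribute means "do not emit a separate `rest` key, push the value onto the array of the previous field instead", which matches ESTree keeping the rest element inside `properties`. The node types are cut down to a single string field each and spans are omitted, so everything except the type names is hypothetical.

```rust
// Hypothetical simplified nodes; the real ones hold full patterns and spans.
struct BindingProperty {
    key: String,
}

struct BindingRestElement {
    argument: String,
}

struct ObjectPattern {
    properties: Vec<BindingProperty>,
    // Imagined to carry `#[estree(append_to_previous)]`.
    rest: Option<BindingRestElement>,
}

impl ObjectPattern {
    // Sketch of the generated serializer.
    fn serialize(&self) -> String {
        let mut items: Vec<String> = self
            .properties
            .iter()
            .map(|p| format!("{{\"type\":\"Property\",\"key\":{:?}}}", p.key))
            .collect();
        // `append_to_previous`: the rest element becomes the last entry of
        // the `properties` array instead of getting its own key.
        if let Some(rest) = &self.rest {
            items.push(format!(
                "{{\"type\":\"RestElement\",\"argument\":{:?}}}",
                rest.argument
            ));
        }
        format!(
            "{{\"type\":\"ObjectPattern\",\"properties\":[{}]}}",
            items.join(",")
        )
    }
}
```

For a pattern like `{a, ...b}` this produces a single `properties` array containing the `Property` for `a` followed by the `RestElement` for `b`.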
5. Simplify AST transfer code
AST transfer's JS-side deserializer (and eventually serializer too) can be simplified in the same way: we can generate code for the JS-side deserializer which matches the Rust-side one exactly, without writing the same logic twice and having to keep the two in sync.
6. TS type generation
The "massaging" of the Rust AST that we do to turn it into an ESTree-compatible JSON AST is now encoded as static attributes. We can use this to generate TS types, and we can get rid of `Tsify`.
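A minimal sketch of how TS types could fall out of the same information. The `StructDef` / `FieldDef` types are a made-up miniature of the schema, not the real `ast_tools` data structures, and the Rust-to-TS type mapping only covers the two types used in the example.

```rust
// Hypothetical miniature schema, for illustration only.
struct FieldDef {
    name: &'static str,
    ty: &'static str,
}

struct StructDef {
    name: &'static str,
    flatten: bool, // true when the type carries `#[estree(flatten)]`
    fields: Vec<FieldDef>,
}

// Map a Rust type name to a TS type name; unknown names pass through unchanged.
fn ts_type(ty: &str) -> &str {
    match ty {
        "u32" => "number",
        "Atom" => "string",
        other => other,
    }
}

// Emit one TS property per field, inlining the fields of flattened types.
fn push_fields(def: &StructDef, schema: &[StructDef], out: &mut String) {
    for field in &def.fields {
        match schema.iter().find(|s| s.name == field.ty && s.flatten) {
            Some(inner) => push_fields(inner, schema, out),
            None => out.push_str(&format!("  {}: {};\n", field.name, ts_type(field.ty))),
        }
    }
}

fn generate_ts_interface(def: &StructDef, schema: &[StructDef]) -> String {
    // The `type` tag is a default, so it is always emitted first.
    let mut out = format!("interface {} {{\n  type: \"{}\";\n", def.name, def.name);
    push_fields(def, schema, &mut out);
    out.push_str("}\n");
    out
}

// Schema for the `Span` and `PrivateIdentifier` examples above.
fn example_schema() -> Vec<StructDef> {
    vec![
        StructDef {
            name: "Span",
            flatten: true,
            fields: vec![
                FieldDef { name: "start", ty: "u32" },
                FieldDef { name: "end", ty: "u32" },
            ],
        },
        StructDef {
            name: "PrivateIdentifier",
            flatten: false,
            fields: vec![
                FieldDef { name: "span", ty: "Span" },
                FieldDef { name: "name", ty: "Atom" },
            ],
        },
    ]
}
```

Calling `generate_ts_interface` on the `PrivateIdentifier` entry yields an interface with `type`, `start`, `end` and `name` properties, i.e. the same shape the serializer would produce, because both are driven by the same attributes.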
How difficult is this?
`serde`'s derive macro looks forbiddingly complex. But this is because it handles every conceivable case, almost all of which we don't use. The output it generates for our AST types is actually not so complicated.
So I don't think creating a codegen for `impl Serialize` would be too difficult.
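To give a feel for the size of the task, here is a toy generator that emits the source text of an `impl Serialize` from a small schema. The schema types and function names are hypothetical, not the real `ast_tools` ones. The emitted code uses `serde`'s `serialize_map` / `serialize_entry` / `end` methods, and it handles flattening by writing the inner fields directly (which assumes the generated code can access them), so no `#[serde(flatten)]` machinery is needed.

```rust
// Hypothetical miniature schema, for illustration only.
struct FieldDef {
    name: &'static str,
    ty: &'static str,
}

struct StructDef {
    name: &'static str,
    has_lifetime: bool,
    flatten: bool, // true when the type carries `#[estree(flatten)]`
    fields: Vec<FieldDef>,
}

// Emit one `serialize_entry` call per field. `path` is the Rust expression
// that reaches `def` from `self` (e.g. `self` or `self.span`).
fn push_entries(def: &StructDef, path: &str, schema: &[StructDef], out: &mut String) {
    for field in &def.fields {
        let field_path = format!("{}.{}", path, field.name);
        match schema.iter().find(|s| s.name == field.ty && s.flatten) {
            // Flattened type: recurse and write its fields into the same map.
            Some(inner) => push_entries(inner, &field_path, schema, out),
            None => out.push_str(&format!(
                "        map.serialize_entry(\"{}\", &{})?;\n",
                field.name, field_path
            )),
        }
    }
}

fn generate_serialize_impl(def: &StructDef, schema: &[StructDef]) -> String {
    let lifetime = if def.has_lifetime { "<'_>" } else { "" };
    let mut out = format!("impl Serialize for {}{} {{\n", def.name, lifetime);
    out.push_str(
        "    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {\n",
    );
    out.push_str("        let mut map = serializer.serialize_map(None)?;\n");
    // The `type` tag is a default applied to every struct.
    out.push_str(&format!(
        "        map.serialize_entry(\"type\", \"{}\")?;\n",
        def.name
    ));
    push_entries(def, "self", schema, &mut out);
    out.push_str("        map.end()\n    }\n}\n");
    out
}
```

Given a schema where `Span` is marked as flattened and `PrivateIdentifier` has `span: Span` and `name: Atom` fields, the generated impl contains entries for `type`, `start` (read from `self.span.start`), `end` and `name`, and no `span` entry.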