oxc icon indicating copy to clipboard operation
oxc copied to clipboard

Refactor `BindingPattern` to remove weirdness

Open overlookmotel opened this issue 6 months ago • 6 comments

Our JS-side AST for TypeScript files is now aligned with TS-ESLint (#9705).

However, the generated JSON isn't quite right - it contains repeated properties.

e.g. the AST for let [x] = y; contains repeated optional and typeAnnotation properties:

{
  "type": "ArrayPattern",
  "start": 4,
  "end": 7,
  "decorators": [],
  "elements": [
    {
      "type": "Identifier",
      "start": 5,
      "end": 6,
      "decorators": [],
      "name": "x",
      "optional": false,
      "typeAnnotation": null,
      "typeAnnotation": null,
      "optional": false
    }
  ],
  "optional": false,
  "typeAnnotation": null,
  "typeAnnotation": null,
  "optional": false
}

Playground (tick the "Raw" box to see the raw JSON)

Currently the TS-ESTree conformance tests are designed to ignore this. But the problem is widespread - when the duplicate fields are not ignored, conformance pass rate goes down from 99.9% to only 20%.

This is not a critical problem because JSON.parse deals with the duplicate properties, so everything works. However it very likely is making JSON.parse slower than it should be.

When using raw transfer, the same kind of thing is happening - properties are being initialized twice. This is very likely slowing down raw transfer a lot.

The cause

The main cause is BindingPattern.

It has its own optional and type_annotation fields:

https://github.com/oxc-project/oxc/blob/7e884511bda981f85ff3472f748134ab60c3aec8/crates/oxc_ast/src/ast/js.rs#L1524-L1536

However, it also merges in the properties of BindingPatternKind:

https://github.com/oxc-project/oxc/blob/7e884511bda981f85ff3472f748134ab60c3aec8/crates/oxc_ast/src/ast/js.rs#L1541-L1553

Each of the variants of BindingPatternKind also contains optional and typeAnnotation fields.

https://github.com/oxc-project/oxc/blob/7e884511bda981f85ff3472f748134ab60c3aec8/crates/oxc_ast/src/ast/js.rs#L261-L278

We can't remove these fields from BindingIdentifier, because they need to be present in the other places where BindingIdentifier is used e.g. Function::id, Class::id, ImportSpecifier::local.

There may also be other places with duplicate fields, but BindingPattern is definitely the main cause.

The problems with BindingPattern

BindingPattern is a real oddity in the AST:

  • It's the only AST node which doesn't have its own Span.
  • It's the only case where an enum has to be flattened in ESTree AST.
  • It results in unintutive spans e.g. the BindingIdentifier for x in let x: T = foo(); has a span of length 4 (covering x: T, not just x).

In short, it's weird! It behaves unlike any other type in the AST.

Working around these oddities complicates code in various places in the codebase, and has been a source of bugs in the past. It'll probably continue to be a source of bugs in future.

Additionally, there are also places in the AST where we're using BindingPattern but it's not legal to have type annotations. e.g. this is not legal syntax:

let [x: number, ...y: Array<number>] = arr;

TS playground

Because BindingPattern is so common in AST, if we can avoid the extra 16 bytes for type_annotation and optional fields where they're always None and false, it will be a measurable memory saving.

How to fix?

We could add custom serializers for BindingPattern, BindingIdentifier, ObjectPattern, ArrayPattern etc to solve the problem in TS-ESTree AST. However, because they're so common in the AST, this would be really complicated and messy.

I feel this would be treating the symptom rather than the cause.

In my opinion, the best approach is to bite the bullet and alter the Rust types - remove the weirdness at source, rather than adding a further layer of workarounds on top of it.

What would this look like?

I'm not sure exactly what would be required, but possibly we'd need something like:

  • Separate BindingIdentifier and TypedBindingIdentifer AST types.
  • Only TypedBindingIdentifer would have type_annotation and optional fields.
  • Other AST types use whichever type is legal in each situation.

It may also be possible to avoid the need for a TypedBindingIdentifer type and instead move these fields to types which contain a BindingPattern e.g. VariableDeclarator.

What's the downside?

BindingPattern and friends are widely used. Changing them will require updating a lot of downstream code.

When?

As noted above, this is not a critical problem for ESTree serialization, just a perf hit.

However, it may prove a blocker to implementing an AST walker with lazy deserialization, because there you need to be able to deserialize each field individually. So having 2 possible sources for the value of a field isn't workable.

I am hoping I can find another way around that problem. Assuming I can, I propose we push this change back to a later date, unless it's unavoidable.

overlookmotel avatar Jun 04 '25 19:06 overlookmotel

NOTE: Currently, the duplicate keys that can be observed in the output ESTree JSON are as follows:

  • Identifier: decorators
  • Identifier: optional
  • Identifier: typeAnnotation
  • ArrayPattern: decorators
  • ArrayPattern: optional
  • ArrayPattern: typeAnnotation
  • ObjectPattern: decorators
  • ObjectPattern: optional
  • ObjectPattern: typeAnnotation
  • AssignmentPattern: decorators
  • AssignmentPattern: optional
  • AssignmentPattern: typeAnnotation

leaysgur avatar Jun 05 '25 02:06 leaysgur

Thanks for checking. So sounds like it does all relate to BindingPattern.

overlookmotel avatar Jun 05 '25 08:06 overlookmotel

Discussed with some of core team just now. I'm going to investigate and try to come up with a new set of Rust types which solves the various problems discussed above. We can then evaluate it and decide if we want to move forward with it (at some point).

overlookmotel avatar Jun 05 '25 10:06 overlookmotel

#11500 fixes the duplicate properties in TS-ESTree AST. I've used custom serializers, which is an unwelcome complication, but it solves that specific problem.

Leaving this open for refactoring the Rust types.

overlookmotel avatar Jun 05 '25 14:06 overlookmotel

Actually, I think changes required to Rust types are much simpler than I'd initially thought.

Observations

There are only 4 "entry points" into BindingPattern (excluding BindingPatterns within BindingPatterns):

  • VariableDeclarator
  • CatchParameter
  • FormalParameter
  • FormalParameters rest field

It's only the outermost BindingPattern which can have a type annotation.

  • const [x]: T = foo(); is valid.
  • const [x: T] = foo(); is not.

Changes

  1. Remove BindingPattern.

  2. Rename BindingPatternKind to BindingPattern (so BindingPattern is now an enum).

  3. Add type_annotation and optional fields to the "entry points".

pub struct VariableDeclarator<'a> {
    pub span: Span,
    pub kind: VariableDeclarationKind,
    pub id: BindingPattern<'a>,
+   #[ts]
+   pub type_annotation: Option<Box<'a, TSTypeAnnotation<'a>>>,
+   // Cannot be optional, so no `optional` field
    pub init: Option<Expression<'a>>,
    pub definite: bool,
}

pub struct CatchParameter<'a> {
    pub span: Span,
    pub pattern: BindingPattern<'a>,
+   #[ts]
+   pub type_annotation: Option<Box<'a, TSTypeAnnotation<'a>>>,
+   // Cannot be optional, so no `optional` field
}

pub struct FormalParameter<'a> {
    pub span: Span,
    pub decorators: Vec<'a, Decorator<'a>>,
    pub pattern: BindingPattern<'a>,
+   #[ts]
+   pub type_annotation: Option<Box<'a, TSTypeAnnotation<'a>>>,
+   #[ts]
+   pub optional: bool,
    pub accessibility: Option<TSAccessibility>,
    pub readonly: bool,
    pub r#override: bool,
}

pub struct FormalParameters<'a> {
    pub span: Span,
    pub kind: FormalParameterKind,
    pub items: Vec<'a, FormalParameter<'a>>,
-   pub rest: Option<Box<'a, BindingRestElement<'a>>>,
+   pub rest: Option<Box<'a, FormalParameterRest<'a>>>,
}

+pub struct FormalParameterRest<'a> {
+   pub span: Span,
+   pub rest: BindingRestElement<'a>,
+   #[ts]
+   pub type_annotation: Option<Box<'a, TSTypeAnnotation<'a>>>,
+   // Cannot be optional, so no `optional` field
+}

Job done!

Conclusion

This would reduce the size of VariableDeclarator and FormalParameter by 8 bytes. BindingPattern shrinks by 16 bytes.

This seems fairly simple. Am I missing something?

overlookmotel avatar Jun 05 '25 23:06 overlookmotel

Double checked with tsc, your conclusion seems correct.

Boshen avatar Jun 06 '25 01:06 Boshen

I tried getting claude to look at this. But after looking at it's work, I think we need to make further AST changes.

I think we need to make further changes. Currently it's kind of weird with FormalParameter having the type_annotation, but that goes between the BindingPattern::AssignmentPattern components. For example:

const [x = 1] = arr;
//     ^^^^^^ assignment pattern, but type annotations aren't allowed here
const { a: x2 = 1 } = obj;
//      ^^^^^^^^^ assignment pattern, but type annotations aren't allowed here
function f(x3: number = 1) {}
//         ^^^^^^^^^^^^^^ only place where assignment pattern can have a type annotation
//         x3: number
//         ^^^^^^^^^^^ type annotation goes between identifier and initializer

I think we should follow TypeScript's AST approach and make FormalParameter have an initializer property (Option<Expression>), and create a new enum for FormalParameter's binding pattern which omits AssignmentPattern. This way:

  • The type annotation belongs to FormalParameter (as it currently does)
  • The initializer also belongs to FormalParameter (making the structure: identifier → type annotation → initializer)
  • AssignmentPattern is only used in destructuring contexts where type annotations aren't allowed anyway This would better reflect the actual grammar and avoid the weird nesting where type annotations conceptually split an AssignmentPattern.

camc314 avatar Nov 20 '25 16:11 camc314

This seems fairly simple. Am I missing something?

It seems the answer is yes! Oh lord...

overlookmotel avatar Nov 20 '25 23:11 overlookmotel

This seems fairly simple. Am I missing something?

It seems the answer is yes! Oh lord...

Good news! I've almost finished this

Bad news! You've got to review it, and it's huge 🙂

camc314 avatar Nov 20 '25 23:11 camc314

Oh man, I'm never going to get the rule-tester done...

overlookmotel avatar Nov 20 '25 23:11 overlookmotel