Make `AstKind` consistent
To support printing comments in formatter and codegen, we are discussing adding a CommentNodeId field to every AST type (#11213).
However, what we really want is standard NodeId on every AST node.
Why?
At present, when you have an AstNode, via AstKind you can get the actual AST type (e.g. BinaryExpression). However you can't go in the opposite direction e.g. start with a BinaryExpression, obtain its NodeId, and use Semantic to e.g. get its parent node.
If we store NodeId in all AST structs, we'll be able to go in both directions, which allows moving around and querying the AST much more easily.
- https://github.com/oxc-project/oxc/discussions/5689
- https://github.com/oxc-project/oxc/issues/11048
What does AstKind have to do with this?
The main blocker to achieving this is that AstKind is rather inconsistent:
- Most AST types which are structs have a corresponding
AstKind, but some don't. - Most AST types which are enums do not have a corresponding
AstKind, but a few do!
Only types which have an AstKind get a NodeId. So, to align NodeId in Semantic with NodeIds stored in AST structs, we need to standardize what types have AstKinds.
What we need to do
We need to:
- Add
AstKinds for all AST structs (e.g.StaticMemberExpression). - Remove
AstKinds for all AST enums (e.g.MemberExpression).
How to do it
oxc_ast_tools contains 2 lists of the exceptions to the above rules:
https://github.com/oxc-project/oxc/blob/7e884511bda981f85ff3472f748134ab60c3aec8/tasks/ast_tools/src/generators/ast_kind.rs#L28-L83
We need to reduce both these lists to zero.
We can approach this goal in small steps.
e.g. for MemberExpression:
- Add
AstKinds forComputedMemberExpression,StaticMemberExpressionandPrivateFieldExpression(by removing them fromSTRUCTS_BLACK_LIST). - Remove
AstKindforMemberExpression(by removing it fromENUMS_WHITE_LIST). - Run
cargo run -p oxc_ast_toolsto regenerate the definition ofAstKindandVisitetc. - Update all code which used the old
AstKind::MemberExpressionto use the newAstKind::StaticMemberExpressionetc instead.
The changes will be mostly in the linter, which is the heaviest user of AstKind.
e.g. in eslint/no-caller rule:
https://github.com/oxc-project/oxc/blob/7e884511bda981f85ff3472f748134ab60c3aec8/crates/oxc_linter/src/rules/eslint/no_caller.rs#L77-L85
- if let AstKind::MemberExpression(MemberExpression::StaticMemberExpression(expr)) = node.kind() {
+ if let AstKind::StaticMemberExpression(expr) = node.kind() {
If anyone would like to join in this effort, please comment below with which types you plan to work on, to avoid duplicated work.
Note: The MemberExpression example above is probably one of the harder ones. Adding AstKinds for many of the other types will likely be much easier. e.g. JSXOpeningFragment and other JSX types will probably require far fewer code changes. Those are likely a good place to start.
#11535 is my attempt to do this for all jsx types.
#11597 creates AstKinds for all jsdoc types
Just one note: I realized there is one struct we do need to keep in the black list: Span. That's a special case. It's not an AST node, so it shouldn't have an AstKind (and it'd be a major slowdown if it did).
#11617 removes ForStatementInit from AstKind
If anyone would like to try getting rid of AstKind::ArrayExpressionElement, that'd be a big help.
It's the only blocker on implementing GetAddress on AstKind, because ArrayExpressionElement has 1 variant (Elision) which isn't boxed. GetAddress on enums requires that all variants are Box<'a, Something>, so it's not implemented on ArrayExpressionElement.
If anyone would like to try getting rid of AstKind::ArrayExpressionElement, that'd be a big help.
Done in #11684
I have opened PRs to add AstKind to the following:
- TSTypePredicate #11726
- TSIndexSignature #11724
- TSCallSignatureDeclaration #11725
And a PR for removing AstKind from:
- TSModuleReference #11732
I'm planning to work on TSArrayType.
I'm going to try and start working on ComputedMemberExpression. It will take me a little while, as it looks like there are 24 lint rules that need to be rewritten. If I can't finish, I'll put up a draft PR someone else can take over. I'm going to work on adding it as its own AstKind.
ComputedMemberExpression is done: #11766, I'm now working on StaticMemberExpression: #11767.
Okay, so it turns out that untangling StaticMemberExpression is a lot. It's kind of looking like I might just need to do MemberExpression as a whole. Hopefully our test coverage is comprehensive enough to catch any mistakes. I'll see what I can do.
I'm gonna take a stab at removing AstKind for AssignmentTarget
Actually, I don't think we should add an AstKind for BindingPattern. Plan is to turn BindingPattern into an enum - see https://github.com/oxc-project/oxc/issues/11489#issuecomment-2946791520.
So at that point, it'd lose its AstKind again. BindingPattern is quite ubiquitous, so probably it would involve a lot of work both to add an AstKind, and to remove it again. So I suggest we spend the time turning it into an enum instead (though that's not going to be trivial either).
@camchenry @taearls @ulrichstark @therewillbecode @camc314 Sorry to spam you all but just want to make sure no-one embarks on what'd be a lot of work, and has their time wasted. I should have spotted this before - glad no-one did it already!
If no one else has started on it (atm don't see any linked PRs), I can try removing AstKind from SimpleAssignmentTarget over the next couple of days.
That'd be brilliant. I don't believe anyone else is working on SimpleAssignmentTarget, but please speak up if anyone is.
That and Argument (#12127) are the last ones. Then just BindingPattern to sort out, and we're done!
AstKind::Argument is gone! Champagne for everyone!
Now just BindingPattern to do...