check symbols and scopes after transformation
cc @overlookmotel how exactly do we want to check symbols and scopes?
Process
I suggest the following process:
- Run parser on source to produce AST (AST v1).
- Run semantic on AST.
- Run transformer on AST.
- Store
ScopeTreeandSymbolTablefrom after transform (Semantic v1). - Clone AST (making AST v2).
- ~~Traverse AST v2 and set all
scope_id,symbol_idandreference_idfields toNone. i.e. AST v2 is now returned to virgin state like it would have been if it had come fresh out of the parser.~~ - Run semantic on AST v2. Store the new
ScopeTreeandSymbolTable(Semantic v2). - Compare
scope_id,symbol_idandreference_idfields between AST 1 and AST 2. They should match. - Compare
ScopeTree+SymbolTablev1 and v2. They should also match.
i.e. AST v2 and ScopeTree + SymbolTable v2 are how they should be. Make sure that state of both AST and Semantic after transform (v1) matches that.
The complication
The tricky part is what constitutes "matches".
For example if this is the input:
if (x) enum Foo {}
function f() {}
The output of transformer is:
if (x) {}
function f() {}
The scope IDs are:
Before transform:
// Scope ID 0
if (x) enum Foo { /* Scope ID 1 */ }
function f() { /* Scope ID 2 */ }
After transform:
// Scope ID 0
if (x) { /* Scope ID 3 */ } // <-- newly created scope
function f() { /* Scope ID 2 */ }
vs fresh semantic run on post-transform AST:
// Scope ID 0
if (x) { /* Scope ID 1 */ } // <-- numbered 1 as it's 2nd scope found in visitation order
function f() { /* Scope ID 2 */ }
Scope IDs of the if {} block are different in the 2 versions, because in the post-transform version, that scope was newly created in transformer.
But we don't want to throw an error on this because they are equivalent.
Likely same kind of thing will happen with SymbolIds and ReferenceIds.
Suggested solution
Traverse AST v1 and AST v2 in tandem and:
- Check that every node which has a
ScopeIdin AST v1 also has one in AST v2, and vice versa. - Build a hash map mapping from v1
ScopeIdto v2ScopeId. So, in example above v1 Scope ID 3 -> v2 Scope ID 1.
When comparing v1 ScopeTree to v2 ScopeTree, use the mapping to translate IDs before checking the scope trees match.
Ditto for SymbolIds and ReferenceIds.
Conclusion
So this is all a bit of a pain! But, if we can build this test infra, it will give us extremely good test coverage and we can be very confident that transformer is keeping scopes/symbols in sync correctly.
Side note: I'd imagine running these checks as part of transformer conformance, so we get instant feedback on PRs if they foul up scopes.
I would be happy to take this is on if you like, Boshen.
Just a note, we may need to check duplicate AstNodeId, ScopeId, and SymbolId in AST. related to https://github.com/oxc-project/oxc/issues/4809
Borrowing the idea of the mangler, we can check whether symbols are the same in scope visitation order.
Just a note, we may need to check duplicate AstNodeId, ScopeId, and SymbolId in
AST. related to #4809
This feels more related to #4804