oxc check symbols and scopes after transformation

cc @overlookmotel how exactly do we want to check symbols and scopes?

Aug 07 '24 09:08 Boshen

Process

I suggest the following process:

1. Run parser on source to produce AST (AST v1).
2. Run semantic on AST.
3. Run transformer on AST.
4. Store ScopeTree and SymbolTable from after transform (Semantic v1).
5. Clone AST (making AST v2).
6. ~~Traverse AST v2 and set all scope_id, symbol_id and reference_id fields to None. i.e. AST v2 is now returned to virgin state like it would have been if it had come fresh out of the parser.~~
7. Run semantic on AST v2. Store the new ScopeTree and SymbolTable (Semantic v2).
8. Compare scope_id, symbol_id and reference_id fields between AST v1 and AST v2. They should match.
9. Compare ScopeTree + SymbolTable v1 and v2. They should also match.

i.e. AST v2 and ScopeTree + SymbolTable v2 are how they should be, since they come from a fresh semantic run. The check is that the state of both AST and Semantic after transform (v1) matches that.
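
A rough sketch of the clone / re-run / compare steps, using a toy AST rather than oxc's real types. Everything here (`ToyNode`, `cloneAst`, `runToySemantic`, `collectScopeIds`, `strictScopeIdMismatches`) is a hypothetical name made up for illustration, and it only models `scope_id`, not symbols or references:

```typescript
// Toy stand-in for an AST node; NOT oxc's real types.
interface ToyNode {
  kind: string;
  createsScope: boolean;
  scopeId: number | null; // plays the role of the `scope_id` field
  children: ToyNode[];
}

// Clone the post-transform AST (AST v1) to get AST v2.
function cloneAst(node: ToyNode): ToyNode {
  return { ...node, children: node.children.map(cloneAst) };
}

// Stand-in for a fresh semantic run: numbers scopes in visitation order,
// overwriting whatever IDs the nodes carried before.
function runToySemantic(root: ToyNode): void {
  let nextId = 0;
  const visit = (node: ToyNode): void => {
    node.scopeId = node.createsScope ? nextId++ : null;
    node.children.forEach(visit);
  };
  visit(root);
}

// Collect the scope ID of every node, in visitation order.
function collectScopeIds(root: ToyNode): (number | null)[] {
  const ids: (number | null)[] = [];
  const visit = (node: ToyNode): void => {
    ids.push(node.scopeId);
    node.children.forEach(visit);
  };
  visit(root);
  return ids;
}

// Strict comparison: returns the positions (in visitation order)
// where AST v1 and the freshly analysed AST v2 disagree.
function strictScopeIdMismatches(astV1: ToyNode): number[] {
  const astV2 = cloneAst(astV1);
  runToySemantic(astV2);
  const v1 = collectScopeIds(astV1);
  const v2 = collectScopeIds(astV2);
  const mismatches: number[] = [];
  for (let i = 0; i < v1.length; i++) {
    if (v1[i] !== v2[i]) mismatches.push(i);
  }
  return mismatches;
}
```

The comparison here is deliberately strict: any difference in an ID counts as a mismatch, and AST v1 itself is left untouched because the semantic re-run only mutates the clone.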

The complication

The tricky part is what constitutes "matches".

For example if this is the input:

if (x) enum Foo {}
function f() {}

The output of transformer is:

if (x) {}
function f() {}

The scope IDs are:

Before transform:

// Scope ID 0
if (x) enum Foo { /* Scope ID 1 */ }
function f() { /* Scope ID 2 */ }

After transform:

// Scope ID 0
if (x) { /* Scope ID 3 */ } // <-- newly created scope
function f() { /* Scope ID 2 */ }

vs fresh semantic run on post-transform AST:

// Scope ID 0
if (x) { /* Scope ID 1 */ } // <-- numbered 1 as it's 2nd scope found in visitation order
function f() { /* Scope ID 2 */ }

Scope IDs of the if {} block are different in the 2 versions, because in the post-transform version, that scope was newly created in transformer.

But we don't want to throw an error on this because they are equivalent.
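
One possible way to define "equivalent" (a sketch only, not something decided in this thread): collect the IDs from both ASTs in the same visitation order and accept them if they differ only by a consistent one-to-one renumbering. `sameUpToRenumbering` is a hypothetical helper name:

```typescript
// Returns true when two ID sequences, collected in the same visitation
// order, differ only by a consistent renumbering, i.e. there is a
// one-to-one mapping from the IDs in `a` to the IDs in `b`.
function sameUpToRenumbering(a: number[], b: number[]): boolean {
  if (a.length !== b.length) return false;
  const forward = new Map<number, number>();  // ID in `a` -> ID in `b`
  const backward = new Map<number, number>(); // ID in `b` -> ID in `a`
  for (let i = 0; i < a.length; i++) {
    const x = a[i];
    const y = b[i];
    const mappedX = forward.get(x);
    const mappedY = backward.get(y);
    if (mappedX === undefined && mappedY === undefined) {
      // First time we see either ID: record the pairing.
      forward.set(x, y);
      backward.set(y, x);
    } else if (mappedX !== y || mappedY !== x) {
      // One of the IDs was already paired with something else.
      return false;
    }
  }
  return true;
}
```

For the example above, `sameUpToRenumbering([0, 3, 2], [0, 1, 2])` is `true`, because 3 maps consistently to 1. Comparing the ScopeTree itself would also need the same mapping applied to parent and child scope IDs, which this sketch does not cover.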

The same kind of thing will likely happen with SymbolIds and ReferenceIds.

Conclusion

So this is all a bit of a pain! But, if we can build this test infra, it will give us extremely good test coverage and we can be very confident that transformer is keeping scopes/symbols in sync correctly.

Side note: I'd imagine running these checks as part of transformer conformance, so we get instant feedback on PRs if they foul up scopes.

Aug 08 '24 19:08 overlookmotel

I would be happy to take this on if you like, Boshen.

Aug 10 '24 11:08 overlookmotel

Just a note, we may need to check duplicate AstNodeId, ScopeId, and SymbolId in AST. related to https://github.com/oxc-project/oxc/issues/4809
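
A duplicate check could be as simple as the sketch below, assuming the IDs of one kind have already been collected from the AST into a flat list during a traversal. `findDuplicateIds` is a hypothetical name, not an existing oxc function:

```typescript
// Returns every ID that occurs more than once in `ids`,
// in the order in which the first repeat of each ID is found.
function findDuplicateIds(ids: number[]): number[] {
  const seen = new Set<number>();
  const duplicates = new Set<number>();
  for (const id of ids) {
    if (seen.has(id)) {
      duplicates.add(id);
    } else {
      seen.add(id);
    }
  }
  return Array.from(duplicates);
}
```

An empty result means every node carries a distinct ID of that kind; the same helper would be run separately for AstNodeId, ScopeId and SymbolId.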

Aug 10 '24 11:08 Dunqing

Borrowing the idea of the mangler, we can check whether symbols are the same in scope visitation order.
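
One reading of this idea, as a sketch (this is not how the mangler is actually implemented; `ToyScope` and both helper names are hypothetical): walk both scope trees in visitation order and compare what each scope declares, ignoring the numeric IDs entirely.

```typescript
// Toy scope tree; NOT oxc's ScopeTree.
interface ToyScope {
  id: number;         // ignored by the comparison
  bindings: string[]; // names of the symbols declared in this scope
  children: ToyScope[];
}

// One entry per scope, parent before children. Each entry records the
// nesting depth (so tree shape is preserved) and the sorted binding names
// (so declaration order within a scope is ignored, an assumption of this sketch).
function describeScopes(scope: ToyScope, depth: number = 0): string[] {
  const names = scope.bindings.slice().sort().join(",");
  const entries: string[] = [depth + ":" + names];
  for (const child of scope.children) {
    entries.push(...describeScopes(child, depth + 1));
  }
  return entries;
}

// Two scope trees match when they describe the same scopes, with the
// same bindings, in the same visitation order.
function sameSymbolsPerScope(a: ToyScope, b: ToyScope): boolean {
  const left = describeScopes(a);
  const right = describeScopes(b);
  if (left.length !== right.length) return false;
  for (let i = 0; i < left.length; i++) {
    if (left[i] !== right[i]) return false;
  }
  return true;
}
```

Because IDs never enter the comparison, a scope that was renumbered by the transformer still matches its freshly analysed counterpart, while a missing or extra binding in any scope is reported as a difference.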

Aug 12 '24 06:08 Boshen

Just a note, we may need to check duplicate AstNodeId, ScopeId, and SymbolId in AST. related to #4809

This feels more related to #4804

Aug 12 '24 10:08 rzvxa
