ast-types icon indicating copy to clipboard operation
ast-types copied to clipboard

[WorkInProgress] NodeVisitor for quick and simple traversals

Open drslump opened this issue 10 years ago • 5 comments

Derived from the discussion on #80 it seemed clear to me that even if optimising the PathVisitor is something to look forward to, some use cases do not even need to perform transformations. For instance, the es6 arrow function transform, needs to check if this is somewhere inside the function body. To perform that traversal there is no need to use the fully featured visitor, instead the simple and fast NodeVisitor can be used.

Note that there is no support for context or replacement in this visitor, it's intended to be used primarily for look ups.

Here are the results from the test/perf.js file:

PathVisitor took 144ms (name count: 6295)
NodeVisitor took 17ms (name count: 6295)

The usage is pretty simple, API compatible with PathVisitor but working with plain nodes instead of NodePath. Just import the visitor and feed it an actual node:

var NodeVisitor = require('ast-types').NodeVisitor;

NodeVisitor.visit(ast, {
  visitIdentifier: function (node) {
    console.log(node.type, node.name);
  }
});

I'll include some unit tests if you give the go ahead to including this :)

  • [ ]  General Comment
  • <img border=0 src='https://avatars.githubusercontent.com/u/1377?v=3' height=16 width=16'> @benjamn could some feedback be givin?
  • <img border=0 src='https://avatars.githubusercontent.com/u/5750?v=3' height=16 width=16'> This is really promising and makes a ton of sense (of course there should be a NodeVisitor and not just a PathVisitor… the opportunity to speed up simple transforms is so obvious in hindsight). I haven't had a chance to review the changes in detail, but you have my full support with this. I hope to be able to give this more attention Saturday, when my parents leave New York :) Is there anything I can do in the meantime to facilitate progress here?
  • <img border=0 src='https://avatars.githubusercontent.com/u/110784?v=3' height=16 width=16'> Thanks for the review, I've been busy on other projects and almost forgot about this one :) I'll try to find time to update the PR with a really fast NodeVisitor that supports replacements and NodePath navigation/transformations (but not synched). It's based on the idea we discussed about using a simple stack to hold the current state of the traversal. The problem is that I started to get crazy with dynamic compilation of the visiting algorithm, using optimized code generation to build switch cases (new Function) and saw quite an improvement there, so a simple task turned into a much harder one :) Anyway, I'll update this one in the meantime and we can explore the code generation approach at a later point, since it'll require some changes in the type definitions data structure (right now the ORs capture the definition in a closure).
  • <img border=0 src='https://avatars.githubusercontent.com/u/5750?v=3' height=16 width=16'> I'm wondering about the appropriate interface for exposing these two different kinds of visitors, and it occurs to me that we could just use PathVisitor if you pass a NodePath to require("ast-types").visit, and NodeVisitor if you pass a Node. That would be a breaking API change for sure, but it would encourage better practices in both cases: client code should really be creating its own NodePath objects to take advantage of caching to reduce NodePath allocation, and obviously using a plain NodeVisitor will be much faster if you can get away with it. You could imagine esnext being implemented with a mixture of transforms that expect either a NodePath or a Node. It could get tricky if NodeVisitors are able to modify the AST without notifying existing NodePath objects, but we do check values as we traverse with NodePath information (see here), so that might not actually be such a problem.
  • <img border=0 src='https://avatars.githubusercontent.com/u/110784?v=3' height=16 width=16'> Cool, I like the idea of using the type of the argument to choose the visitor. From what I've seen in the code the NodePath cache should detect any changes in the structure so I'm with you on it probably not being a problem. The only issue I see with the current visit(root, visitor) API is that since it allows to pass an object literal it's very tempting to do so, although it's a performance hit, specially if we introduce dynamic compilation. Perhaps that call should only accept instances of a Visitor, so the call to the factory would be explicit. Another option would be to detect support for ES6's WeakMap to memoize the visitor factory (or find a simple/fast way to emulate the weakmap behaviour for this use case).
  • [ ]  File: lib/path-visitor.js:L33-45
  • <img border=0 src='https://avatars.githubusercontent.com/u/5750?v=3' height=16 width=16'> I don't want this code to get lost if .reset is overridden in a naive way (without calling the original method). But I can appreciate that PathVisitor is now a subclass of Visitor, which defines the original empty stub, so maybe this is the cleanest way to handle resetting this._changeReported (and subclasses just have to be careful).
  • [ ]  File: lib/node-visitor.js:L1-57
  • <img border=0 src='https://avatars.githubusercontent.com/u/5750?v=3' height=16 width=16'> The return-to-replace use case might not be all that hard to support efficiently, though I don't mind suggesting PathVisitor instead.

  

drslump avatar Dec 14 '14 22:12 drslump

@benjamn could some feedback be givin?

stefanpenner avatar Dec 21 '14 15:12 stefanpenner

This is really promising and makes a ton of sense (of course there should be a NodeVisitor and not just a PathVisitor… the opportunity to speed up simple transforms is so obvious in hindsight).

I haven't had a chance to review the changes in detail, but you have my full support with this. I hope to be able to give this more attention Saturday, when my parents leave New York :)

Is there anything I can do in the meantime to facilitate progress here?

benjamn avatar Dec 23 '14 23:12 benjamn

Thanks for the review, I've been busy on other projects and almost forgot about this one :)

I'll try to find time to update the PR with a really fast NodeVisitor that supports replacements and NodePath navigation/transformations (but not synched). It's based on the idea we discussed about using a simple stack to hold the current state of the traversal.

The problem is that I started to get crazy with dynamic compilation of the visiting algorithm, using optimized code generation to build switch cases (new Function) and saw quite an improvement there, so a simple task turned into a much harder one :)

Anyway, I'll update this one in the meantime and we can explore the code generation approach at a later point, since it'll require some changes in the type definitions data structure (right now the ORs capture the definition in a closure).

drslump avatar Dec 24 '14 00:12 drslump

I'm wondering about the appropriate interface for exposing these two different kinds of visitors, and it occurs to me that we could just use PathVisitor if you pass a NodePath to require("ast-types").visit, and NodeVisitor if you pass a Node.

That would be a breaking API change for sure, but it would encourage better practices in both cases: client code should really be creating its own NodePath objects to take advantage of caching to reduce NodePath allocation, and obviously using a plain NodeVisitor will be much faster if you can get away with it.

You could imagine esnext being implemented with a mixture of transforms that expect either a NodePath or a Node. It could get tricky if NodeVisitors are able to modify the AST without notifying existing NodePath objects, but we do check values as we traverse with NodePath information (see here), so that might not actually be such a problem.

benjamn avatar Dec 24 '14 00:12 benjamn

Cool, I like the idea of using the type of the argument to choose the visitor. From what I've seen in the code the NodePath cache should detect any changes in the structure so I'm with you on it probably not being a problem.

The only issue I see with the current visit(root, visitor) API is that since it allows to pass an object literal it's very tempting to do so, although it's a performance hit, specially if we introduce dynamic compilation. Perhaps that call should only accept instances of a Visitor, so the call to the factory would be explicit. Another option would be to detect support for ES6's WeakMap to memoize the visitor factory (or find a simple/fast way to emulate the weakmap behaviour for this use case).

drslump avatar Dec 24 '14 00:12 drslump