TreeSitter.jl icon indicating copy to clipboard operation
TreeSitter.jl copied to clipboard

Parametrize Nodes and add tree printing

Open kskyten opened this issue 4 years ago • 5 comments

This adds pretty printing from AbstractTrees and makes it possible to dispatch on nodes as they have distinct types now. Should we parametrize by the language as well? The tests break as they are currently dependent on the printing of the trees, but I wanted to get your thoughts first.

kskyten avatar Jan 07 '21 10:01 kskyten

Hi @kskyten thanks for the PR. I'm happy with leveraging AbstractTrees for printing.

Could you outline the use case for parameterising on node type? I assume it's for allowing multiple dispatch on nodes, eg

f(n::Node{:for}) = ...
f(n::Node{:while}) = ...
# etc...

rather than hard coding conditional branches? It would be more ergonomic perhaps, though may (possibly) cause some type-instability issues due to all the nodes having differing types and the potential associated runtime hit during traversal. Probably best to leave language parameter out of the equation for now until we are decided on whether node type parameterisation is the right thing to do.

(As you can probably tell by the lack of activity in this repo I've not had much time recently to work on it, but I'm fine with discussing things/answering questions about things.)

MichaelHatherly avatar Jan 07 '21 10:01 MichaelHatherly

Yes, parametrising is for multiple dispatch. I find multiple dispatch is a very nice way of dealing with ASTs.Good point about the type-instability. I hadn't thought of the performance implications. I should see how it is done in https://github.com/julia-vscode/CSTParser.jl.

kskyten avatar Jan 07 '21 11:01 kskyten

I hadn't thought of the performance implications. I should see how it is done in https://github.com/julia-vscode/CSTParser.jl.

I think CST used to use a EXPR{HEAD} implementation, or something similar. But from what's currently there it looks like there is no parameter now. Not 100% sure on its history though.

MichaelHatherly avatar Jan 07 '21 11:01 MichaelHatherly

It can have a noticeable impact on runtime if you're iterating over a collection of parameterised nodes where each parameter isn't statically known, which will result in dynamic dispatch, though sometimes if the parameter set is small and known Julia does some union splitting for us. I doubt that would be the case with Symbol parameters, but the only way to actually know is some benchmarking.

MichaelHatherly avatar Jan 07 '21 11:01 MichaelHatherly

For what it's worth, the parser which is now the default in Julia, JuliaSyntax.jl, does not parameterize nodes for performance reasons

From the documentation about the Kind struct (emphasis added)

Kind is a type tag for specifying the type of tokens and interior nodes of a syntax tree. Abstractly, this tag is used to define our own sum types for syntax tree nodes. We do this explicitly outside the Julia type system because (a) Julia doesn't have sum types and (b) we want concrete data structures which are unityped from the Julia compiler's point of view, for efficiency.

caleb-allen avatar Sep 25 '23 13:09 caleb-allen