tree-sitter-julia
Question: expand `for` and `struct` grammar with headers?
Did you check existing issues?
- [X] I have read all the tree-sitter docs if it relates to using the parser
- [X] I have searched the existing issues of tree-sitter-julia
Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)
No response
Describe the bug
This is not a bug, but more like a question/feature request.
I'm trying to update and fix the Julia queries for Neovim, and I'm having a very hard time with for loops and structs.
This Python example
for a in range(10):
    pass
is parsed as follows:
(module ; [0, 0] - [2, 0]
  (for_statement ; [0, 0] - [1, 8]
    left: (identifier) ; [0, 4] - [0, 5]
    right: (call ; [0, 9] - [0, 18]
      function: (identifier) ; [0, 9] - [0, 14]
      arguments: (argument_list ; [0, 14] - [0, 18]
        (integer))) ; [0, 15] - [0, 17]
    body: (block ; [1, 4] - [1, 8]
      (pass_statement)))) ; [1, 4] - [1, 8]
and this Julia example
for a in 1:10, b in 1:10
    print(a)
end
is parsed as
(source_file ; [0, 0] - [3, 0]
  (for_statement ; [0, 0] - [2, 3]
    (for_binding ; [0, 4] - [0, 13]
      (identifier) ; [0, 4] - [0, 5]
      (range_expression ; [0, 9] - [0, 13]
        (integer_literal) ; [0, 9] - [0, 10]
        (integer_literal))) ; [0, 11] - [0, 13]
    (for_binding ; [0, 15] - [0, 24]
      (identifier) ; [0, 15] - [0, 16]
      (range_expression ; [0, 20] - [0, 24]
        (integer_literal) ; [0, 20] - [0, 21]
        (integer_literal))) ; [0, 22] - [0, 24]
    (call_expression ; [1, 4] - [1, 12]
      (identifier) ; [1, 4] - [1, 9]
      (argument_list ; [1, 9] - [1, 12]
        (identifier))))) ; [1, 10] - [1, 11]
Because the two for_binding nodes are not grouped in any way and are siblings of the call_expression, I couldn't write a query that correctly selects the loop "header" (regardless of the number of variables iterated over), nor one that selects the body without the "header". This might be because I'm no expert in TS queries, but for Python such queries are really simple.
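To make the problem concrete, here is a minimal Python sketch of what a client currently has to do by hand: walk the flat child list and split it into a header span and a body span by node type. The `Child` record and `split_header_body` helper are hypothetical stand-ins for illustration, not part of tree-sitter or any editor API; the positions are copied from the Julia tree above.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Point = Tuple[int, int]  # (row, column), as printed in the trees above
Range = Tuple[Point, Point]


@dataclass(frozen=True)
class Child:
    """Hypothetical stand-in for a syntax node: its type and source range."""
    type: str
    start: Point
    end: Point


def span(nodes: List[Child]) -> Optional[Range]:
    """Range from the start of the first node to the end of the last one."""
    return (nodes[0].start, nodes[-1].end) if nodes else None


def split_header_body(children: List[Child], header_types: Set[str]):
    """Treat the leading run of header-typed children as the header, the rest as the body."""
    n_header = 0
    for child in children:
        if child.type not in header_types:
            break
        n_header += 1
    return span(children[:n_header]), span(children[n_header:])


# Named children of the for_statement from the Julia example.
for_children = [
    Child("for_binding", (0, 4), (0, 13)),
    Child("for_binding", (0, 15), (0, 24)),
    Child("call_expression", (1, 4), (1, 12)),
]
header, body = split_header_body(for_children, {"for_binding"})
print(header)  # ((0, 4), (0, 24))
print(body)    # ((1, 4), (1, 12))
```

With a visible block node (as in the Python grammar), both spans would be available directly from the tree and no such post-processing would be needed.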
The situation is similar with struct definitions:
struct A{B, C} <: D
    x
    y
end
is parsed as
(source_file ; [0, 0] - [4, 0]
  (struct_definition ; [0, 0] - [3, 3]
    name: (identifier) ; [0, 7] - [0, 8]
    (type_parameter_list ; [0, 8] - [0, 14]
      (identifier) ; [0, 9] - [0, 10]
      (identifier)) ; [0, 12] - [0, 13]
    (type_clause ; [0, 15] - [0, 19]
      (operator) ; [0, 15] - [0, 17]
      (identifier)) ; [0, 18] - [0, 19]
    (identifier) ; [1, 4] - [1, 5]
    (identifier))) ; [2, 4] - [2, 5]
Again, the struct header nodes type_parameter_list and type_clause are siblings of the struct body.
Is there a reason not to group struct and loop "headers" together, similarly to how Python is parsed?
In Python, if statements also provide a consequence child:
if True:
    pass
elif False:
    pass
else:
    pass

(module ; [0, 0] - [6, 0]
  (if_statement ; [0, 0] - [5, 8]
    condition: (true) ; [0, 3] - [0, 7]
    consequence: (block ; [1, 4] - [1, 8]
      (pass_statement)) ; [1, 4] - [1, 8]
    alternative: (elif_clause ; [2, 0] - [3, 8]
      condition: (false) ; [2, 5] - [2, 10]
      consequence: (block ; [3, 4] - [3, 8]
        (pass_statement))) ; [3, 4] - [3, 8]
    alternative: (else_clause ; [4, 0] - [5, 8]
      body: (block ; [5, 4] - [5, 8]
        (pass_statement))))) ; [5, 4] - [5, 8]
whereas in Julia all "consequence" lines are siblings of the condition:
if true
    1
    1
elseif false
    1
else
    1
end

(source_file ; [0, 0] - [8, 0]
  (if_statement ; [0, 0] - [7, 3]
    condition: (boolean_literal) ; [0, 3] - [0, 7]
    (integer_literal) ; [1, 4] - [1, 5]
    (integer_literal) ; [2, 4] - [2, 5]
    alternative: (elseif_clause ; [3, 0] - [5, 0]
      condition: (boolean_literal) ; [3, 7] - [3, 12]
      (integer_literal)) ; [4, 4] - [4, 5]
    alternative: (else_clause ; [5, 0] - [7, 0]
      (integer_literal)))) ; [6, 4] - [6, 5]
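In the Julia tree above, the only thing that distinguishes the consequence lines from their siblings is that they carry no field name. A rough Python sketch of recovering the implicit block on that basis follows; the `Child` record and `body_range` helper are hypothetical, used only for illustration, and the sketch assumes the unlabeled children are contiguous, as they are in this example.

```python
from typing import List, NamedTuple, Optional, Tuple

Point = Tuple[int, int]  # (row, column)


class Child(NamedTuple):
    """Hypothetical stand-in for a child node of if_statement."""
    field: Optional[str]  # "condition", "alternative", or None if unlabeled
    type: str
    start: Point
    end: Point


def body_range(children: List[Child]) -> Optional[Tuple[Point, Point]]:
    """Range covering the children that have no field name (the implicit block)."""
    body = [c for c in children if c.field is None]
    if not body:
        return None
    return body[0].start, body[-1].end


# Direct children of the if_statement from the Julia example.
if_children = [
    Child("condition", "boolean_literal", (0, 3), (0, 7)),
    Child(None, "integer_literal", (1, 4), (1, 5)),
    Child(None, "integer_literal", (2, 4), (2, 5)),
    Child("alternative", "elseif_clause", (3, 0), (5, 0)),
    Child("alternative", "else_clause", (5, 0), (7, 0)),
]
print(body_range(if_children))  # ((1, 4), (2, 5))
```

In the Python grammar the same range is simply the consequence child, so no filtering is required.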
There are two separate issues here, so I'll address them separately.
Querying inner blocks
The block rule used in the grammar is not visible (see #73). There's no technical limitation here, but making it visible is a breaking change that would require updating almost all tests.
Querying "headers"
If blocks were visible, querying headers would be really simple, since they're always "the thing before the block".
For now, I can only think of a couple of workarounds:
- if and while conditions are a single expression, so this would work: (if_statement . (_) @condition)
- for and let have their own rules for bindings, so this would work: (for_statement ((for_binding) ("," (for_binding))*) @bindings)
In the case of structs... The way they're currently parsed is awful. I took a much simpler approach for the lezer-julia grammar, and that should probably get ported here.
@savq thanks for the reply!
I prepared a PR: https://github.com/nvim-treesitter/nvim-treesitter-textobjects/pull/639. Any comments would be greatly appreciated!
The block rule used in the grammar is not visible (see https://github.com/tree-sitter/tree-sitter-julia/issues/73). There's no technical limitation here, but making it visible is a breaking change that would require updating almost all tests.
Yes, this would really help a lot. For ifs, conditions are easy to select because they are under the condition field, but selecting blocks is more difficult: it would have to rely on the matching algorithm, since elseif, for example, is a sibling of all the nodes in the block.
(for_statement ((for_binding) ("," (for_binding))*) @bindings)
I tested this, and it selects only one for_binding at a time, not all of them.
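If the query can only yield one for_binding per match, one possible fallback is to merge the matched ranges on the client side. The Python sketch below shows that idea under the assumption that each match is available as a (start, end) pair of (row, column) points; `covering_range` is a hypothetical helper, not an existing tree-sitter or Neovim function.

```python
from typing import Iterable, Optional, Tuple

Point = Tuple[int, int]  # (row, column)
Range = Tuple[Point, Point]


def covering_range(ranges: Iterable[Range]) -> Optional[Range]:
    """Smallest range that contains every given range, or None if there are none."""
    ranges = list(ranges)
    if not ranges:
        return None
    # Points are (row, column) tuples, so they compare in source order.
    start = min(r[0] for r in ranges)
    end = max(r[1] for r in ranges)
    return start, end


# The two for_binding ranges from the Julia example, one per match.
binding_matches = [((0, 4), (0, 13)), ((0, 15), (0, 24))]
print(covering_range(binding_matches))  # ((0, 4), (0, 24))
```

The merged range also covers the comma between the bindings, which is what a "header" selection would be expected to include.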