tree-sitter-cpp
No distinction between a function declaration and a variable declaration
The tree-sitter parser makes no distinction between a function declaration and this constructor call.
It even thinks that x or n in my example are types. So this is probably a tree-sitter-cpp issue.

Originally posted by @theHamsta in https://github.com/nvim-treesitter/nvim-treesitter/issues/1625#issuecomment-886098101
https://godbolt.org/z/PxfYhdEYn
Parsing variable and function declarations is highly context-dependent. An identifier could name a variable, a function, or a type, and which one it is affects how the surrounding code parses.
In your example (demonstrated in the godbolt link), if n were a type it really would be a function declaration. To parse the code correctly you would have to keep track of every type and variable declared, including those pulled in through includes, so there seems to be no way for tree-sitter to get this right.
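To make the dependence on n concrete, here is a minimal sketch (not the code from the godbolt link; the helper names `vec_is_variable` and `vec_is_function` are made up for illustration). The same tokens `vec(n)` produce a variable when n is a value and a function declaration when n is a type:

```cpp
#include <type_traits>
#include <vector>

// Case 1: `n` names a value, so `vec(n)` is a constructor call.
inline bool vec_is_variable() {
    int n = 3;
    std::vector<int> vec(n);  // variable: a vector holding 3 ints
    return std::is_same<decltype(vec), std::vector<int>>::value &&
           vec.size() == 3;
}

// Case 2: `n` names a type, so the same tokens declare a function.
inline bool vec_is_function() {
    using n = int;            // hypothetical alias, only here to make `n` a type
    std::vector<int> vec(n);  // function declaration: std::vector<int> vec(int);
    return std::is_function<decltype(vec)>::value;
}
```

Both helpers should return true. A parser that does not know what n refers to sees identical token sequences in the two function bodies.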
The last example is known as the most vexing parse: even though n is a value in scope, the int(n) in the parentheses gets parsed as a function parameter, so that line is a function declaration again. I believe the rule (of thumb?) is that if it could be a function declaration then it is. So perhaps that's why tree-sitter-cpp is parsing your example as a function (although that rule doesn't actually apply here).
So there doesn't seem to be a way to get it right 100% of the time, but perhaps the precedence could be changed to make it parse as a variable declaration in cases like this. Local function declarations are exceedingly rare anyway.
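A small sketch of the most vexing parse, assuming n is a double in scope (the helper names are hypothetical and this is not the exact godbolt code). The first declaration is a function even though n is a value; wrapping the argument in an extra pair of parentheses forces it to be read as an expression:

```cpp
#include <type_traits>
#include <vector>

inline bool vexing_parse_is_function() {
    double n = 3.0;
    // Parsed as a function taking an int parameter that happens to be named n.
    std::vector<int> vec(int(n));
    (void)n;  // the outer n is otherwise unused
    return std::is_function<decltype(vec)>::value;
}

inline bool extra_parens_is_variable() {
    double n = 3.0;
    // (int(n)) cannot be a parameter declaration, so this is a constructor call.
    std::vector<int> vec((int(n)));
    return vec.size() == 3;
}
```

Compilers typically warn about the first form (for example clang's -Wvexing-parse), which hints at how rarely a function declaration is what the author meant.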
Okay, I see. So, I suppose there are two solutions?
1. vector<int> vec(n); is a variable declaration if n is a (locally) known identifier. I'm not familiar enough with tree-sitter to tell whether that's possible.
2. vector<int> vec(n); is a function declaration if it's global and a variable declaration otherwise. Local variables are much more common than local function declarations.
@IndianBoy42 is absolutely right. We could try to solve this by taking the context into account in our queries (not guessing a function declaration inside function bodies), or just leave it the way it is. I guess even with locals this will be almost impossible to distinguish.
This problem is actually undecidable, as described in this blog post. Not only would the parser have to keep track of existing (local and global) variables, but it would also have to perform template instantiation (i.e. arbitrary computation).
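A sketch of why template instantiation matters here. This is an illustration of my own, not taken from the blog post; `Pick`, `is_small`, and the two helper functions are hypothetical names. Whether `Pick<...>::n` is a type or a value depends on the result of a compile-time computation, and that in turn decides whether `vec(...)` is a function declaration or a variable:

```cpp
#include <type_traits>
#include <vector>

// Primary template: n is a type.
template <bool B> struct Pick { using n = int; };
// Specialization for true: n is a value.
template <> struct Pick<true> { static constexpr int n = 3; };

// Stand-in for an arbitrary compile-time computation.
constexpr bool is_small(int k) { return k * k < 50; }

inline bool seven_gives_variable() {
    // is_small(7) is true, so n is a value and this is a constructor call.
    std::vector<int> vec(Pick<is_small(7)>::n);
    return vec.size() == 3;
}

inline bool eight_gives_function() {
    // is_small(8) is false, so n is a type and this declares a function.
    std::vector<int> vec(Pick<is_small(8)>::n);
    return std::is_function<decltype(vec)>::value;
}
```

The two declarations differ only in a constant, so a parser would have to evaluate `is_small` and instantiate `Pick` to classify them correctly.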
Any updates on this? I would much prefer tree-sitter to parse these as variables, because when I'm using treesitter textobjects, "go to next function" lands me on the variable declaration. And also, who would ever write a local function declaration 😭.