
No distinction between a function declaration and a variable declaration

Open MaxVerevkin opened this issue 4 years ago • 5 comments

The tree-sitter parser makes no distinction between a function declaration and this constructor call. It even thinks that x or n in my example are types. So this is probably a tree-sitter-cpp issue.

image

Originally posted by @theHamsta in https://github.com/nvim-treesitter/nvim-treesitter/issues/1625#issuecomment-886098101

MaxVerevkin avatar Jul 24 '21 19:07 MaxVerevkin

https://godbolt.org/z/PxfYhdEYn

Parsing variable and function declarations is highly context-dependent. When you see an identifier, it could be a variable, a function, or a type, and that affects how the code is parsed.

In your example (demonstrated in the godbolt link), if n were a type, it really would be a function declaration. To parse the code correctly you also have to keep track of every type and variable declared, including those pulled in through includes, so there seems to be no way for tree-sitter to get this right.

The last example is known as the most vexing parse: even though n is a value in scope, the int(n) in the parameters gets parsed as a function parameter, so that line is a function declaration again. I believe the rule (of thumb?) is that if it could be a function declaration, then it is. So perhaps that's why tree-sitter-cpp is parsing your example as a function (although that rule doesn't actually apply here).
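To make the three cases above concrete, here is a minimal sketch (not the code from the godbolt link; the function names are made up for illustration). The same tokens `vec(n)` declare a variable or a function depending on what `n` names, and `std::is_function` on `decltype(vec)` shows which one the compiler picked.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// n names a value, so vec is a variable holding 3 elements.
int variable_case() {
    int n = 3;
    std::vector<int> vec(n);
    return static_cast<int>(vec.size());
}

// n names a type, so the very same tokens declare a function
// `std::vector<int> vec(int);` at block scope.
bool function_case() {
    using n = int;
    std::vector<int> vec(n);
    return std::is_function<decltype(vec)>::value;
}

// Most vexing parse: n is a value in scope, but `int(n)` is read as a
// parameter named n with redundant parentheses, so vec is a function again.
bool most_vexing_parse() {
    double n = 2.5;
    (void)n;  // silence the unused-variable warning; the local n is never used
    std::vector<int> vec(int(n));
    return std::is_function<decltype(vec)>::value;
}

// Hypothetical helper that just checks the three cases.
void check_examples() {
    assert(variable_case() == 3);
    assert(function_case());
    assert(most_vexing_parse());
}
```

A compiler can tell these apart because it has a symbol table; a parser that only sees the tokens has nothing to go on for the first two cases.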

So there doesn't seem to be a way to get it right 100% of the time, but perhaps the precedence could be changed so that cases like this parse as a variable declaration. Local function declarations are exceedingly rare anyway.

IndianBoy42 avatar Jul 25 '21 03:07 IndianBoy42

Okay, I see. So, I suppose there are two solutions?

  • vector<int> vec(n); is a variable declaration if n is a (locally) known identifier. I'm not familiar with tree-sitter enough to tell whether it's possible.
  • vector<int> vec(n); is a function declaration if it's global and a variable declaration otherwise. Local variables are much more common than local function declarations.

MaxVerevkin avatar Jul 25 '21 06:07 MaxVerevkin

@IndianBoy42 is absolutely right. We could try to solve this by taking the context into account in our queries (not guessing a function declaration within function bodies), or just leave it the way it is. I guess even with locals this will be almost impossible to distinguish.

theHamsta avatar Jul 25 '21 08:07 theHamsta

This problem is actually undecidable, as described in this blog post. Not only would the parser have to keep track of existing (local and global) variables, but it would also have to perform template instantiation (i.e. arbitrary computation).
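A small sketch of why template instantiation gets involved (this is my own illustration, not taken from the blog post; `Holder`, `Pick` and the function names are hypothetical). Whether `Pick<...>::n` is a type or a value depends on which specialization gets selected, and that in turn decides whether `h` is a variable or a function.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class standing in for vector<int> in the original example.
struct Holder {
    int value;
    explicit Holder(int v) : value(v) {}
};

// Primary template: n is a value.
template <bool B>
struct Pick {
    static constexpr int n = 7;
};

// Specialization: n is a type.
template <>
struct Pick<true> {
    using n = int;
};

// Pick<false>::n is a value, so h is a variable initialized with 7.
int as_variable() {
    Holder h(Pick<false>::n);
    return h.value;
}

// Pick<true>::n is a type, so the same shape declares `Holder h(int);`.
bool as_function() {
    Holder h(Pick<true>::n);
    return std::is_function<decltype(h)>::value;
}

// Hypothetical helper that just checks both cases.
void check_pick() {
    assert(as_variable() == 7);
    assert(as_function());
}
```

Here the template argument is a plain `true` or `false`, but it could be any constant expression, which is where the "arbitrary computation" comes in.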

narpfel avatar Aug 03 '21 22:08 narpfel

Any updates on this? I would much prefer treesitter to parse them as variables, because when I'm using treesitter textobjects, "go to next function" makes me land on the variable declaration. And also, who would ever write a local function declaration 😭.

williamhCode avatar Nov 17 '23 06:11 williamhCode