tree-sitter-dart

Long parse time for `for_statement`

Open genesistms opened this issue 3 years ago • 10 comments

I have a problem in Neovim where opening a Dart file takes a very long time. After some debugging, I found that the problem is parsing the Dart file with this query:

(for_statement (block))

But this one is ok:

(for_statement)

It does not matter what is inside the Dart file; it can even be empty.

> tree-sitter query bad.scm test.dart   
3.59s user 0.00s system 99% cpu 3.596 total
> tree-sitter query ok.scm test.dart  
0.00s user 0.00s system 95% cpu 0.007 total

tree-sitter 0.20.1

genesistms avatar Dec 07 '21 06:12 genesistms
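To make the comparison above easier to repeat across tree-sitter versions, here is a minimal timing sketch. It assumes the `tree-sitter` CLI is on `PATH` and that `bad.scm`, `ok.scm`, and `test.dart` sit in the current directory, as in the report. The helper names `time_command` and `compare_queries` are made up for illustration.

```python
import subprocess
import time


def time_command(argv, runs=3):
    """Return the best wall-clock time in seconds over `runs` executions of argv."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        best = min(best, time.perf_counter() - start)
    return best


def compare_queries(source="test.dart", queries=("bad.scm", "ok.scm")):
    """Time `tree-sitter query <query> <source>` for each query file.

    Assumption: the `tree-sitter` CLI is installed and run from the grammar
    directory, matching the invocation shown in the report.
    """
    return {q: time_command(["tree-sitter", "query", q, source]) for q in queries}
```

Calling `compare_queries()` returns a dict mapping each query file to its best time. Taking the minimum over a few runs reduces noise from process startup, which matters when one of the timings is only a few milliseconds.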

@maxbrunsfeld we're experiencing a slowdown when loading certain queries for various parsers with 0.20.1 compared to 0.20.0. What would be the best steps to investigate what's going on?

theHamsta avatar Dec 07 '21 06:12 theHamsta

Does tree-sitter have some kind of debug flag to get any additional info?

genesistms avatar Dec 07 '21 07:12 genesistms

You can try building a release build with debug symbols and running a sampling profiler on it, but that will only report on a per-function basis. You can use NVTX (https://docs.nvidia.com/gameworks/content/gameworkslibrary/nvtx/nvidia_tools_extension_library_nvtx.htm) to annotate specific sections of the code when you profile with Nsight Systems (an Nvidia profiling tool; no Nvidia GPU required). I'm sure other profilers provide similar annotation tools.

theHamsta avatar Dec 07 '21 08:12 theHamsta

@maxxnino @connorlay this is the simplest reproducer for this problem that we've had so far

theHamsta avatar Dec 07 '21 10:12 theHamsta

Thanks for the report; this is definitely a problem, and a good reproduction case.

I don't think the fix is very complex, but I may need a few days before I have the focused time to work on it.

maxbrunsfeld avatar Dec 07 '21 17:12 maxbrunsfeld

As a side question @theHamsta: Neovim should really only be loading these queries once per language - is that already the case, or is the query loading currently happening repeatedly?

maxbrunsfeld avatar Dec 07 '21 18:12 maxbrunsfeld

We discovered that the highlight/injection queries are indeed being loaded repeatedly (indent, folding, and textobject queries are cached until invalidated by a file change, and I thought that someone had also fixed this in core). I plan to fix this over the weekend.

theHamsta avatar Dec 07 '21 18:12 theHamsta
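The "load once per language, cache until invalidated" behavior discussed above can be sketched as follows. This is an illustration only, not Neovim's actual implementation (which is written in Lua). The `QueryCache` class and the `compile_fn` callback are hypothetical names standing in for whatever performs the expensive query compilation.

```python
class QueryCache:
    """Cache compiled queries per (language, query name) pair.

    `compile_fn` is a hypothetical stand-in for the expensive step
    (compiling the query source for a given language).
    """

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._cache = {}

    def get(self, language, name, source):
        """Return the compiled query, compiling only on the first request."""
        key = (language, name)
        if key not in self._cache:
            self._cache[key] = self._compile(language, source)
        return self._cache[key]

    def invalidate(self, language, name):
        """Drop a cached query, e.g. after its query file changed on disk."""
        self._cache.pop((language, name), None)
```

With a cache like this, a slow query such as `(for_statement (block))` is compiled once per language rather than on every buffer load, so the cost is paid a single time until the query file changes.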

In the past, when I built the highlights query, it also got slower and slower as I nested more nodes inside a node. At that time I was using tree-sitter 0.20.0. My parser uses recursion a lot, too. Maybe that causes hanging on some machines. I just don't know 😩

maxxnino avatar Dec 08 '21 22:12 maxxnino

I think this may be fixed by the latest Tree-sitter commit: https://github.com/tree-sitter/tree-sitter/commit/25f64e1eb66bb1ab3eccd4f0b7da543005f3ba79. Could someone build Neovim with this commit of Tree-sitter and take it for a spin?

maxbrunsfeld avatar Dec 10 '21 06:12 maxbrunsfeld

Update

> tree-sitter query bad.scm test.dart   
0.53s user 0.00s system 99% cpu 0.536 total
> tree-sitter query ok.scm test.dart  
0.00s user 0.00s system 98% cpu 0.007 total

It still takes over half a second on an empty file.

tree-sitter 0.20.6

genesistms avatar May 31 '22 04:05 genesistms

Note that there is some startup time, which is probably what you are seeing. The parser is currently able to query all of the Flutter repo under the packages folder within 17 seconds, which I think is not bad considering how many files, tests, and packages it has. Additionally, I'm not seeing any difference in the time it takes for the 'good' query and the 'bad' one. Both show only .01s for me.

Note that I'm working on reducing the number of conflicts in the grammar rules, which should help performance, but it isn't exactly easy to find specific problems. Closing this for now, since I'm not seeing the reported difference between these two query files.

TimWhiting avatar May 23 '23 03:05 TimWhiting