cursorless
Improve performance when using large files
With large files, e.g. a JSON file with 100,000 lines, our scopes that look at the entire file are quite slow. The two main cases are:
- Tree sitter scope handler
- Surrounding pair scope handler
The surrounding pair scope handler could be improved by parsing the file in chunks. So far I have focused on debugging the Tree sitter scope handler and its matches function. https://github.com/cursorless-dev/cursorless/blob/49f70840f0b11d4fca1744f88af1cd3d91cbd3d5/packages/cursorless-engine/src/languages/TreeSitterQuery/TreeSitterQuery.ts#L69-L131
I added a time log between each step in the return statement. Below is the result for "take key" in a 26MB JSON file.
TreeSitterQuery.matches: query.matches: 4.579s
TreeSitterQuery.matches: map1: 3.303s
TreeSitterQuery.matches: filter: 160.087ms
TreeSitterQuery.matches: map2: 4.269s
TreeSitterQuery.matches: 12.313s
Tree sitter generateScopeCandidates: 12.917s
collectionKey getContainingScopeTarget: 12.985s
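For anyone who wants to reproduce this kind of per-stage timing, here is a minimal sketch of how each step could be wrapped. The `timed` helper and the stand-in stages are hypothetical and not taken from the Cursorless code base; the real stages are the steps inside `TreeSitterQuery.matches`.

```typescript
// Hypothetical helper (not part of Cursorless): runs one stage,
// logs how long it took via console.time/timeEnd, and returns its result.
function timed<T>(label: string, stage: () => T): T {
  console.time(label);
  try {
    return stage();
  } finally {
    console.timeEnd(label);
  }
}

// Stand-in stages for illustration only.
const raw = timed("example: source", () =>
  Array.from({ length: 1000 }, (_, i) => i),
);
const mapped = timed("example: map1", () => raw.map((n) => n * 2));
const kept = timed("example: filter", () => mapped.filter((n) => n % 3 === 0));
```

Wrapping every stage in the same helper keeps the labels consistent, so the log lines can be compared directly between runs.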
It's clear that the Tree sitter query match takes some time. This could be improved by not checking all patterns/captures and instead focusing on the ones that actually match the requested scope type. It's also clear that our post processing in TypeScript is quite slow. I'm referring to the two map stages, which together account for about 7.6s of the 12.3s total.
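As a rough sketch of the post-processing side of this idea: skip captures that are not relevant to the requested scope type before doing any further work, and do it in a single pass instead of separate map/filter/map stages. All types and names below (`RawCapture`, `RawMatch`, `rangesForScope`) are hypothetical simplifications, not the actual Cursorless or Tree sitter types. The assumption that relevant captures are named either exactly like the scope type or with a `scopeType.` prefix is also mine. This does not reduce the time spent in `query.matches` itself.

```typescript
// Hypothetical, simplified shapes; the real match objects carry more data.
interface RawCapture {
  name: string;
  startIndex: number;
  endIndex: number;
}
interface RawMatch {
  patternIndex: number;
  captures: RawCapture[];
}
interface ScopeRange {
  start: number;
  end: number;
}

// Assumption: a capture is relevant if it is named exactly like the scope
// type or uses it as a dotted prefix (e.g. "collectionKey.domain").
function isRelevant(capture: RawCapture, scopeName: string): boolean {
  return capture.name === scopeName || capture.name.startsWith(scopeName + ".");
}

// Single pass: irrelevant captures are skipped before any further
// processing, and no intermediate arrays are allocated.
function rangesForScope(matches: RawMatch[], scopeName: string): ScopeRange[] {
  const result: ScopeRange[] = [];
  for (const match of matches) {
    for (const capture of match.captures) {
      if (isRelevant(capture, scopeName)) {
        result.push({ start: capture.startIndex, end: capture.endIndex });
      }
    }
  }
  return result;
}

const sample: RawMatch[] = [
  { patternIndex: 0, captures: [{ name: "collectionKey", startIndex: 2, endIndex: 7 }] },
  { patternIndex: 1, captures: [{ name: "value", startIndex: 9, endIndex: 12 }] },
  { patternIndex: 0, captures: [{ name: "collectionKey.domain", startIndex: 14, endIndex: 30 }] },
];
const keyRanges = rangesForScope(sample, "collectionKey");
```

With a 26MB file the intermediate arrays produced by each map stage are large, so avoiding them and discarding irrelevant captures early may help, but that would need to be measured.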