nvim-treesitter
Indent logic migration
We would like to migrate away from the current indents module and move towards a better one, like Helix's or Zed's. This issue tracks research into which of the two would be the better choice.
Zed
Pros
- Very simple queries
- Seems to solve a lot of issues with our indents
- They are a funded project and will hopefully have fast query-level indent fixes that we can easily yoink
Cons
- Need to port ~400 lines of rust code
- Python, bash, and yaml have special regexes which help improve indents: https://github.com/zed-industries/zed/blob/9c11d24887a4d001a5fcf46911b269ea3abf93d7/crates/languages/src/python/config.toml#L31. Would we be able to emulate this as well? (I think the answer is yes...)
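To make the regex question concrete, here is a minimal sketch of what "emulating" such patterns could look like: one pattern that adds a level after a matching previous line and one that removes a level for a matching current line. The two patterns, `SHIFTWIDTH`, and the function names are assumptions made up for illustration; they are not copied from the linked config.toml, and this is not how Zed actually wires things up.

```python
import re

# Hypothetical patterns for a Python-like language (illustration only,
# NOT taken from Zed's config).
INCREASE = re.compile(r"^[^#]*:\s*$")  # previous line ends in a colon
DECREASE = re.compile(r"^\s*(else|elif|except|finally)\b.*:")

SHIFTWIDTH = 4  # assumed indent width


def leading_spaces(line: str) -> int:
    """Number of leading space characters on a line."""
    return len(line) - len(line.lstrip(" "))


def regex_indent(prev_line: str, cur_line: str) -> int:
    """Suggest an indent for cur_line using only the text of both lines."""
    indent = leading_spaces(prev_line)
    if INCREASE.match(prev_line):
        indent += SHIFTWIDTH
    if DECREASE.match(cur_line):
        indent -= SHIFTWIDTH
    return max(indent, 0)
```

Under these assumptions, `regex_indent("if x:", "")` suggests one extra level even though `if x:` alone is an incomplete statement, and `regex_indent("    pass", "else:")` pulls `else:` back to column 0.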
Issues which will be solved with this indent style
- https://github.com/nvim-treesitter/nvim-treesitter/issues/7234
- https://github.com/nvim-treesitter/nvim-treesitter/issues/7347
- https://github.com/nvim-treesitter/nvim-treesitter/issues/7317
- https://github.com/nvim-treesitter/nvim-treesitter/issues/7432
- https://github.com/nvim-treesitter/nvim-treesitter/issues/7385
- https://github.com/nvim-treesitter/nvim-treesitter/issues/7187
- https://github.com/nvim-treesitter/nvim-treesitter/issues/7010
(and probably many more but i am now bored of checking)
Issues which will be introduced with this indent style
- https://github.com/zed-industries/zed/issues/13924
- https://github.com/zed-industries/zed/issues/25584
- Though we also have a problem with this, our indents look like this:
function test() { const a = () => { let b = a } }
Helix
Pros
- Queries are very well documented
- Also fixes a lot of our indentation issues*
Cons
- Queries aren't that simple
- Also need to port I would guess 500+ lines of rust code (https://github.com/helix-editor/helix/blob/12139a4c30ad20d9a1b181de69532a57601cf96f/helix-core/src/indent.rs#L18, note the file is over 1000 LOC but some of that is tests or other helix-specific stuff)
- *Helix does not have a way to correct indentation for multiple lines, like vim's `=` (or zed's vim mode `=`). Thus it only computes indent diffs between current and new lines, and applies indentation accordingly. As a result, it seems to fix a lot of our indentation issues, but I worry that it doesn't cover indentation of multiple lines nicely (because it doesn't need to worry about that use case)
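The difference between the two modes can be sketched with a toy model. Brace counting stands in for the real query engine here, and `depth_change`, `relative_indent`, `absolute_reindent`, and `SHIFTWIDTH` are all hypothetical names; none of this is Helix's actual code.

```python
SHIFTWIDTH = 4  # assumed indent width


def depth_change(line: str) -> int:
    """Toy stand-in for a query engine: net number of braces opened on a line."""
    return line.count("{") - line.count("}")


def relative_indent(current_line: str) -> int:
    """Diff-based: indent for a new line opened directly below current_line."""
    current = len(current_line) - len(current_line.lstrip(" "))
    return max(current + depth_change(current_line) * SHIFTWIDTH, 0)


def absolute_reindent(lines: list[str]) -> list[str]:
    """Re-indent a whole range from scratch, as an `=`-style operator would."""
    out, depth = [], 0
    for line in lines:
        text = line.strip()
        # A line that starts with a closer is dedented before it is emitted.
        level = depth - 1 if text.startswith("}") else depth
        out.append(" " * (max(level, 0) * SHIFTWIDTH) + text)
        depth += depth_change(text)
    return out
```

In this model `relative_indent("  foo {")` returns 6: it trusts the existing (possibly wrong) two-space base and only adds the difference. `absolute_reindent(["  foo {", "bar", "  }"])` ignores existing whitespace and rebuilds every level, which is the behavior a multi-line `=` needs.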
Plan
- Add new (assuming for the sake of example Zed) indentation logic as new `zindents.lua`, which uses `zindents.scm` queries.
- Expose a separate `require'nvim-treesitter'.zindentexpr()` (or `_indentexpr`) to opt in to the new behavior (where new queries exist).
- Once the new logic is stable and enough new queries are available, rename `indent*` to `oldindent*` and `zindent*` to `zindent`.
- In the next release, remove `oldindent*`.
> Need to port ~400 lines of rust code
That is not infeasible; the big question is whether that code can be made to work efficiently through Vim's indentexpr API (which just takes a line number and returns the indent value), or whether it needs deeper integration (which may also be possible but would require doing this in core directly).
> Queries aren't that simple
Pro: Queries are very well documented ;)
After looking through the code, I think this definitely can be done with our indentexpr.
> Python, bash, and yaml have special regexes which help improve indents: https://github.com/zed-industries/zed/blob/9c11d24887a4d001a5fcf46911b269ea3abf93d7/crates/languages/src/python/config.toml#L31. Would we be able to emulate this as well? (I think the answer is yes...)
Actually, that is kind of a dealbreaker (and cheating 😠 ). At that point, you're not doing tree-sitter indentation but a hybrid scheme -- which brings us back to Nvim's regex based syntax.
(This is only three languages, but Zed doesn't support that many to begin with; for us, the total number that require this may end up being significant.)
It would be better if this could be tackled at the level of queries (with additional captures, or a special directive).
> Actually, that is kind of a dealbreaker
Why? I think the issue with relying fully on queries is that, as you type, the tree is not always valid because you are in the middle of editing, but you'd still want proper indents as you type, ofc. I think this could be avoided if tree-sitter's error recovery were smarter, but it has a lot of shortcomings right now.
Then it might be the wrong tool for the job, and we should just stick with Vimdents.
Unlike Zed, we already have a working indentation system, so adding tree-sitter indentation is only worth it if there's a significant benefit. At the highest level, this would be no longer having to rely on custom hacks and having a fully declarative query for every language.
I would say there is significant benefit; with vimdents you get a ton of weird cases, e.g. where a `{` in a string causes an indent. TS eliminates this class of issues, and we should allow ourselves some slack to make it even better for "during-typing" indents.
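As a toy illustration of that class of bug (not a reproduction of any particular Vim indent script, and not tree-sitter itself): a purely text-based rule counts every brace, while anything that knows where strings start and end can skip them. Both function names are made up for this sketch, and the hand-written scanner only understands double-quoted strings with backslash escapes.

```python
def naive_opens(line: str) -> int:
    """Text-based heuristic: every brace counts, even inside a string."""
    return line.count("{") - line.count("}")


def syntax_aware_opens(line: str) -> int:
    """Toy scanner that ignores braces inside double-quoted strings."""
    depth, in_string, escaped = 0, False, False
    for ch in line:
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
    return depth
```

For `let s = "{";` the naive count reports one opened block and the string-aware count reports none; for `if (x) {` both agree on one. A real parser gives the second behavior for every language construct, not just for strings.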
That said, we could probably move these special hacks into queries with `#lua-match?` (plus a thorough test that the query always captures those weird cases).
I'm pragmatic about this, but I do not want language-specific "hacks" in the generic indent.lua module. Hence my comment about handling this with special captures or predicates (or metadata). It's about clear separation of code from data.
Does my last comment satisfy that?
Of course, since it doesn't even require special captures or predicates.
If Zed's indent logic is adapted to nvim-treesitter, would it address https://github.com/nvim-treesitter/nvim-treesitter/issues/6486 too? I think the current indent implementation works against async parsing?
It wouldn't be the same as #6486, but I think it would be much more performant than the current indentexpr, so it would basically close that issue. Async parsing is a separate thing, though. Zed's indentexpr is async from what I can tell, but I don't know how easily this can be done on Vim's side.