Indent logic migration

Open ribru17 opened this issue 6 months ago • 11 comments

We would like to migrate away from the current indents module and move towards a better one, like Helix's or Zed's. This issue will track research to determine which of the two is the better choice.

Zed

Pros

  • Very simple queries
  • Seems to solve a lot of issues with our indents
  • They are a funded project and will hopefully have fast query-level indent fixes that we can easily yoink

Cons

  • Need to port ~400 lines of rust code
  • Python, bash, and yaml have special regexes which help improve indents: https://github.com/zed-industries/zed/blob/9c11d24887a4d001a5fcf46911b269ea3abf93d7/crates/languages/src/python/config.toml#L31. Would we be able to emulate this as well? (I think the answer is yes...)

Issues which will be solved with this indent style

  • https://github.com/nvim-treesitter/nvim-treesitter/issues/7234
  • https://github.com/nvim-treesitter/nvim-treesitter/issues/7347
  • https://github.com/nvim-treesitter/nvim-treesitter/issues/7317
  • https://github.com/nvim-treesitter/nvim-treesitter/issues/7432
  • https://github.com/nvim-treesitter/nvim-treesitter/issues/7385
  • https://github.com/nvim-treesitter/nvim-treesitter/issues/7187
  • https://github.com/nvim-treesitter/nvim-treesitter/issues/7010

(and probably many more, but I am now bored of checking)

Issues which will be introduced with this indent style

  • https://github.com/zed-industries/zed/issues/13924
  • https://github.com/zed-industries/zed/issues/25584
    • Though we also have a problem with this, our indents look like this:
      function test() {
        const a = () =>
      {
          let b = a
        }
      }
      

Helix

Pros

  • Queries are very well documented
  • Also fixes a lot of our indentation issues*

Cons

  • Queries aren't that simple
  • Also need to port (I would guess) 500+ lines of Rust code (https://github.com/helix-editor/helix/blob/12139a4c30ad20d9a1b181de69532a57601cf96f/helix-core/src/indent.rs#L18; note the file is over 1000 LOC, but some of that is tests or other Helix-specific stuff)
  • *Helix does not have a way to correct indentation for multiple lines, like vim's = (or zed's vim mode =). Thus it only computes indent diffs between current and new lines, and applies indentation accordingly. As a result, it seems to fix a lot of our indentation issues, but I worry that it doesn't cover indentation of multiple lines nicely (because it doesn't need to worry about that use case)

Plan

  1. Add the new indentation logic (assuming Zed's, for the sake of example) as a new zindents.lua, which uses zindents.scm queries.
  2. Expose a separate require'nvim-treesitter'.zindentexpr() (or _indentexpr) to opt in to the new behavior (where new queries exist).
  3. Once the new logic is stable and enough new queries are available, rename indent* to oldindent* and zindent* to indent*.
  4. In the next release, remove oldindent*.

ribru17 avatar May 03 '25 21:05 ribru17

Need to port ~400 lines of rust code

That is not infeasible; the big question is whether that code can be made to work efficiently through Vim's indentexpr API (which just takes a line number and returns the indent value), or whether it needs deeper integration (which may also be possible but would require doing this in core directly).
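To illustrate the constraint, here is a rough Python sketch (not nvim-treesitter or Zed code) of an indentexpr-shaped entry point: it receives only a 1-based line number and returns an indent width. All names here (`toy_indents`, `indentexpr`, `reindent`, `SHIFT_WIDTH`) are hypothetical, and the bracket-counting pass is only a stand-in for a real tree-sitter query pass. The sketch assumes a multi-line re-indent amounts to one such call per line.

```python
from typing import List

SHIFT_WIDTH = 2  # assumed indent width, for illustration only


def toy_indents(lines: List[str]) -> List[int]:
    """Stand-in for a real indent pass: indent each line by bracket depth."""
    depth = 0
    result = []
    for line in lines:
        stripped = line.strip()
        # A line that starts with closing brackets is dedented with them.
        leading_closers = len(stripped) - len(stripped.lstrip("}])"))
        result.append(max(depth - leading_closers, 0) * SHIFT_WIDTH)
        depth += sum(stripped.count(c) for c in "{[(")
        depth -= sum(stripped.count(c) for c in "}])")
        depth = max(depth, 0)
    return result


def indentexpr(lines: List[str], lnum: int) -> int:
    """Hypothetical indentexpr-shaped API: line number in, indent width out."""
    # The whole buffer is re-scanned on every call in this naive version.
    return toy_indents(lines)[lnum - 1]


def reindent(lines: List[str], first: int, last: int) -> List[str]:
    """Re-indent lines first..last (1-based, inclusive), one call per line."""
    out = list(lines)
    for lnum in range(first, last + 1):
        width = indentexpr(out, lnum)
        out[lnum - 1] = " " * width + out[lnum - 1].lstrip()
    return out
```

Because every call here redoes the full pass, re-indenting N lines costs N full passes. That is the efficiency question in miniature: a port would need each per-line call to be cheap on its own.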

Queries aren't that simple

Pro: Queries are very well documented ;)

clason avatar May 04 '25 08:05 clason

After looking through the code I think this definitely can be done with our indentexpr

ribru17 avatar May 04 '25 21:05 ribru17

Python, bash, and yaml have special regexes which help improve indents: https://github.com/zed-industries/zed/blob/9c11d24887a4d001a5fcf46911b269ea3abf93d7/crates/languages/src/python/config.toml#L31. Would we be able to emulate this as well? (I think the answer is yes...)

Actually, that is kind of a dealbreaker (and cheating 😠 ). At that point, you're not doing tree-sitter indentation but a hybrid scheme -- which brings us back to Nvim's regex based syntax.

(This is only three languages, but Zed doesn't support that many to begin with; for us, the total number that require this may end up being significant.)

It would be better if this could be tackled at the level of queries (with additional captures, or a special directive).

clason avatar May 16 '25 15:05 clason

Actually, that is kind of a dealbreaker

Why? I think the issue with relying on queries fully is that as you type, the tree is not always valid because you are in the middle of editing, but you'd still want proper indents as you type, ofc. I think this could be avoided if ts error recovery was smarter, but it has a lot of shortcomings now.

ribru17 avatar May 16 '25 15:05 ribru17

Then it might be the wrong tool for the job, and we should just stick with Vimdents.

Unlike Zed, we already have a working indentation system, so adding tree-sitter indentation is only worth it if there's a significant benefit. At the highest level, this would be no longer having to rely on custom hacks and having a fully declarative query for every language.

clason avatar May 16 '25 15:05 clason

I would say there is significant benefit; with vimdents you got a ton of weird cases e.g. where { in a string causes an indent. TS eliminates this class of issues, and we should allow ourselves some slack to make it even better for "during-typing" indents
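To make the `{`-in-a-string case concrete, here is a minimal Python sketch (not the actual Vim indent scripts or tree-sitter). `naive_opens` mimics a text-matching heuristic that treats every brace as a block delimiter. `syntax_aware_opens` is a hypothetical stand-in for a parser: a tiny scanner that knows when it is inside a string literal, which is the kind of information a syntax tree provides.

```python
def naive_opens(line: str) -> int:
    """Text-matching heuristic: every '{' opens a block, every '}' closes one."""
    return line.count("{") - line.count("}")


def syntax_aware_opens(line: str) -> int:
    """Count braces only outside string literals (toy stand-in for a parser)."""
    depth = 0
    quote = None      # the quote character of the string we are inside, if any
    escaped = False   # True right after a backslash inside a string
    for ch in line:
        if quote:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
    return depth
```

For a line like `let s = "{";`, the naive count reports one unclosed block while the string-aware count reports none. For `if (x) {`, both agree.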

ribru17 avatar May 16 '25 15:05 ribru17

That said, we could probably move these special hacks into queries with #lua-match? (and a thorough test that the query always captures those weird cases)

ribru17 avatar May 16 '25 15:05 ribru17

I'm pragmatic about this, but I do not want language-specific "hacks" in the generic indent.lua module. Hence my comment about handling this with special captures or predicates (or metadata). It's about clear separation of code from data.

clason avatar May 16 '25 15:05 clason

Does my last comment satisfy that?

ribru17 avatar May 16 '25 16:05 ribru17

Of course, since it doesn't even require special captures or predicates.

clason avatar May 16 '25 16:05 clason

If Zed's indent logic is adapted to nvim-treesitter, would it address https://github.com/nvim-treesitter/nvim-treesitter/issues/6486 too? I think the current indent implementation goes against async parsing?

lopi-py avatar May 26 '25 07:05 lopi-py

It wouldn't be the same as #6486, but I think it would be much more performant than the current indentexpr, so it would basically close that issue. Async parsing is a separate thing, though. Zed's indentexpr is async from what I can tell, but I don't know how easily this can be done on Vim's side.

ribru17 avatar Jul 23 '25 01:07 ribru17