Rewrite indentation code
[!Tip] As of the latest change, indentation is 2–3x faster with the potential to surpass 10x with Vim9script and Lua.
This PR contains a rewrite of the entire Clojure indentation code that aims to:
- Improve the (currently abysmal) indent performance and unlock yet more improvements in future,
- #6
- #32
- #35
- Change the defaults to be more similar to the Clojure Style Guide, cljfmt, Emacs and VS Code defaults,
- #22
- Enable greater configuration of indent rules (such that various indentation styles are possible),
- #21
- Simplify the indentation code as much as possible (currently so complex, it is near impossible to work on),
- Improve accuracy of the indentation algorithms, eliminating most (if not all) previous indentation bugs,
- #34
- https://github.com/clojure-vim/clojure.vim/pull/31#issuecomment-2227410501
- Eliminate reliance on syntax highlighting to be compatible with alternate syntax highlighting systems (e.g. Tree-sitter),
- #33
- Create a rich indentation test suite,
- #1
- #39
- #26
- Make quality of life improvements (when the (Neo)Vim version supports them), such as preventing
=operator from altering the indentation of multi-line strings and setting the'lisp'option in Clojure buffers.- #40
To do
- [x] New indentation algorithm.
- [x] More indentation tests + test the different config options.
- [x] Run indentation tests on Vim and Neovim.
- [x] Update documentation (doc file and README).
- [x] Replace existing configuration options.
- [ ] Resolve most remaining TODOs.
- [ ] Create a GitHub issue for the remaining missing indent features:
- Indentation rules for reader conditionals (i.e.
#?@(...),#?(...)). - Complex indentation rules (e.g.
letfn,extend-protocol, etc.).
- Indentation rules for reader conditionals (i.e.
- [ ] Maybe write blog post about this update?
I've been experimenting with a custom algorithm as per the check list item above. This version eliminates all syntax highlight checks (and therefore no longer requires the various hacks to make those checks work) plus it is approx the same speed as my first rewrite attempt and uses about the same amount of code.
However the core benefits of this version is that it works when syntax highlighting is switched off, and eliminates the performance bottle neck of syntax highlight checks. This means that it will be possible to write a version of the algorithm in Vim9 script, to get another significant performance improvement! (Maintaining both a legacy Vim script and Vim9 script versions should be feasible due to how much simpler this is than the original implementation.)
TL;DR: I'll push a new version to this branch soon which will be:
- Simpler with no hacks,
- Can be become even faster (via Vim9 script),
- Works even when syntax highlighting is switched off.
Once this branch is feature complete, I'll share benchmarks comparing it to the original as is on master.
Performance benchmarks as of current change (+ unpushed Vim9 version). This is measuring the time to reindent a whole file on a Macbook Pro 2021, so expect much more noticable performance gains on slower hardware. :rocket:
| Code version | 200 lines of Clojure | 600 lines of EDN |
|---|---|---|
| Original | 0.96s | 4.6s (clojure_maxlines = 0), 3.5s (clojure_maxlines = 300) |
| New (legacy) | 0.33s | 2.6s |
| New (Vim9) | 0.06s | 0.46s |
(I'll share new benchmarks (and the full reports generated by Vim) once this branch is feature complete, although I don't expect it to change much.)
Even with the Vim9 implementation in the code too (not pushed yet, still figuring out how best to integrate it in a maintainable way), we can expect the file size of indent/clojure.vim to be ~150 LOC less than the original!
Regarding Neovim, until Vim9 script support is added, the "New (legacy)" code will be used. If anyone would like to volunteer to write a Lua implementation of this code for Neovim please open an issue and we can discuss how to integrate it.
Any updates on this?
@NoahTheDuke yes, slow steady progress during the limited time I have to work on it at the moment. GitHub doesn't show it well, but there have been a bunch of bug fixes over the last few weeks. Currently my focus in on getting the majority of the tests to pass.
If you wanted to use it, this branch is actually usable if you don't mind a few minor formatting issues.
Cool. I mistakenly trusted Github's "added 11 commits 2 months ago", which it seems bundles recent stuff too. I don't mean to pressure, just saw a bunch of activity then looked like nothing. Glad to hear you're still poking at it. I'll give it a whirl locally!
Just tried it out and it's great so far. Thanks so much!
Update: nearing readiness! :tada:
This branch has now surpassed the original indentation code in pretty much every aspect (~2–3x faster—with potential to get ~10x faster with a little bit of Lua and Vim9 script—, less and simpler code, more accurate indentation, better defaults, more customisable, not dependent on syntax highlighting, etc.). There are a few small known bugs left to fix, the documentation needs updating and a little bit of code clean up will be required.
(The greatly expanded test suite is also already paying dividends; helping catch accidental regressions! More tests to come soon.)
Thanks for working on this, I've been trying clojure dev in neovim lately and was getting really frustrated by the fact that it would indent based on brackets in string literals. I just tried this branch and that problem is fixed.
Thanks for working on this, I've been trying clojure dev in neovim lately and was getting really frustrated by the fact that it would indent based on brackets in string literals. I just tried this branch and that problem is fixed.
@SavageMessiah great to hear that! If you encounter any issues on this branch please let me know.
Feedback:
(filter-test
a b c)
(filter-suite
a b c)
(letfn [(filter-test
[test-case]
...)
(filter-suite
[suite]
...)]
...)
It seems the matching algorithm for function indentation is based on any portion of the function's name, not the whole thing, so test is indented with 1 space and others are indented with 2. Regardless of my preference (1 vs 2), these should not be indented differently, especially when they're both user-defined functions.
Additionally, letfn functions incorrectly have hanging indentation, not flow indentation, as seen below. The func body should be indented 2 spaces, not even with the brackets.
(letfn [(func [test-case]
...)]
...)
Thanks so much!
Hi @NoahTheDuke, thanks for raising and testing!
It seems the matching algorithm for function indentation is based on any portion of the function's name, not the whole thing, so test is indented with 1 space and others are indented with 2. [...]
Unfortunately I'm not seeing the same behaviour you are with filter-test vs. filter-suite, perhaps you have some config options set controlling the indentation for those? If you do and can share it, I'll happily take a further look into this issue.
Additionally, letfn functions incorrectly have hanging indentation, not flow indentation [...]
I am aware of this current deficiency in the indentation algorithm. It is possible to solve, but it will be tricky as letfn falls into what I call complex indentation, which includes others such as extend-protocol, extend-type, and method bodies of defrecord, deftype and defprotocol. I call it "complex indentation" as they alter the indentation rules of the lists within them.
I will fix it, but it will likely be after this PR is merged as their use isn't particularly common and hopefully the other improvements I've made will make up for it being a temporary issue. As a partial workaround, I'd recommend using let and fn instead of letfn where possible, as it's often no more code and is (at least in my opinion) more readable too. Alternatively you can run cljfmt before committing or hook it into Vim with a pre-write autocommand.
Asides
Why is indenting letfn so difficult? (Click this to find out.)
The challenge with the complex indentation rules is in making it both efficient within Vim script, and generic/customisable.
Emacs' clojure-mode avoids the first (by Emacs Lisp being much, much faster than Vim script :sob:) and solves the latter nicely with a way of defining complex indentation rules (which cljfmt largely replicated). I may end up doing something similar, but do wonder if some flexibility could be sacrificed in favour of easier customisation and better performance...
(letfn is actually even worse than the other complex indentation rules as its rules are unique! In the old clojure.vim indentation code we had hardcoded rules just to handle it!)
Click this to read a complete tangent on my opinions about letfn overuse (that no one asked for). 🙂
Personally I feel that letfn is overused in Clojure code, as the only reason I can see to use it over plain let + fn is when the letfn'd functions need to call each other. If it were used only in that case, it could be useful as a signal that that is happening, much like how use of when over if signifies that there is no "else" branch.
(The equivalent of letfn in Scheme is letrec — i.e. let recursive — which I believe may have inspired letfn and kind of backs up my thoughts on its overuse.)