helix
LSP syntax highlighting
This will likely require us to refactor the current syntax highlighting system to accommodate other syntax highlighting methods.
I'm not a big fan of the LSP highlighting spec. It requires a lot of back and forth traffic with the LSP to do highlighting in real time. VSCode should adopt tree-sitter already.
I plan to make it optional in config.toml or languages.toml. There are some cases in which tree-sitter isn't enough.
tree-sitter is limited in that it has no semantic context of the language, so it cannot provide things like highlighting of related returns or related .await points.
Also, for Rust, unsafe highlighting is nice in VSCode and is what I miss the most in helix.
I'm not a big fan of the LSP highlighting spec. It requires a lot of back and forth traffic with the LSP to do highlighting in real time. VSCode should adopt tree-sitter already.
Does this mean this won't ever be a first-class feature of helix? I'm not really fussed either way. I only ask because I'm working on a tool to convert themes between several text editors' formats, and this would factor into the implementation longer term.
While LSP is heavy, tree-sitter does not provide comparable highlighting. One solution could be to have LSP as an opt-in, with tree-sitter as the default.
As petty as it sounds, the lack of semantic highlighting is one of the very few things that hold me back from daily driving Helix. Being able to jump in and see the same theming quality as I'm used to is a must for me, and I imagine for other prospective users as well.
@LouisGariepy IIRC the release notes mention space-h, which selects the same symbol throughout the function. That can cover part of what the syntax highlighting feature would do.
Could a maintainer point out the conditions for an acceptable solution then? If you simply don't want this feature in Helix (which is totally fair!), maybe you could close or mark as "wontfix" to avoid confusion.
To be clear, I'm asking from the perspective of someone who'd like to contribute to this effort. I just don't want to waste my (and your) time with a feature that will never be merged.
@LouisGariepy
Being able to jump in and see the same theming quality as I'm used to is a must for me, and I imagine other prospective users as well.
I'd be interested to see a side by side comparison with the theme/languages you use if you don't mind? In my experience, it's the opposite. I quite often find tree sitter highlighting being better than the alternatives.
I'd be interested to see a side by side comparison with the theme/languages you use if you don't mind?
I miss mutable variables (in Rust) being shown with the bold modifier.
That's doable with a custom locals.scm query.
That's doable with a custom locals.scm query.
Is it though? They're talking about highlighting the variable as mutable each time it's used, not just at the definition point.
I honestly don't know if this is possible. If it is, it seems like a quick win to add it to helix.
Yeah, that's doable with a locals.scm. It will track the use of an identifier throughout the scope and apply the color it was highlighted with at the definition.
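To make the idea above concrete, here is a minimal sketch of scope-based local tracking in plain Rust. It is not how tree-sitter or helix implements locals queries; the Event type, the resolve_locals function, and the highlight name variable.mutable are all hypothetical names made up for illustration. It only shows the principle: a reference takes the highlight of the nearest enclosing definition.

```rust
use std::collections::HashMap;

/// One event from a hypothetical, already-parsed token stream.
#[derive(Debug, Clone, PartialEq)]
pub enum Event<'a> {
    /// A new lexical scope opens (for example a block or a function body).
    EnterScope,
    /// The innermost scope closes.
    ExitScope,
    /// An identifier is defined and tagged with a highlight name.
    Definition { name: &'a str, highlight: &'a str },
    /// An identifier is used; its highlight has to be looked up.
    Reference { name: &'a str },
}

/// Resolves every reference to the highlight of its nearest enclosing
/// definition. Returns one entry per `Reference`, in order; `None` means
/// no definition with that name was in scope.
pub fn resolve_locals<'a>(events: &[Event<'a>]) -> Vec<(&'a str, Option<&'a str>)> {
    // The stack always contains at least the root scope.
    let mut scopes: Vec<HashMap<&'a str, &'a str>> = vec![HashMap::new()];
    let mut resolved = Vec::new();

    for event in events {
        match event {
            Event::EnterScope => scopes.push(HashMap::new()),
            Event::ExitScope => {
                // Never pop the root scope, even on unbalanced input.
                if scopes.len() > 1 {
                    scopes.pop();
                }
            }
            Event::Definition { name, highlight } => {
                scopes
                    .last_mut()
                    .expect("root scope always exists")
                    .insert(*name, *highlight);
            }
            Event::Reference { name } => {
                // Search from the innermost scope outwards.
                let highlight = scopes
                    .iter()
                    .rev()
                    .find_map(|scope| scope.get(name).copied());
                resolved.push((*name, highlight));
            }
        }
    }
    resolved
}
```

Note that this kind of tracking only sees what is written in the file being parsed. Anything that depends on type information or on definitions in other files is outside its reach.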
Edit for future readers: Please read archseer's response to my comment, which gives a more correct explanation of what is and what is not possible to do with tree-sitter Vs LSP highlighting. My comment contains some inaccuracies, but I'll leave it as is for posterity.
@JakeHL
Alright! I jotted down some basic code to see the potential differences. I tried to emphasize differences due to "limitations", not stylistic ones. There might be more that I'm missing for complex code bases, this is just basic stuff.
Here goes:
Public items are distinct from private ones.
Note that this holds for any item that can be made public, not just functions.
Mutable variables and operators are distinct from immutable ones.
Mutable functions/method calls are likewise distinct.
Variables in format strings are recognized as such (distinct from the rest of the string).
Doc comments are distinct from "regular" comments.
Might be hard to see (stylistic choice), but the doc comments are lighter in color.
Generic types are distinct from their trait bounds.
Macro rules binding arguments are distinct from constant patterns.
$e and $es are distinct in color compared to the rest of the macro_rules syntax.
Attribute syntax is recognized properly.
Imports (uses) are properly highlighted according to their types.
Trait methods are distinct from non-trait methods.
Notice how the trait method is italicized.
Enum variants are distinct from structs and function calls
Conclusion
While some of these features might be possible to implement in tree-sitter, I'd like for us to think of the practicality of doing so. I have put many hours into trying to make my "perfect" tree-sitter theme, and I had to give up because I simply could not find a decent/possible way to make some features work. I had a hard time finding documentation and examples of advanced features.
For end users, this results in two things:
- Being unable to use the plethora of LSP themes that already exist.
- The themes that do exist in tree-sitter form do not have as extensive highlighting capabilities.
Alternatively, if some of these features are known to be possible in tree-sitter, maybe they should be implemented in one of the default themes, so that users can reuse the same logic without having to reinvent the wheel.
Most of these are currently achievable by modifying the queries except for imports and accurate type highlighting. With https://github.com/helix-editor/helix/issues/1252 however, I think that would be addressed as well.
There are certainly limitations to tree-sitter, such as not being able to highlight markdown in doc comments, but maybe that's something that could be worked on in the future. Language-specific features such as trait methods probably do need LSP highlighting.
It looks like you deleted your last comment, but I think you have a misunderstanding of how tree-sitter works. As I understand it, tree-sitter is a parser generator with an AST geared towards syntax highlighting. However, you also need to write queries to capture AST nodes so that they can be highlighted in themes.
You can check the tree-sitter playground to see some of what these ASTs look like.
See: https://tree-sitter.github.io/tree-sitter/syntax-highlighting#queries https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries
@LouisGariepy Thanks for that rundown. I think you've demonstrated why it would be a useful feature.
I do think that the main thing it gives us that tree sitter doesn't is type related highlighting. I'm pretty sure with enough time you could create tree-sitter queries for anything that is purely syntax focused.
Here's an example of doc comments & string interpolation already working in Typescript in helix. I might be using a modified Everforest theme, I can't remember.
Yeah, tree-sitter grammars are parsers that build an in-memory AST of the file. Then the highlighting queries are like selectors that tag specific parts of the tree with span names, and themes can hook into these spans to highlight them as they wish.
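As a rough illustration of the last step (themes hooking into span names), here is a minimal sketch in plain Rust. The Span type, the style_for and resolve_styles functions, and the scope and color names are hypothetical. The fallback from a dotted scope to its shorter prefixes is an assumption about how a theme lookup could behave, not a copy of helix's implementation.

```rust
use std::collections::HashMap;

/// A byte range that a highlighting query tagged with a dotted scope name.
#[derive(Debug, Clone, PartialEq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
    pub scope: &'static str,
}

/// Looks up the style for `scope`, falling back to shorter dotted prefixes:
/// `function.method` falls back to `function` when the theme has no entry
/// for the more specific name.
pub fn style_for<'t>(theme: &HashMap<&str, &'t str>, scope: &str) -> Option<&'t str> {
    let mut candidate = scope;
    loop {
        if let Some(style) = theme.get(candidate) {
            return Some(*style);
        }
        // Drop the last `.segment` and retry; give up when no dot is left.
        match candidate.rfind('.') {
            Some(dot) => candidate = &candidate[..dot],
            None => return None,
        }
    }
}

/// Pairs every span with the style the theme assigns to it, if any.
pub fn resolve_styles<'t>(
    spans: &[Span],
    theme: &HashMap<&str, &'t str>,
) -> Vec<(usize, usize, Option<&'t str>)> {
    spans
        .iter()
        .map(|span| (span.start, span.end, style_for(theme, span.scope)))
        .collect()
}
```

With a lookup like this, a query can expose a more specific name (say, for public items or doc comments) without breaking themes that only define the general one.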
Public items are distinct from private ones.
This is as simple as adding a highlighting query that themes can hook into.
Mutable variables and operators are distinct from immutable ones.
Can be done with a locals query.
Variables in format strings are recognized as such (distinct from the rest of the string).
Probably needs a change in the parser, but it could be injected into such strings and the embedded parts interpolated. It works in other languages.
Doc comments are distinct from "regular" comments.
This is something I've been waiting for https://github.com/tree-sitter/tree-sitter-rust/pull/128
Generic types are distinct from their trait bounds.
This is as simple as adding a highlighting query that themes can hook into.
Macro rules binding arguments are distinct from constant patterns.
Not sure yet since macros produce a token tree but probably possible.
Attribute syntax is recognized properly.
Fixed in https://github.com/helix-editor/helix/commit/301f5d7cf704c2f2e4fc53e258125e39d1845176
Imports (uses) are properly highlighted according to their types. Trait methods are distinct from non-trait methods.
These two are hard to do, so yeah, they could be LSP-specific.
Enum variants are distinct from structs and function calls
Probably possible with locals.
I wish LSP provided these highlights as an augmented layer on top of regular highlighting instead of completely overriding highlighting and churning out tons and tons of spans per keypress.
Anyway, for this to be implemented, a vec of produced spans would have to sit somewhere on the document, which would then convert to a HighlightIter and hook into our regular rendering mechanism without changes.
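A minimal sketch of that conversion, assuming the stored spans are sorted by start offset and do not overlap. The HighlightEvent enum below is a local stand-in loosely modeled on the start/source/end event shape used by tree-sitter-highlight; the Span type and spans_to_events function are hypothetical and not part of helix.

```rust
/// Local stand-in for a highlight event stream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HighlightEvent {
    /// Start applying the highlight with this index.
    HighlightStart(usize),
    /// A chunk of source text, as a byte range.
    Source { start: usize, end: usize },
    /// Stop applying the most recently started highlight.
    HighlightEnd,
}

/// A span stored on the document, e.g. produced from an LSP response.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
    pub highlight: usize,
}

/// Turns stored spans into an event stream covering `0..len`.
/// Assumes `spans` is sorted by `start`; spans that are empty, out of range,
/// or overlap an already emitted span are skipped.
pub fn spans_to_events(spans: &[Span], len: usize) -> impl Iterator<Item = HighlightEvent> {
    let mut events = Vec::new();
    let mut pos = 0;

    for span in spans {
        if span.start < pos || span.end <= span.start || span.end > len {
            continue;
        }
        // Unhighlighted gap before this span.
        if span.start > pos {
            events.push(HighlightEvent::Source { start: pos, end: span.start });
        }
        events.push(HighlightEvent::HighlightStart(span.highlight));
        events.push(HighlightEvent::Source { start: span.start, end: span.end });
        events.push(HighlightEvent::HighlightEnd);
        pos = span.end;
    }
    // Unhighlighted tail after the last span.
    if pos < len {
        events.push(HighlightEvent::Source { start: pos, end: len });
    }
    events.into_iter()
}
```

Because the output has the same shape as a regular highlight event stream, a renderer that consumes such events would not need to know where the spans came from.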
I do think that the main thing it gives us that tree sitter doesn't is type related highlighting.
I think in combination with stack graphs we wouldn't just get go-to-definition support but also a way to determine the item type, since we can peek at the definition.
How LSP syntax highlighting can be gracefully implemented was discussed on Matrix:
Tree-sitter will be the default highlighter, while range-based LSP syntax highlighting will be merged asynchronously. Additionally, we would compute the correct highlights for locals based on what the LSP syntax highlighter sends.
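The merge step can be sketched as an overlay: tree-sitter spans form the base layer, and whatever ranges the LSP has returned so far are painted on top. This is only an illustration under stated assumptions. The Span type, the merge_layers function, and the highlight names are hypothetical; the asynchronous delivery and the locals recomputation mentioned above are not modeled; and the per-byte painting is chosen for clarity, not efficiency.

```rust
/// A highlighted byte range.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
    pub highlight: &'static str,
}

/// Overlays `overlay` (e.g. LSP ranges) on top of `base` (e.g. tree-sitter
/// spans) for a document of `len` bytes. Where both layers cover a byte,
/// the overlay wins; everywhere else the base highlight is kept.
pub fn merge_layers(len: usize, base: &[Span], overlay: &[Span]) -> Vec<Span> {
    // Paint one cell per byte: base first, overlay second so it overrides.
    let mut cells: Vec<Option<&'static str>> = vec![None; len];
    for span in base.iter().chain(overlay.iter()) {
        let end = span.end.min(len);
        if span.start < end {
            for cell in &mut cells[span.start..end] {
                *cell = Some(span.highlight);
            }
        }
    }

    // Coalesce runs of adjacent cells with the same highlight into spans.
    let mut merged: Vec<Span> = Vec::new();
    for (i, cell) in cells.iter().enumerate() {
        if let Some(highlight) = *cell {
            if let Some(last) = merged.last_mut() {
                if last.end == i && last.highlight == highlight {
                    last.end = i + 1;
                    continue;
                }
            }
            merged.push(Span { start: i, end: i + 1, highlight });
        }
    }
    merged
}
```

With an empty overlay the result is just the base layer, which matches the idea that tree-sitter highlighting stays usable until (or unless) the LSP responds.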