This will likely require us to refactor the current syntax highlighting system to accommodate other syntax highlighting methods.

Oct 05 '21 16:10 kirawi

I'm not a big fan of the LSP highlighting spec. It requires a lot of back and forth traffic with the LSP to do highlighting in real time. VSCode should adapt tree-sitter already..

Oct 17 '21 06:10 archseer

I plan to make it optional in config.toml or languages.toml. There are some cases in which tree-sitter isn't enough.

Oct 17 '21 14:10 kirawi

tree-sitter is limited such that it does not have context of the language, like similar returns highlight or similar .await highlights.

Oct 19 '21 13:10 pickfire

Also, for Rust, unsafe highlighting is nice in VSCode and what I miss the most in helix

Jun 27 '22 12:06 poliorcetics

I'm not a big fan of the LSP highlighting spec. It requires a lot of back and forth traffic with the LSP to do highlighting in real time. VSCode should adapt tree-sitter already..

Does this mean this won't ever be a first class feature of helix? I'm not really fussed either way. I only ask because I'm working on a tool to convert themes between several text editors formats and this would factor into the implementation longer term.

Jul 04 '22 13:07 jakenvac

While LSP is heavy, tree-sitter does not provide comparable highlighting. One solution could be to have LSP as an opt-in, with tree-sitter as the default.

As petty as it sounds, the lack of semantic highlighting is one of the very few things that hold me from daily driving Helix. Being able to jump in and see the same theming quality as I'm used to is a must for me, and I imagine other prospective users as well.

Aug 31 '22 22:08 LouisGariepy

@LouisGariepy IIRC in the release notes I remember there is space-h select the same symbols in the function, that way it can be similar like part of the syntax highlight feature.

Sep 01 '22 16:09 pickfire

Could a maintainer point out the conditions for an acceptable solution then? If you simply don't want this feature in Helix (which is totally fair!), maybe you could close or mark as "wontfix" to avoid confusion.

To be clear, I'm asking from the perspective of someone who'd like to contribute to this effort. I just don't want to waste my (and your) time with a feature that will never be merged.

Sep 01 '22 18:09 LouisGariepy

@LouisGariepy

Being able to jump in and see the same theming quality as I'm used to is a must for me, and I imagine other prospective users as well.

I'd be interested to see a side by side comparison with the theme/languages you use if you don't mind? In my experience, it's the opposite. I quite often find tree sitter highlighting being better than the alternatives.

Sep 06 '22 11:09 jakenvac

I'd be interested to see a side by side comparison with the theme/languages you use if you don't mind?

i miss mutable variables (in rust) with bold modifier.

Sep 06 '22 14:09 univerz

That's doable with a custom locals.scm query.

Sep 06 '22 14:09 archseer

That's doable with a custom locals.scm query.

Is it though ? They're talking about highlighting the variable as mutable each time it's used, not just at the definition point.

I honestly don't know if this is possible, if it is it seems like a quick win to add it to helix

Sep 06 '22 14:09 poliorcetics

Yeah, that's doable with a locals.scm. It will track the use of an identifier throughout the scope and apply the color it was highlighted with at the definition.

Sep 06 '22 14:09 kirawi

Edit for future readers: Please read archseer's response to my comment, which gives a more correct explanation of what is and what is not possible to do with tree-sitter Vs LSP highlighting. My comment contains some inaccuracies, but I'll leave it as is for posterity.

@JakeHL

Alright! I jotted down some basic code to see the potential differences. I tried to emphasize differences due to "limitations", not stylistic ones. There might be more that I'm missing for complex code bases, this is just basic stuff.

Here goes:

Public items are distinct from private ones.

Note that this holds for any item that can be made public, not just functions.

public_item

public_item_helix

Mutable variables and operators are distinct from immutable ones.

Mutable functions/method calls are likewise distinct.

mutable_variables

mutable_variables_helix

Variables in format strings are recognized as such (distinct from the rest of the string).

format_variables

format_variables_helix

Doc comments are distinct from "regular" comments.

Might be hard to see (stylistic choice), but the doc comments are lighter in color.

doc_comments

doc_comments_helix

Generic types are distinct from their trait bounds.

trait_bounds

trait_bounds_helix

Macro rules binding arguments are distinct from constant patterns.

$e and $es are distinct in color compared to the rest of the macro_rules syntax.

macro_rules

macro_rules_helix

Attribute syntax is recognized properly.

attributes

attributes_helix

Imports (`use`s) are properly highlighted according to their types.

imports

imports_helix

Trait methods are distinct from non-trait methods.

Notice how the trait method is italicized.

trait_methods

trait_methods_helix

Enum variants are distinct from structs and function calls

enum_variants

enum_variants_helix

Conclusion

While some of these features might be possible to implement in tree-sitter, I'd like for us to think of the practicality of doing so. I have put many hours into trying to make my "perfect" tree-sitter theme, and I had to give up because I simply could not find a decent/possible way to make some features work. I had a hard time finding documentation and examples of advanced features.

For end users, this results in two things:

Being unable to use the plethora of LSP themes that already exist.
The themes that do exist in tree-sitter form do not have as extensive highlighting capabilities.

Alternatively, if some of these features are known to be possible in tree-sitter, maybe they should be implemented in one of the default themes, so that users can reuse the same logic without having to reinvent the wheel.

Sep 06 '22 15:09 LouisGariepy

Most of these are currently achievable by modifying the queries except for imports and accurate type highlighting. With https://github.com/helix-editor/helix/issues/1252 however, I think that would be addressed as well.

There are certainly limitations to tree-sitter, such as not being able to highlight markdown in doc comments, but maybe that's something that could be worked on in the future. Language-specific features such as trait methods probably do need LSP highlighting.

Sep 06 '22 16:09 kirawi

It looks like you deleted your last comment, but I think you have a misunderstanding of how tree-sitter works. From how I understand it, tree-sitter a parser generator with an AST geared towards syntax highlighting. However, you also need to write queries to capture AST nodes such that they can be highlighting in themes.

You can check the tree-sitter playground to see some of what these ASTs look like.

See: https://tree-sitter.github.io/tree-sitter/syntax-highlighting#queries https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries

Sep 06 '22 20:09 kirawi

@LouisGariepy Thanks for that rundown. I think you've demonstrated why it would be a useful feature.

I do think that the main thing it gives us that tree sitter doesn't is type related highlighting. I'm pretty sure with enough time you could create tree-sitter queries for anything that is purely syntax focused.

Here's an example of doc comments & string interpolation already working in Typescript in helix. I might be using a modified Everforest theme, I can't remember.

Screen Shot 2022-09-07 at 11 10 59

Sep 07 '22 10:09 jakenvac

Yeah, tree-sitter grammars are parsers that build an in-memory AST of the file. Then the highlighting queries are like selectors that tag specific parts of the tree with span names, and themes can hook into these spans to highlight them as they wish.

Public items are distinct from private ones.

This is as simple as adding a highlighting query that themes can hook into.

Mutable variables and operators are distinct from immutable ones.

Can be done with a locals query.

Variables in format strings are recognized as such (distinct from the rest of the string).

Probably needs a change in the parser, but it could be injected into such strings and the embedded parts interpolated. It works in other languages.

Doc comments are distinct from "regular" comments.

This is something I've been waiting for https://github.com/tree-sitter/tree-sitter-rust/pull/128

Generic types are distinct from their trait bounds.

This is as simple as adding a highlighting query that themes can hook into.

Macro rules binding arguments are distinct from constant patterns.

Not sure yet since macros produce a token tree but probably possible.

Attribute syntax is recognized properly.

Fixed in https://github.com/helix-editor/helix/commit/301f5d7cf704c2f2e4fc53e258125e39d1845176

Imports (uses) are properly highlighted according to their types. Trait methods are distinct from non-trait methods.

These two are to do so yeah, it could be LSP specific.

Enum variants are distinct from structs and function calls

Probably possible with locals.

I wish LSP provided these highlights as an augmented layer on top of regular highlighting instead of completely overriding highlighting and churning out tons and tons of spans per keypress.

Anyway, for this to be implemented, a vec of produced spans would have to sit somewhere on the document, that would then convert to a HighlightIter and hook into our regular rendering mechanism without changes.

Sep 07 '22 10:09 archseer

I do think that the main thing it gives us that tree sitter doesn't is type related highlighting.

I think in combination with stack graphs we wouldn't just get go to definition support but also a way to determine the item type since we can peek at the definition

Sep 07 '22 10:09 archseer

A discussion on how LSP syntax highlighting can be gracefully implemented was discussed on Matrix:

Tree-sitter will be the default highlighter, while range-based LSP syntax highlighting will be merged asynchronously. Additionally, we would compute the correct highlights for locals based on what the LSP syntax highlighter sends.

Jan 03 '23 18:01 kirawi

helix helix copied to clipboard

LSP syntax highlighting

Public items are distinct from private ones.

Mutable variables and operators are distinct from immutable ones.

Variables in format strings are recognized as such (distinct from the rest of the string).

Doc comments are distinct from "regular" comments.

Generic types are distinct from their trait bounds.

Macro rules binding arguments are distinct from constant patterns.

Attribute syntax is recognized properly.

Imports (uses) are properly highlighted according to their types.

Trait methods are distinct from non-trait methods.

Enum variants are distinct from structs and function calls

Conclusion

helix
helix copied to clipboard

Imports (`use`s) are properly highlighted according to their types.