ocaml-lsp
ocaml-lsp copied to clipboard
Improve semantic tokens support
Bug fixes
- [ ] named arguments - occurs after renaming
~line
tolines
, ieF2
and type inlines
on vscode
![image](https://user-images.githubusercontent.com/16353531/211174396-f6a1663f-033b-4f17-a804-c171dfb9e5a0.png)
Already Fixed
- [x] handle when there's whitespace (fixed in 1.15.0, I think)
This works as expected:
![image](https://user-images.githubusercontent.com/16353531/182361144-717cb3c5-7596-46a9-ae5c-30e8d2437673.png)
But not this:
![image](https://user-images.githubusercontent.com/16353531/182361059-b5fcb70e-8bfc-41ca-932a-a1247068e797.png)
Reproduction: can reproduce in Dark+ theme in vscode.
Feature-wise improvements:
- [ ] Highlight polymorphic variants (it's complicated because variants are represented sometimes simply as strings (ie without locations information that usually should come for a symbol)
It's also unclear how to highlight them, given if they have the same color as ordinary variants (value constructors), then it's more difficult to differentiate between them at first glance, but since the color palette that the standard lsp token types give is limited, other option is to color them as modules/types, which is also far from ideal.
- [ ] Adding semantic information
Currently, we simply send syntax-based semantic tokens, which doesn't really have much language semantics. By real semantic information, I mean that we could have different highlighting for ref
values (rust-analyzer underlines mutable values, for example) OR highlight differently functions and values (currently, we highlight a function differently from a value only if we see syntactically that it's a function -- it comes as the function in function application or defined as a function in let binding, ie. has arguments).
In order to do so, we need information from the typed tree, ie. see the type of a value and, thus, get more semantic info about it, e.g., it's a function or a ref value. However, semantic tokens shouldn't work solely based on typed tree because there're often a lot of type errors during development in OCaml.
In my understanding, we can produce semantic tokens for symbols based on the parsetree by folding over it, but also fold over the typed tree and get information for symbols of interest and update the tokens initially produced. I'm concerned that it almost double amount of work and may increase latency.
- [ ]
ref
values have special highlighting - [ ] functions have special highlighting
- [ ] partially applied functions have special highlighting
Implementation-wise improvements:
- [ ] test delta implementation
- [ ] write some code to measure performance of semantic tokens computation (especially before rewriting it using visitors - see below)
- [ ] rewrite the folding over parsetree using visitor design pattern?
We could use ppxlib's already implemented visitors (but we can't have ppxlib as ocamllsp dependency, so we could copy paste required files?) or use visitors
library to generate visitor for the parsetree (adding it as a dep or not? if not, we just copy the ppx-generated code) We could write visitors by hand but that's too much boilerplate code.
Using visitor pattern, we could cut down on boilerplate & have much cleaner implementation.
Where did you get Token_type.module_
? I looked over the lsp spec, and I see no mention of this token type.
The name of the variable doesn’t reflect the underlying lsp token type we use. Its value (which is 0) is the index in Token_type.list, which should be “namespace.” Does this help?
The response we produce for textDocument/semanticTokens/full/delta
seems bugged (there seem to be overlapping tokens), but I can't come up with a reproducible case. There's a vscode tool that can help debug: https://code.visualstudio.com/api/language-extensions/syntax-highlight-guide#scope-inspector.
The name of the variable doesn’t reflect the underlying lsp token type we use. Its value (which is 0) is the index in Token_type.list, which should be “namespace.” Does this help?
It does. However, I think this level of indirection is unnecessary.
In addition to polymorphic variants, I found a few other example of code that doesn't have semantic token support:
-
module names in module signature constraint (
with module Foo = ...
) constraint aren't highlighted as regular module (red in the picture) name, they default toconstant.language.capital-identifier.ocaml
(yellow). Note thatwith type
constraints don't have this issue. -
calling functions in record fields doesn't color their arguments correctly (including named arguments)
-
building records with the short syntax for
field=field
also fails to color.
Another spot lacking semantic color is open
declarations in .mli
files (they work fine in .ml
files though).
Also, related to #1139.