
Inconsistencies, Bugs and Generally Confusing Documentation/Guides for Theming and Syntax-Highlighting

Open iDad5 opened this issue 1 year ago • 13 comments

Does this issue occur when all extensions are disabled?: Yes/No

unrelated

  • VS Code Version: all
  • OS Version: all

Steps to Reproduce:

  1. Try to learn how to build your own custom Theme
  2. Spend hours and hours on research and trial and error, and end up confused and somewhat frustrated...

I'm not 100% sure if my issue is really a bug report - in parts it's also a feature request, contains questions and even has elements of a tutorial.... sorry..

Background disclosure: I'm an engineer/developer and designer (architect by education) and have decades of experience in both fields. Which is worth nothing if you don't keep learning new things every day...

So I decided to build myself a theme for my favorite editor.

Let me share my experience:

Getting Started at Color Theme

I started with: https://code.visualstudio.com/api/extension-guides/color-theme which seemed a logical place to start. But even though the page is titled 'Color Theme', the first half of the document is about changing settings.json to modify your currently active theme... (It might just be dumb old me, but it took me quite a while to understand that this was not at all what I was looking for.)

Then 'Create a new Color theme' started with the sentence

"Once you have tweaked your theme colors using workbench.colorCustomizations and editor.tokenColorCustomizations, it's time to create the actual theme."

So I had to tweak my theme before even creating it? I felt like I was in an old Star Trek episode, caught in a logical time loop.

After several days of research and experimenting I know what the intention of the tutorial is: not to start a theme from scratch, but to modify an existing theme. But as that is not explained anywhere, it is totally confusing. A series of mostly unexplained steps follows, including a reference to tmThemes (I had absolutely no idea what that 'TextMate' theme was and why I would want/need it...). A link to ColorSublime didn't make anything clearer - on the contrary.
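For anyone in the same spot, here is a rough sketch of what a from-scratch theme consists of: one `contributes.themes` entry in the extension's package.json plus one theme file. It is written as TypeScript objects only so the shape can be type-checked. The label, file path and all color values are made up for illustration, and the interfaces list only the fields used here.

```typescript
// Entry under "contributes.themes" in the extension's package.json.
interface ThemeContribution {
  label: string; // name shown in the theme picker
  uiTheme: "vs" | "vs-dark" | "hc-black" | "hc-light"; // base UI theme
  path: string; // theme file, relative to package.json
}

// One TextMate rule inside "tokenColors".
interface TokenColorRule {
  name?: string;
  scope: string | string[];
  settings: { foreground?: string; fontStyle?: string };
}

// The theme file itself (only the fields used in this sketch).
interface ColorTheme {
  name: string;
  type: "dark" | "light";
  colors: Record<string, string>; // workbench colors (Theme Color Reference)
  tokenColors: TokenColorRule[]; // TextMate-based syntax colors
}

// Hypothetical label and path, for illustration only.
const contribution: ThemeContribution = {
  label: "My Theme",
  uiTheme: "vs-dark",
  path: "./themes/my-theme-color-theme.json",
};

// Hypothetical color values, for illustration only.
const theme: ColorTheme = {
  name: "My Theme",
  type: "dark",
  colors: {
    "editor.background": "#1b1d23",
    "editor.foreground": "#d7dae0",
    "activityBar.background": "#15171c",
    "statusBar.background": "#15171c",
  },
  tokenColors: [
    { name: "Comments", scope: "comment", settings: { foreground: "#6b7280", fontStyle: "italic" } },
    { name: "Strings", scope: "string", settings: { foreground: "#a3be8c" } },
    { name: "Keywords", scope: "keyword", settings: { foreground: "#c792ea" } },
  ],
};

// What would be written to the file named in `contribution.path`.
const themeFileContents: string = JSON.stringify(theme, null, 2);
```

The `colors` part is the workbench side and `tokenColors` is the TextMate side; the settings.json customizations from the first half of the guide edit the same two kinds of entries on top of whatever theme is active.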

Then I found this video: https://www.youtube.com/watch?v=FeApSxfazVg which is also somewhere in the docs, but not linked on the aforementioned page (why?). The video helped, but as video tutorials go, it mostly lacked (technical) depth, and the syntax highlighting, which I was especially interested in and where the TextMate theme thing would have been really helpful to understand, was skipped over quickly. (I'll explain later why I feel that's a bad thing.)

So I started from scratch in Yeoman. What I got was a good starting point, but very different from the situation described on the 'Color Theme' page. By the way, why does the link behind 'Yeoman' lead to yeoman.io, and why is there no link to Your First Extension? (It may be my fault that I wasn't fully aware that a theme is an extension when I started out on the journey, but I strongly feel that a general explanation of the concept of extensions, with themes as one use case, should be added/linked.)

I don't want this to sound like a rant, and I will not go on in detail like I did until now, but you might trust me that the experience in the next steps was mostly similar.

Workbench Colors

The most valuable resource to have is https://code.visualstudio.com/api/references/theme-color which I found only after my second go-around with 'Creating a VS Code Theme' on CSS-Tricks, which is linked at the very end of the page I started from. It isn't bad, but it is full of things I wasn't interested in and also lacks some detail. The 'Theme Color Reference' has two fundamental flaws:

  • no documentation on inheritance.
  • [almost] no documentation on where to find the affected element.

There is one element that even has two issues of its own: panelSectionHeader.border #1 2. It still took me 2 hours to figure it out. (Can you?) Another tricky one was the listFilterWidget, where I found the page of the January 2019 update as a reference. It just wouldn't work as described, and when I finally found it, it looked and felt totally different. Sidenote: what I find bothersome is that I didn't know about this useful feature (now I know!) and it took me an hour to find out about it, as it seems it is either not mentioned in the docs or listed under a totally different name. (I actually found it now under 'Get Started -> User Interface'.) The problem is that nobody can automatically make the connection between listFilterWidget and advanced tree navigation; adding that to the list would at least help a bit. I also found a page that attempts to solve that issue, https://sw27.net/vsc/color-theme-guide/ but it's incomplete, a bit outdated, and linked nowhere.

Syntax Highlight Guide

This guide is linked from the Color Theme page. The page is correctly (as I know by now) sorted under 'Language Extensions' and contains exactly one paragraph about 'Theming'.

I think that is a severe oversight! Syntax highlighting is an essential part of theme creation, in my mind probably the most important. I had to research around the web what 'regular TextMate' themes means, and nowhere is the use of plists explained, even though they are used in the example https://github.com/microsoft/vscode-extension-samples/tree/main/theme-sample which is also linked nowhere in the docs... Also, there is no mention that I could see of the lack of support for background in VS Code's version of TextMate themes. Only after hours of trying did I find this old issue.

...

Semantic Highlight Guide

This is one cool feature and a lot of thought and work has gone into it. Using it for themes, though, is hard. It starts with the same problem as the syntax highlighting guide from my viewpoint as a would-be theme developer: at least 80% of it is written for language extension developers.

The KEY INFORMATION about semantic highlighting for theme developers is nowhere to be found in the docs and not in the wiki-page either. I found it here after spending countless hours of experimenting with both kinds of syntax highlighting:

Strings and string placeholders might not be the typical use case for semantic highlighting . I'd recommend to leave that to the TextMate grammar. 👉The semantic highlighting is something to go on top of syntax highlighting👈 and typically focuses on symbols where a full AST and resolution is needed to evaluate the type of a symbol. E.g. only when resolving an identifier in the file or project it's possible to know if its member, parameter, function and so on.

As can be found HERE

repeat:

"The semantic highlighting is something to go on top of syntax highlighting"

That's actually nearly the same sentence that is somewhere high up in the semantic highlight guide, but only in the context of the cited comments did it become clear to me that it is recommended to use semantic highlighting only where classic TextMate scopes don't work, even though semantic highlighting is far superior in its understanding of the syntax.


The property semanticHighlighting defines whether the theme is ready for highlighting using semantic tokens. It is false by default, but we encourage all themes to enable it. [*]

Don't forget however:

Semantic highlighting wins against syntax highlighting as the semantic token provider has a better understanding of the source than the regex based TextMate grammar. [*]

And that's cool because:

Semantic highlighting enriches the syntax coloring based on symbol information from a language service that has the full understanding of the project. Based on this understanding each identifier gets colored & styled with the color of the symbol it resolves to. A constant variable name is rendered as constant throughout the file, not just in its declaration. Same for parameter names, property names, class names and so on. [*]


So after finding all this more or less undocumented information, here is where I am:

  • semantic highlighting is encouraged
  • but only for things I cannot reach with TextMate Syntax
  • semantic highlighting makes it impossible for TextMate scopes to be used when the built-in semantics are rather broad *
  • it does not help to have higher TextMate specificity, semantic always wins *
  • adding custom semantic TextMate scope mappings is possible - but (from what I've guessed, experimented with and understood) not within a theme alone; one needs to write a language extension. (Am I right?)
  • it is not possible to only augment an existing grammar; you need to replace it.* (I could see myself adding something to the TypeScript grammar to differentiate private, protected and public modifier keywords, but meddling with the whole grammar just for that - no way. I do hope for semantic tokens that will help, but I don't have a good feeling about it. *)
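To make the first bullets concrete, here is a sketch of the semantic part of a theme file, again as a TypeScript object so it can be checked. The keys of `semanticTokenColors` are selectors of the form `type.modifier:language`; the small parser below only illustrates how such a key decomposes and is not VS Code's own code. All color values are made up.

```typescript
// Style value: a plain foreground color or a small style object.
type SemanticStyle = string | { foreground?: string; fontStyle?: string; bold?: boolean };

interface SemanticThemePart {
  semanticHighlighting: boolean; // opt the theme in to semantic tokens
  semanticTokenColors: Record<string, SemanticStyle>;
}

// Hypothetical colors, for illustration only.
const semanticPart: SemanticThemePart = {
  semanticHighlighting: true,
  semanticTokenColors: {
    "parameter": "#d19a66",
    "variable.readonly": { foreground: "#e5c07b" },
    "*.declaration": { bold: true },
    "variable.defaultLibrary:typescript": "#56b6c2",
  },
};

interface ParsedSelector {
  type: string; // token type, or "*" for any type
  modifiers: string[]; // modifiers that must all be present
  language?: string; // optional language id after ":"
}

// Illustrative parser for "type.modifier1.modifier2:language".
function parseSelector(selector: string): ParsedSelector {
  const colon = selector.indexOf(":");
  const head = colon === -1 ? selector : selector.slice(0, colon);
  const language = colon === -1 ? undefined : selector.slice(colon + 1);
  const parts = head.split(".");
  return { type: parts[0] ?? "*", modifiers: parts.slice(1), language };
}

const parsedKeys: ParsedSelector[] = Object.keys(semanticPart.semanticTokenColors).map(parseSelector);
```

Read this way, a key such as "keyword.export" is a token type `keyword` with a modifier `export`, not a TextMate scope, which is easy to miss because both notations use dots.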

Those are my key-learnings after a lot of effort. I do hope that my findings might help some others on their journey, and I would really appreciate it if my findings would result in improvements in the docs.

My Random Collection of Issues, Questions etc.:

  • the wiki-page [dated May 3, 2020] states that TypeScript and JavaScript are the only languages with built in support for semantic highlighting, is this still true? (At the end of the page there are several outdated links...)
  • decorators as semantic tokens do not work in TypeScript even if enabled in the compiler. Or am I doing something wrong?
  • the semantic highlight guide says: "The foreground needs to follow a color format as described in Color formats. Transparency is not supported." That is not / no longer true. The linked color formats include transparency and it also works with semantic highlighting.
  • the last example in Custom TextMate scope mappings does not work in a theme on its own, one needs a fitting language extension, is that right? (I tried with the mapping of "keyword.export": ["keyword.control.export.ts"] - no dice).
  • the theme-sample uses a plist (with extension .tmTheme) It's from 2017 - is that still a supported option (if so, is there a documentation?) or is this simply outdated?
  • the default themes contain semanticTokenColors for example for 'newOperator' etc. it seems they do not work, or am I mistaken?
  • when I tried to learn from the default themes: I find that the scope inspector indicates for the parameters that they are resolved as semantic tokens with type parameter and modifier declaration, or as variable.parameter respectively (the same as in the example at the top of the guide). The foreground in both cases follows variable.parameter, and the color #9CDCFE is applied. I cannot find such a token in any of the theme files. The only tokens using that color are TextMate tokens defined in tokenColors, and there is nothing of that sort in any of the semanticTokenColors. I'm confused by this! ❓❓❓🤷‍♂️ It seems to me that after all my research and experimenting I'm still missing crucial parts of the concept. Or does that mean that there is some kind of automatic remapping of the predefined TextMate scope mappings???

If a theme has semantic highlighting enabled, but does not contain a rule for the given semantic token, these TextMate scopes are used to find a TextMate theming rule instead.

it could make sense to me however if that would be:

If a theme has semantic highlighting enabled, but does not contain a rule for the given semantic token, these TextMate scopes are used to find a ~~TextMate~~ semantic theming rule instead.


I apologize for taking up so much room with this, but I sincerely hope that sharing my experience and findings can help others and encourage the team behind this great project to address the issues I encountered. After all, if it is so hard to use all the great features you develop, the hard and inspired work you do for the community will be less valuable than it could and should be.

iDad5 avatar Sep 11 '22 01:09 iDad5

I 101% agree with you

The same issues plagued me when researching TextMate-based syntax highlighting grammars. I actually ended up making an extension just for the JSON-based TextMate grammar. Yes, vscode even supports JSON-based TextMate grammars alongside plist ones, but doesn't support YAML, even though the syntax-highlight-guide recommends it. With YAML there is no error checking or IntelliSense support, it requires converting on every change, and even the example picture has syntax highlighting errors.

and then the same type of issues when moving onto semantic highlighting in javascript/typescript language servers

The TextMate grammar should be very small and only highlight very basic items, with semantic highlighting doing all the hard context-aware and language-specific structure.
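As a rough idea of what "very small" can mean, here is a sketch of a three-rule grammar in the JSON shape (`scopeName` plus `patterns` with `match` and `name`), together with a toy line tokenizer. The language "abc" and its rules are hypothetical. The tokenizer uses JavaScript's RegExp and a simplified earliest-match strategy purely for illustration; VS Code's real engine uses Oniguruma regular expressions and supports much more (begin/end rules, captures, includes).

```typescript
interface MatchRule {
  name: string; // TextMate scope assigned to the matched text
  match: string; // regular expression source
}

interface TinyGrammar {
  scopeName: string;
  patterns: MatchRule[];
}

// Hypothetical language "abc": comments, three keywords, integers.
const grammar: TinyGrammar = {
  scopeName: "source.abc",
  patterns: [
    { name: "comment.line.number-sign.abc", match: "#.*$" },
    { name: "keyword.control.abc", match: "\\b(if|else|while)\\b" },
    { name: "constant.numeric.abc", match: "\\b\\d+\\b" },
  ],
};

interface ScopedToken {
  start: number;
  end: number;
  text: string;
  scope: string;
}

// Toy tokenizer: from the current position, take the earliest match among
// all rules (ties go to the rule listed first), then continue after it.
function tokenizeLine(line: string, g: TinyGrammar): ScopedToken[] {
  const tokens: ScopedToken[] = [];
  let pos = 0;
  while (pos < line.length) {
    let best: ScopedToken | undefined;
    for (const rule of g.patterns) {
      const re = new RegExp(rule.match, "g");
      re.lastIndex = pos;
      const m = re.exec(line);
      if (m && m[0].length > 0 && (best === undefined || m.index < best.start)) {
        best = { start: m.index, end: m.index + m[0].length, text: m[0], scope: rule.name };
      }
    }
    if (best === undefined) break; // nothing left to scope on this line
    tokens.push(best);
    pos = best.end;
  }
  return tokens;
}

const sampleTokens: ScopedToken[] = tokenizeLine("if x 42 # note", grammar);
```

Note that the identifier `x` gets no scope at all here: a purely regex-based grammar cannot know whether it is a parameter, a constant or a function, which is exactly the part a semantic token provider would fill in.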

RedCMD avatar Sep 11 '22 07:09 RedCMD

@RedCMD thanks for your reply and the link to your extension. I will definitely have a good look at it.

First steps to improve the documentation could be

  • to add simple code examples to the docs like MDN does
  • to add comments and (better) usage hints to the samples and keep them at least somewhat in sync with the docs.

As an example of my last request/wish: the semantic tokens sample. It is not clear to me whether it is only intended to work with languages that don't already provide semantic tokens, or whether I can extend the semantic tokens for e.g. TypeScript that way. And do I then have to provide only the new tokens, or do I always have to provide all of them.... It's a great feature, but not really usable without a lot of experimenting and guesswork.

I'd also like to add the link to the above-mentioned video here in the docs (Video) so that it can be linked more easily in the relevant doc pages.

iDad5 avatar Sep 11 '22 11:09 iDad5

In addition to improving documentation, it would be good to standardise TextMate scopes, to facilitate the development of grammars and themes.

xiaoxi-david avatar Sep 12 '22 17:09 xiaoxi-david

@iDad5 Thanks for reporting on your 'journey through color themes' and your findings.

Let me quickly try to reply to some of the questions you raised

the wiki-page [dated May 3, 2020] states that TypeScript and JavaScript are the only languages with built in support for semantic highlighting

From the built-in languages it's still just JavaScript, but most major language extensions (Python, C#, C++) have adopted semantic highlighting.

decorators as semantic tokens do not work in typescript even if enabled in the compiler

Yes, TypeScript still does not emit the special decorator token. There's an issue for that

(I tried with the mapping of "keyword.export": ["keyword.control.export.ts"] - no dice).

"keyword.export" would be a semantic token / modifier (not a TextMate scope), export is not a modifier that's standardized, and I don't think TypeScript uses that.

Plist is still supported. We'd rather not document this; things are complex enough. I know that's frustrating, but many things in TextMate land are undocumented and even we have to find things out by looking at how they are used.

I believe the semanticTokenColors sections with newOperator, stringLiteral... in the default themes were something the C++ team wanted. These tokens are non-standard, so please don't use them.

Semantic token parameter is mapped to textMate scope variable.parameter. In the default Dark theme the best match for that scope is the rule for variable, which has color #9CDCFE. See https://github.com/microsoft/vscode/blob/0dce868b85978716d4fc7bc48b6a6a54c6a39fd0/extensions/theme-defaults/themes/dark_plus.json#L90
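The lookup described in the last paragraph can be modelled roughly as follows. This is a simplified sketch and not VS Code's implementation: the fallback table contains only two entries (the `parameter` mapping is the one named above, the `function` entry is added for illustration), rule matching is reduced to "longest dot-separated prefix wins", and the theme is abbreviated to two rules. Only the `variable` color #9CDCFE is taken from the thread; the other color is a placeholder.

```typescript
interface TokenColorRule {
  scope: string | string[];
  settings: { foreground?: string };
}

interface MiniTheme {
  semanticHighlighting: boolean;
  semanticTokenColors: Record<string, string>;
  tokenColors: TokenColorRule[];
}

// Partial semantic-type -> TextMate-scope fallback table (illustrative subset).
const fallbackScopes: Record<string, string[]> = {
  parameter: ["variable.parameter"],
  function: ["entity.name.function"],
};

// Pick the rule whose scope is the longest dot-separated prefix of `scope`.
function findTextMateColor(scope: string, rules: TokenColorRule[]): string | undefined {
  let bestLength = -1;
  let bestColor: string | undefined;
  for (const rule of rules) {
    const selectors = typeof rule.scope === "string" ? [rule.scope] : rule.scope;
    for (const sel of selectors) {
      const matches = scope === sel || scope.indexOf(sel + ".") === 0;
      if (matches && sel.length > bestLength && rule.settings.foreground !== undefined) {
        bestLength = sel.length;
        bestColor = rule.settings.foreground;
      }
    }
  }
  return bestColor;
}

// 1. A semantic rule in the theme wins. 2. Otherwise fall back to the mapped
// TextMate scope and use the best matching tokenColors rule.
function resolveSemanticColor(tokenType: string, theme: MiniTheme): string | undefined {
  if (!theme.semanticHighlighting) return undefined; // semantic layer is off
  const direct = theme.semanticTokenColors[tokenType];
  if (direct !== undefined) return direct;
  for (const scope of fallbackScopes[tokenType] ?? []) {
    const color = findTextMateColor(scope, theme.tokenColors);
    if (color !== undefined) return color;
  }
  return undefined;
}

// Abbreviated stand-in for a dark theme without any semantic rules.
const darkLike: MiniTheme = {
  semanticHighlighting: true,
  semanticTokenColors: {},
  tokenColors: [
    { scope: "variable", settings: { foreground: "#9CDCFE" } },
    { scope: "entity.name.function", settings: { foreground: "#DCDCAA" } }, // placeholder color
  ],
};

const parameterColor = resolveSemanticColor("parameter", darkLike);
```

In this model the scope inspector's result makes sense: the token is semantic, but its color still comes from a `tokenColors` rule, because the theme defines no semantic rule for `parameter`.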

aeschli avatar Sep 13 '22 09:09 aeschli

Let me quickly try to reply to some of the questions you raised

Thank you!

"keyword.export" would be a semantic token / modifier (not a TextMate scope), export is not a modifier that's standardized, and I don't think TypeScript uses that.

Thx. After some more hours of deeper digging I'm getting closer to an understanding of the concept. By now - after my 12th rereading of the documentation - I realize that it is more correct than I felt after reading it the first 7 times. ;-) Still, I think that from the perspective of someone not deeply familiar with the topics at hand it's very easy to misunderstand. If you want, I would lend a dumb layman's eye to improving it for the likes of me.

I improved my experiments a lot, but still with no final success. I pieced together that I can (in theory) contribute semanticTokenTypes in package.json and contribute the fitting semanticTokenScopes. Through research around the web I'm rather positive that I would also need a language extension to actually implement those newly contributed token types. That seems doable for languages not already defined, but for JavaScript and TypeScript I cannot (yet) find a way to add new ones to the existing ones - and for now I'm under the impression that that's probably not intended?
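For reference, this is roughly the shape of those package.json contributions, written as a TypeScript object so the structure is checkable. The ids `templateType` and `native` and the scope mapping are illustrative names, not something TypeScript's language service emits. As far as I understand, these entries only declare token types and their TextMate fallbacks; some provider still has to emit the tokens.

```typescript
interface SemanticContributions {
  semanticTokenTypes: { id: string; superType?: string; description: string }[];
  semanticTokenModifiers: { id: string; description: string }[];
  semanticTokenScopes: { language?: string; scopes: Record<string, string[]> }[];
}

// Hypothetical ids and mappings, for illustration only.
const contributes: SemanticContributions = {
  // Declares a new token type; themes without a rule for it fall back to "type".
  semanticTokenTypes: [
    { id: "templateType", superType: "type", description: "A template type." },
  ],
  // Declares a new modifier that can be combined with any token type.
  semanticTokenModifiers: [
    { id: "native", description: "Annotates a symbol that is implemented natively." },
  ],
  // Maps semantic selectors to TextMate scopes, used when a theme has
  // semantic highlighting enabled but no rule for the selector.
  semanticTokenScopes: [
    {
      language: "typescript",
      scopes: { "property.readonly": ["variable.other.constant.property.ts"] },
    },
  ],
};

// What would go under "contributes" in package.json.
const packageJsonFragment: string = JSON.stringify({ contributes }, null, 2);
```

The `semanticTokenScopes` key is a semantic selector (left) and the values are TextMate scopes (right), which is the reverse of how I first read the example in the guide.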

Related sidenote: I'm somewhat unhappy with the variable.defaultLibrary semantic token. I do understand that this is due to the shortcomings of the historically grown JavaScript / ECMA specs. I do not have a good idea how to improve that, but I feel that variable.defaultLibrary literally covers too broad a scope to be useful in syntax highlighting, and my half-educated guess would be that this will also be the case in other languages.

iDad5 avatar Sep 13 '22 21:09 iDad5

vscode.languages.registerDocumentSemanticTokensProvider(DocumentSelector, DocumentSemanticTokensProvider, SemanticTokensLegend); is the function that handles tokenizing the document before passing it back into vscode https://github.com/microsoft/vscode/blob/03ada8a3e84ab3453f1efce83f373b42196b0f85/src/vscode-dts/vscode.d.ts#L12861

JavaScript and TypeScript I cannot (yet) find a way to add new ones to the existing ones

seems you cannot split/add tokens into tokens without reimplementing everything

I do not even understand what semanticTokenTypes and semanticTokenModifiers are used for in package.json. semanticTokenScopes seems to just remap (or add) certain scopes to others in a certain language.

the mapping of colours is done by the current theme https://github.com/microsoft/vscode/blob/03ada8a3e84ab3453f1efce83f373b42196b0f85/extensions/theme-defaults/themes/dark_plus.json#L194

and then there's also "colors" under "contributes", used by vscode.window.createTextEditorDecorationType()?

RedCMD avatar Sep 15 '22 10:09 RedCMD

@iDad5 @aeschli @RedCMD First of all my big "thanks" for your comments and explanations above. I'm really glad that I don't seem to be the only one struggling with VS Code Extensions. ;)

I'd like to chime in and tell a little bit about my struggles during the last week. As an experienced developer, and avid user of everything 'documentation-related' (docbook, tex/latex, metafont, rest, asciidoc, ...), I started to develop a "simple" syntax highlighting for the "Noweb" Literate Programming tool 7 days ago.

My first try was to use tmGrammars only, because this is what is shown everywhere. It's mainly featured in the online docs about VSCode Extensions and looks simple enough that it could be done as an "afternoon project". However, I got stuck soon and realized that for properly parsing Noweb syntax, I'd have to keep track of a parsing mode ("LaTeX" sections vs "code" sections) for some of the keywords. I also wanted to detect "undefined" keywords and support folding, so I needed more.

Based on the CWEB extension, I then switched to Typescript. I could now easily parse and highlight the Noweb keywords, but when I tried to delegate the work of highlighting all the LaTeX syntax to the internal grammars within VS Code, I started to "bang my head against a wall". There were no examples for this to be found, no mentioning of how to do this in the (API) docs...at least I didn't find them.

I desperately searched for an easy access to the internal LaTeX grammar, so that I could call it and get a list of tokens returned for my own extension.

I ended up cloning stuff like the VSCode and vscode-textmate sources, even trying out hacks like the one described in https://github.com/microsoft/vscode/issues/46281 in order to get at the internal grammar representation of the LaTeX parser. I also unpacked the ASAR archive once, in order to figure out one of the error messages I was getting.

All to no avail, the internal interfaces (TMRegistry -> Oniglib) have changed too much again since then. Eventually, I started to write my own (very simple) LaTeX parser in Typescript, using a bunch of simple RegExes. It was rubbish, of course, but I got some more highlighting than without it.

It was only after reading the comment by @iDad5 about how the "semantic layer" tokens always have precedence over the "syntax layer" tokens (from the TMGrammar files), and after a good night's rest, that I finally "got it".

I am now very close to publishing my Noweb Extension, which seems to work as I expect it...at least locally. I ended up not doing tmGrammars and not using a Typescript extension, but both! This may seem obvious in hindsight, but using both concepts in one extension simply didn't come to my mind at first, because I haven't seen this "in action" somewhere else (pointers anyone?).

In my extension I have a very simple tmGrammar, defining "text.tex.latex" as the basic scope for a Noweb document. The Noweb section is defined as "source.noweb" without any additional syntax highlighting. Then, in the Typescript part of my extension I can add the highlighting of the "special" Noweb keywords on top as I need it.
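The "semantic layer on top" part of such a setup could look roughly like this. It is a sketch and not the code of my extension: it assumes the usual Noweb notation (`<<name>>=` on its own line starts a code chunk, `<<name>>` refers to one), ignores escaped `@<<`, and does not distinguish documentation from code sections. It is written as a pure function so it runs outside VS Code; inside an extension, each result would be passed to `SemanticTokensBuilder.push(line, char, length, tokenType, tokenModifiers)` from a provider registered with `vscode.languages.registerDocumentSemanticTokensProvider`.

```typescript
interface NowebToken {
  line: number; // zero-based line number
  start: number; // zero-based column where the chunk name starts
  length: number; // length of the chunk name
  kind: "definition" | "reference";
  defined: boolean; // false for a reference to a chunk that is never defined
}

function findNowebTokens(text: string): NowebToken[] {
  const lines = text.split(/\r?\n/);
  const definition = /^<<(.+?)>>=\s*$/;

  // First pass: collect the names of all defined chunks.
  const known: Record<string, boolean> = {};
  for (const l of lines) {
    const m = definition.exec(l);
    if (m) known[m[1] ?? ""] = true;
  }

  // Second pass: emit one token per chunk name, flagging undefined references.
  const tokens: NowebToken[] = [];
  const chunkUse = /<<(.+?)>>/g;
  lines.forEach((l, line) => {
    chunkUse.lastIndex = 0;
    let m: RegExpExecArray | null;
    while ((m = chunkUse.exec(l)) !== null) {
      const name = m[1] ?? "";
      const isDefinition = m.index === 0 && definition.test(l);
      tokens.push({
        line,
        start: m.index + 2, // skip the leading "<<"
        length: name.length,
        kind: isDefinition ? "definition" : "reference",
        defined: isDefinition || known[name] === true,
      });
    }
  });
  return tokens;
}

// Hypothetical sample document.
const sample = [
  "@ Some \\LaTeX{} text.",
  "<<main>>=",
  "<<setup>>",
  "<<missing>>",
  "@",
  "<<setup>>=",
  "x = 1",
].join("\n");

const nowebTokens: NowebToken[] = findNowebTokens(sample);
```

Everything that is not a chunk name stays untouched, so the TextMate grammar keeps colouring the LaTeX parts, and the `defined` flag is what lets the provider highlight "undefined" chunk references differently.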


So, what could be improved? I guess what could've helped me to get on the right track a little sooner, would be an overview of the possible (canonical?) setups for handling "syntax highlighting" in VS Code Extensions:

  1. Textmate grammar files only.
  2. Programming semantic (and other) providers in a language like Typescript.
  3. Programming a full and separate Language Server.
  4. A mix of 1+2.
  5. Other approaches that I don't know about (yet)?

Each of these would come with a short description and maybe a discussion of its pros and cons. This should be one of the first sections in the documentation about syntax highlighting. From there, links further down into the "rabbit hole", to the individual topics which already exist, would be helpful.

Another thing I missed was a full hands-on example of an extension, that I could take as starting point for my own. Sure, there are a ton of examples for the single API features and all the different "Providers". But describing all the single bits in detail, doesn't tell a user how to combine all the "puzzle pieces" into a working and useful extension.

In the past, I have done something similar in the SCons project (which I'm still a part of, but not that active at the moment). This is a build system (like CMake, or even autotools), where the users can write their own extension for supporting build steps with new compilers and other tools.

The single commands and pieces like "Builder" and "Emitter" are completely described in the rather extensive SCons User Manual. But in the user mailing list, questions about how to write your own "build extension" wouldn't stop. And even worse, the number of badly written "builder extensions" grew over time. So I sat down and wrote a "quickstart" tutorial, named ToolsForFools.

It starts with the basic command line calls that you have to make for the compiler (JAL) in question. Then, the first basic builder "Tool" gets developed further and further, adding more functionality...and with that also more complexity. In a lot of places direct links into the additional documentation are given, so a user can (hopefully) follow along and decide "Nah, I'll stop here. That's all the functionality I need for now!" at any time.

dirkbaechle avatar Sep 18 '22 13:09 dirkbaechle

@dirkbaechle: Glad I could help you somehow. I obviously agree with you about the need for a better beginners' tutorial on syntax highlighting (and theming). Similar to the situation you describe, I have the feeling that there are too many themes out there that are just variations of some popular ones, and some of them are poorly done too. That is a pity when you look at the popularity of the project itself...

One thing I'd like to add to your "5 point plan": it would be very good to make clear the fundamental differences between extending syntaxes for languages that have built-in support and creating one for new languages. Some of the techniques only work for one or the other, and that's often unclear.

iDad5 avatar Sep 18 '22 16:09 iDad5

Will Tree-sitter microsoft/vscode#161256 simplify theming and syntax highlighting?

xiaoxi-david avatar Sep 19 '22 17:09 xiaoxi-david

I couldn't help but stumble over Tree-Sitter during my research journey - but I don't know really anything about it. What I've read in passing sounded interesting, but I also think that adding yet another tool/option isn't the real issue or solution.

My experience was more that I already have (more than) enough complexity; what I missed most was well-structured documentation or a tutorial with examples. Also, over time I found that both options, with their pros and cons, could be great as they are set up, but they don't play too well together yet. It actually seemed to me that the system was developed with the built-in themes in mind first and the community of 'theme developers' hardly at all.

I guess that Tree-sitter could help, probably a lot, but it won't in and of itself solve the other issues, I'd guess.

iDad5 avatar Sep 19 '22 17:09 iDad5

Tree-sitter is a parser similar to TextMate, but unlike TextMate, Tree-sitter's incremental updates are optimized based on the structure of the language rather than line by line. They both have their pros and cons, but generally Tree-sitter is much faster, and very hard to get working in vscode, as it's a C program, not JS.

RedCMD avatar Sep 24 '22 20:09 RedCMD

Thanks to everyone for chiming in. I think we should continue this in a discussion (https://github.com/microsoft/vscode-discussions/discussions) so other users can chime in, help, and benefit. I'm sorry that I can't reply to all the ideas and suggestions being brought up; there are too many.

aeschli avatar Dec 06 '22 17:12 aeschli

I'm moving this to vscode-docs to improve the docs based on the feedback

aeschli avatar Dec 06 '22 17:12 aeschli