Decide on syntax for Lua doc comments
We need a syntax that doesn't clash with standard Lua doc comments (ie. as used by lua-language-server).
These start with a triple dash (---) and can contain annotations (eg. @param).
--- @param name string
--- @paramz 1
--- test here
---
--- |link-to-thing|
Note that Neovim highlights anything that looks like an annotation, even if invalid (eg. @paramz). See also how lua-language-server looks at the contents of the comments and uses them to power its static analysis features (and to report problems via line diagnostics):
Also note how links (eg. |link-to-thing|) are highlighted, ~although I'm not sure why~... The lua-language-server docs say you can refer to other symbols using Markdown syntax:
---@alias MyCustomType integer
---Calculate a value using [my custom type](lua://MyCustomType)
function calculate(x) end
According to :InspectTree, this is just comment_content:
(comment ; [496, 2] - [496, 21]
content: (comment_content)) ; [496, 4] - [496, 21]
And, after pressing a (to reveal anonymous nodes) and I (to show language source):
(comment ; [496, 2] - [496, 21] lua
start: "--" ; [496, 2] - [496, 4] lua
content: (comment_content)) ; [496, 4] - [496, 21] lua
But :Inspect reveals more detail:
Treesitter
- @comment.lua links to Comment priority: 100 language: lua
- @spell.lua links to @spell priority: 100 language: lua
- @comment.documentation.lua links to Comment priority: 100 language: lua
Semantic Tokens
- @lsp.type.keyword.lua links to Keyword priority: 125
- @lsp.mod.documentation.lua links to @lsp priority: 126
- @lsp.typemod.keyword.documentation.lua links to Tag priority: 127
Anyway, in order not to clash, here are some valid comment syntaxes:
-- this is a valid lua comment, but only a single line one.
--[ also good, also single-line.
--[= still good, but also single-line.
--[=[ not good unless paired with ]=]
--[=[ but note...
it is multiline...
-- can precede every line with `--` if you want
]=]
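To make the distinctions above concrete, here is a minimal sketch in plain Lua that classifies a line by its comment opener. The function name and the category labels are made up for illustration, and it only looks at lines that start with a comment (it ignores trailing comments after code).

```lua
-- Classify the opener of a line that starts with a Lua comment.
-- The category names are hypothetical labels for this sketch only.
local function classify_comment(line)
  local text = line:match("^%s*(%-%-.*)$")
  if not text then
    return "not-a-comment"
  end
  if text:match("^%-%-%[%[%[") then
    return "doc-block"        -- `--[[[`: a long comment whose body starts with `[`
  end
  if text:match("^%-%-%[=*%[") then
    return "long-comment"     -- `--[[` or `--[=[`: multiline until `]]` / `]=]`
  end
  if text:match("^%-%-%-") then
    return "lls-doc-comment"  -- `---`: the form lua-language-server reads
  end
  return "line-comment"       -- `--`, `--[`, `--[=`: single line only
end

for _, line in ipairs({ "--- @param name string", "--[ also good", "--[=[ paired", "--[[[" }) do
  print(line, classify_comment(line))
end
```

The order of the checks matters: `--[[[` is also a valid `--[[` opener as far as Lua is concerned, so the more specific pattern has to be tested first.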
In the end, I think I like this the best:
--[[[
@mappings
# This is a heading
## A subheading...
### A sub-subheading
stuff in here...
#### More...
`ls` to see more...
```
console.log('code in here')
```
for (let i = 0; i < 10; i++) {
// hmmm..
console.log("crap here");
return false;
}
> A block quote... More block quotes?
- This is a list.
- More list...
**THIS IS A WARNING.** I _emphasize_ this...
--]]
-- The only thing I don't like 👆 is that you end with `--]]` and not `--]]]`,
-- so it's not symmetrical...
--[[[
@option g:FerretLoaded any
To prevent Ferret from being loaded, set |g:FerretLoaded| to any value in your
|.vimrc|. For example:
```
let g:FerretLoaded=1
```
--]]
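As a rough sketch of how a tool might pull these blocks out of a source file, the plain-Lua helper below collects the body of each `--[[[` ... `--]]` block along with its line numbers. `extract_doc_blocks` is a hypothetical name, not something docvim provides, and the sketch assumes the opener and closer each sit on a line of their own. Note that Lua itself ends the comment at the first `]]` it sees, so a body containing `]]` would need a different delimiter level.

```lua
-- Collect the bodies of `--[[[ ... --]]` blocks with 1-based line numbers.
-- Assumes the opener and the closer each occupy their own line.
local function extract_doc_blocks(source)
  if source:sub(-1) ~= "\n" then
    source = source .. "\n"
  end
  local blocks, current = {}, nil
  local lnum = 0
  for line in source:gmatch("(.-)\r?\n") do
    lnum = lnum + 1
    if current then
      if line:match("^%s*%-%-%]%]%s*$") then
        -- Closing `--]]`: finish the block.
        current.last = lnum
        blocks[#blocks + 1] = current
        current = nil
      else
        current.lines[#current.lines + 1] = line
      end
    elseif line:match("^%s*%-%-%[%[%[%s*$") then
      -- Opening `--[[[`: start a new block.
      current = { first = lnum, lines = {} }
    end
  end
  return blocks
end

local blocks = extract_doc_blocks("local x = 1\n--[[[\n@option g:FerretLoaded any\n--]]\nreturn x\n")
print(#blocks, blocks[1].first, blocks[1].last, blocks[1].lines[1])
```

Keeping the first and last line numbers around is what would later let a parser report locations relative to the original file rather than to the extracted body.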
I attempted to get this highlighting the embedded Markdown via injections, but my current attempt only highlights **bold** stuff:
Something to do with :help vim.hl.priorities:
vim.hl.priorities *vim.hl.priorities*
Table with default priorities used for highlighting:
• `syntax`: `50`, used for standard syntax highlighting
• `treesitter`: `100`, used for treesitter-based highlighting
• `semantic_tokens`: `125`, used for LSP semantic token highlighting
• `diagnostics`: `150`, used for code analysis such as diagnostics
• `user`: `200`, used for user-triggered highlights such as LSP document
symbols or `on_yank` autocommands
semantic_tokens defaults to 125, which means that you can see the Markdown highlighting (via Treesitter, which is priority 100) when you load the buffer, but once the LSP is ready it overrides that highlighting (marking the comment as a "semantic token") with a higher priority, and the Markdown highlighting goes away.
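Here is a simplified model of that layering in plain Lua. It assumes highlights are applied in ascending priority order and that a higher-priority layer only overrides the attributes it actually sets. The `resolve` helper and the attribute values are invented for illustration; only the priority numbers come from the help text above.

```lua
-- Default values quoted from `:help vim.hl.priorities`.
local priorities = { syntax = 50, treesitter = 100, semantic_tokens = 125 }

-- Merge highlight layers in ascending priority order (insertion order breaks
-- ties). A later layer overrides only the attributes it sets itself.
local function resolve(layers)
  local ordered = {}
  for i, layer in ipairs(layers) do
    ordered[i] = { index = i, layer = layer }
  end
  table.sort(ordered, function(a, b)
    if a.layer.priority ~= b.layer.priority then
      return a.layer.priority < b.layer.priority
    end
    return a.index < b.index
  end)
  local result = {}
  for _, entry in ipairs(ordered) do
    for attr, value in pairs(entry.layer.attrs) do
      result[attr] = value
    end
  end
  return result
end

-- Hypothetical attribute values; real ones depend on the colorscheme.
local resolved = resolve({
  { group = "@markup.heading.1.markdown", priority = priorities.treesitter,
    attrs = { fg = "title-colour", bold = true } },
  { group = "@lsp.type.comment.lua", priority = priorities.semantic_tokens,
    attrs = { fg = "comment-colour" } },
})
print(resolved.fg, resolved.bold)
```

Under this model the comment colour replaces the heading colour while `bold` survives, which may be why only the bold Markdown remained visible in my attempt. That is a guess, not something I have confirmed.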
I tried hacking it to see what would happen:
vim.hl.priorities.semantic_tokens = 95
It ends up showing more of the Markdown syntax even after the LSP has loaded, but it breaks the lua-language-server comments:
:InspectTree shows it being picked up as a heading:
(section ; [505, 0] - [546, 4]
(atx_heading ; [505, 0] - [506, 0]
(atx_h1_marker) ; [505, 0] - [505, 3]
heading_content: (inline ; [505, 4] - [505, 21]
(inline))) ; [505, 4] - [505, 21]
And with a and I:
(section ; [505, 0] - [546, 4] markdown
(atx_heading ; [505, 0] - [506, 0] markdown
(atx_h1_marker) ; [505, 0] - [505, 3] markdown
heading_content: (inline ; [505, 4] - [505, 21] markdown
(inline))) ; [505, 4] - [505, 21] markdown_inline
:Inspect:
Treesitter
- @comment.lua links to Comment priority: 100 language: lua
- @spell.lua links to @spell priority: 100 language: lua
- @markup.heading.1.markdown links to Title priority: 100 language: markdown
- @spell.markdown links to @spell priority: 100 language: markdown
Semantic Tokens
- @lsp.type.comment.lua links to Comment priority: 125
FWIW, did this experiment using this ~/.config/nvim/queries/lua/highlights.scm:
;; extends
; "@mappings" @keyword.mappings
; (#set! "priority" 200)
; [
; "@mappings"
; ] @keyword @nospell
and this ~/.config/nvim/queries/lua/injections.scm:
;; extends
; Inject markdown into multiline comments that start with `--[[[` (note the
; extra `[`):
(comment
content: (_) @injection.content
(#lua-match? @injection.content "^%[")
(#set! injection.language "markdown")
(#offset! @injection.content 0 1 0 0)
(#set! injection.combined)
(#set! injection.include-children))
; problem, once LSP kicks in, it sets @lsp.type.comment.lua, linking to Comment
; with Priority 125 (sigh) "Semantic Tokens"
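For reference, the effect of the two key lines in that query can be mimicked in plain Lua. This sketch assumes the parser hands over everything between `--[[` and `]]` as the comment content, so a `--[[[` block's content starts with the extra `[` and ends with the `--` from the closing `--]]`. `markdown_region` is a hypothetical helper, not a Neovim API.

```lua
-- Emulate the predicate and the offset directive from the injection query.
local function markdown_region(comment_content)
  -- Like (#lua-match? @injection.content "^%["): only accept content that
  -- starts with the extra `[`.
  if not comment_content:find("^%[") then
    return nil
  end
  -- Like (#offset! @injection.content 0 1 0 0): move the start one column
  -- to the right so the `[` itself is not handed to the Markdown parser.
  return comment_content:sub(2)
end

print(markdown_region("[\n# This is a heading\n--"))
print(markdown_region(" an ordinary block comment "))
```

If that assumption about the content holds, the trailing `--` is still part of the injected region, so the query would need a further end offset to drop it.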
I think I could make a custom LSP server that attaches to the buffer and provides its own semantic tokens (at least, Claude says you can do that, and that Neovim will merge all tokens additively), but that seems like boiling the ocean for something that should be doable much more simply...
Ok, so I've concluded that:
- I can't highlight stuff that the tree-sitter parser hasn't parsed into nodes.
- Tree-sitter won't look inside comments, so there are no internal nodes to highlight.
- Changing that would require creating a new parser, which would be a lot of work.
- Even if I do create such a parser, the LSP server's semantic tokens will still win because they have higher priority.
- Creating a custom LSP server may not be so much work, as I already have a Lua parser and a Markdown parser in the docvim project; I could spike out a prototype there.
- If I can add semantic tokens from that LSP server, I could also have it provide autocompletions etc that would make writing documentation easier.
- I think I'm going to give it a shot.
Good ol' Claude Code estimates:
⏺ 3-5 weeks of work. The project has excellent foundations - existing Lua and Markdown parsers with precise location tracking. Main tasks are adding LSP server boilerplate (tower-lsp), and building the comment extraction → Markdown parsing → semantic tokens pipeline.
The hard parsing work is already done. This would integrate cleanly with the existing architecture.
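The last step of that pipeline, turning located tokens into the flat integer array the LSP semantic tokens response expects, can be sketched in plain Lua. Each token becomes five integers: line delta, start-character delta (relative to the previous token only when on the same line), length, token type index, and a modifier bitset. The legend and the sample tokens below are hypothetical, and a real server built on tower-lsp would do this in Rust; this only shows the encoding.

```lua
-- Hypothetical legend; a real server advertises its own in its capabilities.
local legend = { "comment", "keyword", "string" }
local type_index = {}
for i, name in ipairs(legend) do
  type_index[name] = i - 1 -- LSP legend indices are zero-based
end

-- Encode tokens (zero-based line/start, sorted by position) into the
-- relative five-integers-per-token form used by semantic tokens.
local function encode_semantic_tokens(tokens)
  local data = {}
  local prev_line, prev_start = 0, 0
  for _, tok in ipairs(tokens) do
    local delta_line = tok.line - prev_line
    local delta_start
    if delta_line == 0 then
      delta_start = tok.start - prev_start
    else
      delta_start = tok.start
    end
    data[#data + 1] = delta_line
    data[#data + 1] = delta_start
    data[#data + 1] = tok.length
    data[#data + 1] = type_index[tok.type]
    data[#data + 1] = tok.modifiers or 0
    prev_line, prev_start = tok.line, tok.start
  end
  return data
end

local data = encode_semantic_tokens({
  { line = 2, start = 0, length = 9, type = "keyword" },  -- e.g. `@mappings`
  { line = 2, start = 10, length = 14, type = "string" },
  { line = 4, start = 2, length = 3, type = "comment" },
})
print(table.concat(data, ","))
```

The relative encoding is why the parsers' precise location tracking matters: every token position has to be mapped back to buffer coordinates before it can be diffed against the previous token.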