
Code highlighting / inspection / IDE tooling

Open patrickbkr opened this issue 1 year ago • 6 comments

We currently don't have a good solution for all of the below:

  • module documentation rendering (for raku.land, the rakudoc CLI app, ...)
  • a sound and maintainable LSP implementation
  • syntax highlighting

There are several projects out there that provide partial solutions:

  • raku --doc: Doesn't work with partially incorrect code (bad for LSP). Slow. Executes BEGIN blocks. Works for docs.raku.org, because we have control over the input. Bad idea for outside code like on raku.land.
  • Chroma syntax highlighter: The best OSS highlighter out there. Not written in Raku. Only a highlighter, no use for information extraction.
  • Comma: Exactly the parser we'd need. But it's not written in Raku and coupled to IntelliJ.
  • Raku Navigator: A greenfield LSP implementation. Works, the first bits are there, but nowhere near a full Raku parser.

Maybe related to https://github.com/Raku/problem-solving/issues/373.


P.S. Tree-sitter provides a really nice list of papers on how to build such parsers. I've started reading, but still have a few hundred pages ahead of me.

patrickbkr avatar Feb 27 '24 12:02 patrickbkr

Directions we could take:

  1. Port Chroma to Raku. Enables pure Raku tools that display Raku code. docs.raku.org and raku.land could benefit. (My yet-to-come Raku TUI debugger would benefit as well.)
  2. Implement a no-code-gets-executed mode in Raku. Not really doable, as without BEGIN blocks lots of code turns invalid (use, parser-modifying modules, e.g. monitor).
  3. Port the Comma parser to Raku. Can this be done? ISTR that the Comma parser was originally generated by a parser that parsed the rakudo grammar (except for the Rakudoc bits). Could that pipeline be reused?
  4. Copy and turn the Raku grammar into a recovering, non-compiling grammar. Is this feasible?
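To make "recovering" in option 4 concrete, here is a minimal sketch of a parser that records a bad statement as an error node and resynchronizes at the next statement terminator, instead of aborting. It uses a made-up toy language (`name = number;`), not Raku, and all names (`SyntaxNode`, `parseWithRecovery`) are hypothetical. It is a simplified form of panic-mode recovery and says nothing about how the Raku grammar would actually have to be changed.

```typescript
// Toy illustration only: the language and all names here are assumptions.
type SyntaxNode =
  | { type: "assign"; name: string; value: number }
  | { type: "error"; text: string };

// Parses statements of the form `name = number`, separated by ';'.
// A statement that does not parse becomes an error node, and parsing
// resumes at the next ';' so later statements are still recognized.
function parseWithRecovery(source: string): SyntaxNode[] {
  const nodes: SyntaxNode[] = [];
  for (const raw of source.split(";")) {
    const text = raw.trim();
    if (text === "") continue; // ignore empty statements
    const m = /^([A-Za-z_]\w*)\s*=\s*(\d+)$/.exec(text);
    const name = m?.[1];
    const value = m?.[2];
    if (name !== undefined && value !== undefined) {
      nodes.push({ type: "assign", name, value: Number(value) });
    } else {
      nodes.push({ type: "error", text }); // keep the bad span for diagnostics
    }
  }
  return nodes;
}
```

The point for an LSP is that one broken statement in the middle of a file still leaves usable nodes before and after it.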

I'm not in any way asking for people to do any of this, but would like to find out which approach we should pursue.

@jnthn, @japhb Can someone familiar with the Comma parser chime in?

patrickbkr avatar Feb 27 '24 12:02 patrickbkr

I think the Raku::Grammar will be a suitable base for things like syntax highlighting. After all untangling parsing, codegen and execution is a major goal of RakuAST. It would still be quite a bit of work to replace Raku::Actions with a non-executing backend, but at least it should be possible and not require super arcane knowledge.

niner avatar Feb 27 '24 12:02 niner

I'll add some notes here about the Raku Navigator, organized around your 3 points.

Module documentation

The Navigator includes module documentation rendering written in TypeScript that it uses for autocompletion and hover. This works by looking up the local version of the module, parsing the inline documentation and converting it to Markdown. It's very tolerant of errors and does not run BEGIN blocks, since it is coded entirely in TypeScript. You can also go-to-definition on the module itself.
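As a rough illustration of the approach (not the Navigator's actual code), the sketch below pulls `=begin pod` ... `=end pod` blocks out of a source string and turns `=headN` lines into Markdown headings. The function name `podToMarkdown` is hypothetical, and the sketch assumes only this small subset of Pod. Everything else in the file is treated as opaque text, so nothing is ever executed.

```typescript
// Hypothetical sketch: handles only =begin pod / =end pod and =headN.
function podToMarkdown(source: string): string {
  const out: string[] = [];
  let inPod = false;
  for (const line of source.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (/^=begin\s+pod\b/.test(trimmed)) { inPod = true; continue; }
    if (/^=end\s+pod\b/.test(trimmed)) { inPod = false; continue; }
    if (!inPod) continue; // code outside pod blocks is skipped, never run

    const head = /^=head(\d+)\s+(.*)$/.exec(trimmed);
    const level = head?.[1];
    const title = head?.[2];
    if (level !== undefined && title !== undefined) {
      // Markdown has at most six heading levels.
      out.push("######".slice(0, Math.min(Number(level), 6)) + " " + title);
      continue;
    }
    if (/^=/.test(trimmed)) continue; // unknown directives are dropped
    out.push(trimmed);
  }
  return out.join("\n").trim();
}
```

An unterminated pod block simply runs to the end of the file, which is the kind of tolerance an editor needs while the user is still typing.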

Autocompletion example: image

Hover example: image

LSP Implementation / parser

The Raku Navigator is a stable LSP, but far from full-featured. As for parsing, it uses the TextMate grammars to tokenize the content, and then uses regexes and heuristics to do shallow parsing (not quite an Abstract Syntax Tree). This is nice as it's very error tolerant, but it only focuses on the larger elements like classes and functions and does not fully parse Raku. It uses these elements for go-to-definition, outline view, autocompletion, and hover. I've considered switching to RakuAST, but my concern is that it may execute arbitrary code and won't be sufficiently error tolerant for an editor. The Navigator also runs something similar to raku -c, which lets it show users the quite detailed syntax errors from the Rakudo compiler. Overall, the LSP is useful, but ultimately missing quite a few features. The Perl Navigator is another LSP I maintain, and it has far more features.
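A minimal sketch of what such shallow parsing can look like, assuming a purely line-based regex pass (the real Navigator works on TextMate tokens first, which this sketch skips). The names `OutlineSymbol` and `shallowOutline` are hypothetical, and the declaration pattern covers only a handful of common forms.

```typescript
// Hypothetical sketch of a regex-based outline pass; not the Navigator's code.
interface OutlineSymbol {
  kind: string; // "class", "role", "grammar", "sub" or "method"
  name: string;
  line: number; // 1-based line number
}

function shallowOutline(source: string): OutlineSymbol[] {
  // Optional scope/multi prefixes, then a declarator keyword and a name.
  const decl =
    /^\s*(?:(?:my|our|unit|multi|proto)\s+)*(class|role|grammar|sub|method)\s+([\w'-]+(?:::[\w'-]+)*)/;
  const symbols: OutlineSymbol[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    if (/^\s*#/.test(text)) return; // skip single-line comments
    const m = decl.exec(text);
    const kind = m?.[1];
    const name = m?.[2];
    if (kind !== undefined && name !== undefined) {
      symbols.push({ kind, name, line: index + 1 });
    }
  });
  return symbols;
}
```

Because each line is matched on its own, an unbalanced brace or half-typed expression elsewhere in the file does not stop the outline from being produced. The trade-off is the one described above: only the larger declarations are found, and anything that needs real parsing (heredocs, slangs, declarations spanning lines) is out of reach.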


Syntax Highlighting

VS Code and GitHub both use TextMate grammars to perform syntax highlighting. The Raku Navigator overrides the default grammar with a better one from https://github.com/Raku/atom-language/ . It works very well and is also the basis of colorized bracket matching and of the Raku Navigator's shallow parser. Using the same basis for both syntax highlighting and shallow parsing is nice because any issues with the outline view also show up directly as syntax highlighting issues. VS Code does not support tree-sitter, so the TextMate grammar will still need to be maintained, although a tree-sitter grammar would also be great for all the editors that use it (e.g. helix).

bscan avatar Feb 27 '24 18:02 bscan

> I think the Raku::Grammar will be a suitable base for things like syntax highlighting. After all untangling parsing, codegen and execution is a major goal of RakuAST. It would still be quite a bit of work to replace Raku::Actions with a non-executing backend, but at least it should be possible and not require super arcane knowledge.

Suppose we decide that we want a non-executing backend for the RakuAST grammar. It might then make sense to put a bit of thought now into what changes would be needed to get something like this to work, and then make sure the work on RakuAST proceeds in a way that doesn't frustrate such a (future) effort. I imagine there are design decisions we make now that could make a big difference later, when work on a non-executing backend starts.

@lizmat, @nine, @ab5tract Is this sensible, or am I overthinking this?

patrickbkr avatar Mar 27 '24 21:03 patrickbkr

@patrickbkr I think it's both too late and too early for that. It's too late because RakuAST is not that far from the finish line. And it's too early because after 3 years it's still not finished. Implementing RakuAST is hard enough as it is, and progress is slow. Adding new design requirements that only make it harder does not sound enticing at all. I say let's finish RakuAST as it is and worry about non-executing backends later. RakuAST already makes it much easier to build such a backend, as the grammar really only deals with parsing and nothing else.

niner avatar Mar 28 '24 06:03 niner

@niner That sounds sensible. So be it!

patrickbkr avatar Mar 28 '24 11:03 patrickbkr