Numbat LSP
Hey @sharkdp,
Contributing to the Numbat standard was such a pain in the ass that I started working on an LSP.
Currently, it only outputs some errors, but it’s still very buggy; most of the code I wrote lives in numbat-lsp/src/main.rs. Almost everything else comes from this repo that I cloned.
https://github.com/user-attachments/assets/6831dff5-662b-42be-9755-a6d237ad5284
Adding support for completion and go-to-definition doesn't seem impossible. Maybe we could even push further and get the types to work.
I was wondering if this was the kind of stuff you would be willing to accept in the future?
Currently, I’m hitting this bug a lot: https://github.com/ebkalderon/tower-lsp/issues/413 I’m not sure, but maybe tower-lsp is not really production-ready, and we'll need to move away from it? 😩
Awesome stuff :smile:. LSP is definitely something I'd like to look into. I only had a very brief look at how this works, and it seems like you run numbat::Context::interpret* and then collect errors from there. Is this a sane approach to building a language server? What if there are side effects? I would imagine a language server only runs the frontend of the compiler (tokenizer-parser-typechecker), but not the backend (code generation and execution), where no errors can occur.
Just popping in because this is a topic I'm familiar with.
> Is this a sane approach to building a language server?
It's not too bad. Rust-Analyzer gets most of its diagnostics from running rustc upon saving the file. There are some diagnostics baked in though, and it does have a front-end and part of the middle end in it for semantic analysis. The semantic analysis it implements allows for "go-to definition", semantic token highlighting, completion, and "lightbulb" refactors, among other things.
So in reality having a front-end is pretty much required.
The front-end should have a lossless view of the syntax. In practice, this means having a concrete syntax tree rather than an abstract syntax tree, so that refactors don't clobber trivia. This is also great for formatters: rustfmt hits this limitation and has been considering switching to a CST.
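To make "lossless" concrete, here is a minimal sketch. It is a flat piece list rather than a real tree, and every name in it (`Piece`, `split_lossless`, `rename`, `to_source`) is made up for illustration, not taken from Numbat or rust-analyzer. The point is only that whitespace, parentheses, and comments are kept as data, so a rename can be written back without touching them.

```rust
/// Kinds of source pieces in this toy model (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Word,   // runs of alphanumerics and underscores
    Trivia, // runs of whitespace
    Other,  // any other single character (parens, operators, '#', ...)
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Piece {
    pub kind: Kind,
    pub text: String,
}

/// Splits `src` into pieces without dropping a single character.
pub fn split_lossless(src: &str) -> Vec<Piece> {
    let mut pieces: Vec<Piece> = Vec::new();
    for c in src.chars() {
        let kind = if c.is_alphanumeric() || c == '_' {
            Kind::Word
        } else if c.is_whitespace() {
            Kind::Trivia
        } else {
            Kind::Other
        };
        // Extend the previous piece if it is a run of the same kind.
        if let Some(last) = pieces.last_mut() {
            if last.kind == kind && kind != Kind::Other {
                last.text.push(c);
                continue;
            }
        }
        pieces.push(Piece { kind, text: c.to_string() });
    }
    pieces
}

/// A toy "refactor": rename every word equal to `from`, leaving trivia alone.
pub fn rename(pieces: &mut [Piece], from: &str, to: &str) {
    for p in pieces.iter_mut() {
        if p.kind == Kind::Word && p.text == from {
            p.text = to.to_string();
        }
    }
}

/// Concatenating the pieces reproduces the source exactly.
pub fn to_source(pieces: &[Piece]) -> String {
    pieces.iter().map(|p| p.text.as_str()).collect()
}
```

Splitting `let  x = ( x + 1 )`, renaming `x` to `width`, and calling `to_source` keeps the double space and the parentheses exactly where they were. A real CST adds nesting on top of this, but the round-trip property is the same.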
Other useful posts: https://rust-analyzer.github.io/blog/2020/09/28/how-to-make-a-light-bulb.html https://rust-analyzer.github.io/blog/2019/11/13/find-usages.html
> It's not too bad. Rust-Analyzer gets most of its diagnostics from running rustc upon saving the file.
OK, but the rustc backend doesn't execute the code. numbat::Context::interpret runs the whole compiler and then executes the result on the bytecode VM. This is what I meant by: "is this a sane approach …".
> So in reality having a front-end is pretty much required.
I have no issue with that. I would also imagine the Numbat LSP to reuse the tokenizer, parser, and semantic analysis stages (prefix handling, name resolution, type checker) of the compiler.
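To illustrate the split I have in mind, here is a toy sketch. It is deliberately not Numbat code: the one-statement language and the names `parse`, `diagnostics`, and `execute` are all invented for this example. A language server would only ever call the front-end function, so nothing with side effects runs while you type.

```rust
#[derive(Debug, PartialEq)]
pub struct Diagnostic {
    pub line: usize, // 1-based line number
    pub message: String,
}

/// The only statement in this toy language: `print <number>`.
#[derive(Debug, PartialEq)]
pub enum Stmt {
    Print(f64),
}

/// Front end: turns source into statements and collects every error it finds.
/// It never runs anything.
pub fn parse(src: &str) -> (Vec<Stmt>, Vec<Diagnostic>) {
    let mut stmts = Vec::new();
    let mut diags = Vec::new();
    for (i, line) in src.lines().enumerate() {
        let line_no = i + 1;
        let trimmed = line.trim();
        if trimmed.is_empty() {
            continue;
        }
        match trimmed.strip_prefix("print ") {
            Some(arg) => match arg.trim().parse::<f64>() {
                Ok(value) => stmts.push(Stmt::Print(value)),
                Err(_) => diags.push(Diagnostic {
                    line: line_no,
                    message: format!("expected a number, found `{}`", arg.trim()),
                }),
            },
            None => diags.push(Diagnostic {
                line: line_no,
                message: format!("unknown statement `{}`", trimmed),
            }),
        }
    }
    (stmts, diags)
}

/// What a language server would call: front end only, no execution.
pub fn diagnostics(src: &str) -> Vec<Diagnostic> {
    parse(src).1
}

/// Back end: the only place with side effects (here, appending to `out`).
pub fn execute(stmts: &[Stmt], out: &mut Vec<String>) {
    for stmt in stmts {
        match stmt {
            Stmt::Print(value) => out.push(value.to_string()),
        }
    }
}
```

In this model, `diagnostics("print 1\nprnt 2")` reports the typo on line 2 without ever reaching `execute`. In Numbat the front-end half would be the existing tokenizer, parser, and semantic analysis stages rather than this toy parser.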
> The front-end should have a lossless view of the syntax. In practice, this means having a concrete syntax tree rather than an abstract syntax tree, so that refactors don't clobber trivia. This is also great for formatters: rustfmt hits this limitation and has been considering switching to a CST.
That's something we definitely don't have at the moment. We lose all whitespace information (and things like parens in expressions, see also #102) during parsing.
I would definitely look into generating CST structures with ungrammar. I find that construction pretty valuable for a parser, because you don't throw away information that would be useful for LSP features and error reporting.
Another thing would be making the lexer infallible, similar to how your parser already is. The most common way of doing that is adding an error token that the lexer can emit. The type signature of the lexer would then become fn scan(&mut self) -> Vec<Token>. Adding a debug config that checks that all tokens are contiguous is also a good idea for testing. Here's an example of how I've done it in the past.
https://github.com/RossSmyth/meowfile/blob/2a89f23f1a34a27b3275de9d004113deed3f6932/crates/lex/src/lib.rs#L16-L19
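For readers who don't want to follow the link, the idea can be sketched roughly like this. This is a self-contained toy, not the linked code and not Numbat's tokenizer; `TokenKind`, `Token`, `Lexer`, and `is_contiguous` are illustrative names. Unknown input becomes an `Error` token instead of aborting the scan, and a debug-only assertion checks that the tokens tile the whole input.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Number,
    Ident,
    Whitespace,
    Error, // any character the lexer does not recognize
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub start: usize, // byte offset, inclusive
    pub end: usize,   // byte offset, exclusive
}

pub struct Lexer<'a> {
    src: &'a str,
    pos: usize,
}

impl<'a> Lexer<'a> {
    pub fn new(src: &'a str) -> Self {
        Lexer { src, pos: 0 }
    }

    /// Infallible: returns tokens for the whole input, never a `Result`.
    pub fn scan(&mut self) -> Vec<Token> {
        let mut tokens = Vec::new();
        while let Some(c) = self.src[self.pos..].chars().next() {
            let start = self.pos;
            let kind = if c.is_ascii_digit() {
                self.eat_while(|c| c.is_ascii_digit());
                TokenKind::Number
            } else if c.is_alphabetic() || c == '_' {
                self.eat_while(|c| c.is_alphanumeric() || c == '_');
                TokenKind::Ident
            } else if c.is_whitespace() {
                self.eat_while(char::is_whitespace);
                TokenKind::Whitespace
            } else {
                // Unknown character: consume it and report it as a token.
                self.pos += c.len_utf8();
                TokenKind::Error
            };
            tokens.push(Token { kind, start, end: self.pos });
        }
        // Checked in debug builds only; compiled out in release builds.
        debug_assert!(is_contiguous(&tokens, self.src.len()));
        tokens
    }

    /// Advances while the next character satisfies `pred`.
    fn eat_while(&mut self, pred: impl Fn(char) -> bool) {
        while let Some(c) = self.src[self.pos..].chars().next() {
            if !pred(c) {
                break;
            }
            self.pos += c.len_utf8();
        }
    }
}

/// True if the tokens are non-empty spans that cover `0..len` with no gaps or overlaps.
pub fn is_contiguous(tokens: &[Token], len: usize) -> bool {
    let mut expected = 0;
    for t in tokens {
        if t.start != expected || t.end <= t.start {
            return false;
        }
        expected = t.end;
    }
    expected == len
}
```

Because errors are ordinary tokens, the parser sees the full input and can decide how to recover, and the contiguity check catches any lexer bug that silently drops characters.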
It has been a while since any work has been done on this PR, so I considered opening my own, using JavaScript/TypeScript instead of Rust to hopefully get a more mature library. During my research, however, I found tower-lsp-server. It is a fork of the original tower-lsp, created after that project was abandoned. It is not yet available as a crate, but it might be what's needed for this PR to be able to continue.
My question now is whether anyone is interested in continuing with it. If this PR remains blocked, I'd like to open an alternative PR, but a Rust version would probably be preferable if anyone can get it to work.
Hey,
Yeah, you should definitely open a new PR. I can't find the time to work on the LSP myself, and I don't think this PR was going in the right direction anyway.
I would still like to keep it open in the meantime because it works (if sharkdp is OK with it), but once you have something up and running I would even be in favor of closing this one :+1: