Steps towards an ANTLR4 based LSP for vscode-nextflow
Warning: DON'T MERGE INTO MASTER - this needs to go into a feature branch ;)
This PR is going back to a chat I had with ~~Tomaso~~ Paolo (😄). It is purely to provide visibility of my current development effort towards an ANTLR4 based language server protocol implementation of nextflow.
The goal is to have state of the art code completion, variable referencing, type checking, linking to definition statements etc. which is still a bit to go.
Implemented:
- [x] support to autocomplete most configuration scope variables in configuration files
To be implemented:
- [ ] refactor
parseTreetraversal helper functions - [ ] support rudimentary type inference for assign statements in config file
- [ ] support
processscope - [ ] support
workflowscope - [ ] support import statements
- [ ] support documentation
- [ ] support channels
- [ ] support variables
- [ ] support params ...
@jemunro in case you are looking into something like this as well, lets bundle forces
Hi @uwwint , before you go too much further, I have written an ANTLR4 grammar for Nextflow script and config files. It looks like you already found the Groovy grammar. Let's find a way to add my Nextflow grammars so you don't end up re-inventing the wheel.
Hi @bentsherman, Perfect! Could you point me to the grammar? It's straight forward to integrate. I'll put some documentation here later today.
It's currently sitting in a private repo. I will see about making it public when I get back from holiday next week.
Awesome @bentsherman. Cannot wait
@uwwint I added my grammar specs for nextflow config files and scripts. I was working with a "tiny groovy" spec to keep things simple, but I think we can move forward with the full groovy spec for now. I haven't set up the Groovy-optimized fork of antlr4 so I can't test these grammars yet, so feel free to run with them.
Hi @bentsherman,
Thanks for the contribution, I have successfully rebased your grammar on the official groovy, split them out, and generated all the compiled code. I will proceed with modifying grammarAdapter to utilise them.
@uwwint I was able to build the grammar and run the grammar test. I decide to use the config files from nf-core/rnaseq because they should provide good coverage of the config language. I'm still getting a few weird errors, seemingly around newlines:
server/configs/rnaseq_base.config
server/configs/rnaseq_modules.config
line 843:91 token recognition error at: ',\n'
-1: message
server/configs/rnaseq_nextflow.config
line 265:114 token recognition error at: '\n'
line 266:0 token recognition error at: ' return obj\n'
line 267:0 token recognition error at: ' }\n'
line 268:0 token recognition error at: ' } else if (type == 'time') {\n'
line 269:0 token recognition error at: ' try {\n'
line 270:0 token recognition error at: ' if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)\n'
line 271:0 token recognition error at: ' return params.max_time as nextflow.util.Duration\n'
line 272:0 token recognition error at: ' else\n'
line 273:0 token recognition error at: ' return obj\n'
line 274:0 token recognition error at: ' } catch (all) {\n'
-1: message
server/configs/rnaseq_igenomes.config
server/configs/rnaseq_test.config
server/configs/rnaseq_test_full.config
The problem lies with the Groovy grammar, because you still get these errors if you just use the Groovy parser. I'm wondering if the code your ported to typescript in the grammar files need to be updated since I reverted the grammar from Groovy 4 to Groovy 3?
@uwwint / @bentsherman - any updates on this one? (pleasepleasepleaseplease)
I have concluded that a custom ANTLR grammar isn't the right approach. Even if we did get it to work, it would only produce a "parse" tree, which is just an intermediate step to an AST. I think we would have to rewrite much of the Groovy parser to get to a custom Nextflow AST. Uwe has been radio silent since December, so I'm not expecting him to come save the day anytime soon.
Instead, I think we will implement the LSP in Java/Groovy, so that we can reuse the Groovy parser and Nextflow AST transformations. It will have the same weird errors as Nextflow, but we've made a lot of improvements to error messages recently, so I feel more confident now that we can still provide a good user experience, even if we can't always identify the error down to the exact token.
The thing that clicked for me is the Nextflow DSL is all valid Groovy syntax, so we can just use the Groovy AST, perhaps with a few syntax transformations, to do everything we want to do with the LSP.
One more thing... Jordi started on a Java-based Nextflow LSP here: https://github.com/jordeu/nextflow-language-server
So we're already on to a good start!
Close in favor of #30