
Formal grammar and parser

Open bentsherman opened this issue 6 months ago • 3 comments

This PR adds a custom parser for Nextflow scripts and uses it instead of the Groovy parser. The Nextflow parser is generated from an ANTLR grammar, which currently contains a subset of Groovy syntax with some additional rules for processes, workflows, and include statements.

To bypass the Groovy parser, we invoke the GroovyShell with a placeholder script that simply wraps the actual script in a string expression. Then in an AST transform, we extract the string value, parse it with the Nextflow parser, and insert the resulting Groovy AST into the placeholder script.
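As a rough illustration of the placeholder trick, here is a minimal sketch in Python rather than Groovy, using Python's built-in `ast` module as a stand-in for the Groovy compiler. It is an analogy under stated assumptions, not the code in this PR: `wrap_in_placeholder`, `toy_parse`, `run_script`, and the `set <name> <int>` syntax are all hypothetical names and rules invented for the example.

```python
import ast


def wrap_in_placeholder(script_text: str) -> str:
    # repr() produces a valid string literal, so the host parser only ever
    # sees a single string expression, never the custom syntax inside it.
    return repr(script_text)


def toy_parse(script_text: str) -> list:
    # Hypothetical stand-in for the custom parser: each line of the form
    # "set <name> <int>" becomes a host-language assignment node.
    statements = []
    for line in script_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if len(parts) != 3 or parts[0] != "set":
            raise SyntaxError(f"unexpected input: {line!r}")
        _, name, value = parts
        statements.append(
            ast.Assign(
                targets=[ast.Name(id=name, ctx=ast.Store())],
                value=ast.Constant(value=int(value)),
            )
        )
    return statements


def run_script(script_text: str) -> dict:
    # 1. Let the host parser parse the placeholder (one string expression).
    placeholder = ast.parse(wrap_in_placeholder(script_text))
    # 2. "AST transform": pull the original script text back out of the literal.
    literal = placeholder.body[0].value
    # 3. Parse it with the custom parser and swap the result into the placeholder.
    placeholder.body = toy_parse(literal.value)
    ast.fix_missing_locations(placeholder)
    # 4. Hand the rewritten AST to the host compiler for execution.
    namespace = {}
    exec(compile(placeholder, "<script>", "exec"), namespace)
    return namespace


result = run_script("set answer 42\nset retries 3")
print(result["answer"], result["retries"])  # prints: 42 3
```

Note that `set answer 42` is not valid Python, yet it runs, because the host parser only sees the string literal and the host compiler only sees the AST built by the custom parser.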

This approach allows us to control the parsing process -- including the syntax and detecting syntax errors -- while still leveraging the Groovy compiler for execution. In other words, we can define whatever grammar we want, as long as we can "compile" it into a Groovy AST. If you look at AstBuilder, you'll see that it converts processes / workflows / includes into the same Groovy AST structures produced by NextflowDSLImpl.

The hack I'm using to make this work seems fine, but a more robust solution might be to use internal Groovy classes so that we can pass our AST directly to the Groovy compiler, instead of going through the GroovyShell and AST transforms. It will take time to work out which components we'd need to rip out, but the advantage is that we don't have to implement our own compiler backend.

I developed this code in a separate project and only just now incorporated it into Nextflow. I haven't tested it extensively, so there are likely some rough edges. I just wanted to finish a basic prototype before the holidays.

TODO:

  • [x] Figure out how to use the generated lexer/parser without copying it manually
  • [ ] Bring AstBuilder to parity with NextflowDSLImpl
  • [ ] Restore backwards compatibility
  • [ ] Pass unit tests and integration tests

bentsherman • Dec 21 '23 22:12