rnix-parser icon indicating copy to clipboard operation
rnix-parser copied to clipboard

Verifying parser correctness

Open darichey opened this issue 3 years ago • 7 comments

When making changes to the parser, I am always scared of regressions and properly handling edge cases in the Nix grammar. We want to ensure that rnix-parser parses Nix expressions correctly, but since Nix has no formal specification, the best we can ask in this regard is "does it do the same thing as the reference implementation?" I have built a tool to help us in answering that question: https://github.com/darichey/rnix-parser-tester.

There's a lengthy write-up if you're interested in the details, but the TLDR is you can feed it a Nix expression and it will tell you if rnix-parser and the reference implementation produced the same AST for that expression.

I have used it to verify the correctness of #96, #99, #101, #102, #103, and #106 by parsing all of nixpkgs. The tool provides a way to summarize the comparison results:

$ cargo run -- summary summary/f759fce.json summary/with-patches.json
...
== Summary ==
# equal before: 15924
# not equal before: 8135
# reference impl errors before: 1
# rnix-parser errors before: 1

# equal after: 24060
# not equal after: 0
# reference impl errors after: 1
# rnix-parser errors after: 0

# progressions: 8135
# regressions: 0

This tells us that before applying the patches, 8135 files in nixpkgs were parsed differently by rnix-parser. After applying the patches (and another one I'm still working on), all valid Nix files are parsed equivalently. (There are several caveats to these results which I cover here)

I wanted to make other contributors aware of the tool in case they would find it useful, and also ask if there would be any interest in somehow incorporating it into rnix-parser's testing (probably through CI). Thanks!

darichey avatar Jul 22 '22 19:07 darichey

I wonder if the new typed api would allow you to get rid of your custom ast type. The only difference would be that you would need lots of unwraps, but you already do those unwraps anyway when constructing the custom ast

oberblastmeister avatar Jul 22 '22 21:07 oberblastmeister

Very nice, thank you! :)

Ma27 avatar Jul 23 '22 06:07 Ma27

By the way @darichey please rebase those prs so I can merge them!

oberblastmeister avatar Jul 23 '22 20:07 oberblastmeister

Maybe a silly question but: why don't we just expose the Nix parser as a library and use that in rnix-parser?

ncfavier avatar Jul 27 '22 06:07 ncfavier

Do you mean as opposed to re-implementing the parser as has been done in rnix-parser? The primary reason (I would assume) is the desire for a fault-tolerant parser. It's very difficult to build nice tooling, especially editor tooling like rnix-lsp, around a parser which bails at any sign of trouble (which the reference Nix parser does, while rnix-parser does not)

darichey avatar Jul 27 '22 07:07 darichey

I wonder if the new typed api would allow you to get rid of your custom ast type. The only difference would be that you would need lots of unwraps, but you already do those unwraps anyway when constructing the custom ast

Yeah, the new typed api is much nicer to use, but really has the same main usability problem for this use case: I'm only interested in grammatically-valid Nix, so all the Options throughout the tree are just noise. I thought about trying to augment the nodes macros to also generate an error-less AST, but I don't know if it would work or if that use case is common enough to warrant any changes here (I doubt it).

darichey avatar Jul 29 '22 00:07 darichey

It's a bit unfortunate that rust doesn't allow GATs yet, which could make the "maybe we want Options" ergonomic, and without code duplication (macro-based code duplication still incurs a compile-time penalty)

fogti avatar Jul 29 '22 23:07 fogti