gll
gll copied to clipboard
Couple "error on ambiguity" with a fuzzer to generate ambiguous inputs.
Because of how GLL control-flow maps all the way down to machine code, it should be possible for a program-trace-aware fuzzer to hit enough codepaths to cause an ambiguity through its crafted input.
(suggested by @nikomatsakis)
Alternatively, we could try to transform the grammar to reveal ambiguities more easily - e.g. turn A | B | C into (A & B) | (B & C) | (A & C) and aggressively optimize it.
That sort of grammar might not need the SPPF, making it more efficient, if it "succeeds if ambiguous" (not "iff" because that's harder/impossible to obtain in general).
Just for completeness, the other, maybe easier, option is just to exhaustively enumerate all "sentences" in the grammar up to some maximum length. My hypothesis is that grammatical ambiguities don't usually take more than a few characters to show up -- at least for programming languages -- so if you enumerate everything up to some reasonable length, you can be fairly confident you've found them all.
For future reference: @nikomatsakis's idea seems similar to what what I had in mind in https://github.com/AltSysrq/proptest/issues/61 save for the exhaustiveness (as PBT tools like proptest and Haskell's QuickCheck usually randomize tests).