pigeon icon indicating copy to clipboard operation
pigeon copied to clipboard

Build failure only when --optimize-grammar is set; grammar with not referenced rule

Open breml opened this issue 5 years ago • 5 comments

The following grammar results in build failure, if generated with --optimize-grammar. The same grammar builds with out problem, if the option is not set.

Minimal example:

X ← z:Z
Y ← Z
Z ← ('Z' { return nil, nil })?

Build error:

(*parser).callonZ3 undefined (type *parser has no method callonZ3)

breml avatar Jul 17 '18 20:07 breml

Hello Lucas,

I ran into the same issue and created this repo with what I had as minimal reproducible grammar (yours is smaller, though): https://github.com/mna/pigeon-bug-optimize-grammar-2. I tested it with your fix in the pending PR and it does work, is there any side-effect preventing the fix from being merged?

Thanks, Martin

mna avatar May 09 '19 15:05 mna

Hi @mna If I remember correctly, the reason why I did not yet merge this PR is, that it only solves part of the problem. I had cases where the optimize still failed. That being said, we can merge this one and at least improve the situation, even though there are still failing (edge) cases.

breml avatar May 10 '19 15:05 breml

Ah, yeah indeed I tried it on my full grammar and it does fail elsewhere with this fix. Might be worth holding up, as it might break grammars that are otherwise working today (for those that didn't run into the issue that it solves). I'll try to find some time to investigate further next week.

Thanks, Martin

mna avatar May 10 '19 17:05 mna

I found the old commit and my notes about the bigger problem I found when I wrote the PR #71.

The thing is, that the added search algorithm in ast_optimize.go is not able to find unnecessary rules, if these rules do form a cycle, but the entire rules cycle is no longer reachable from one of the entry points. Such cycles can be produced by the grammar optimizer. I found such a case in the go-thrift grammer when I tried to use the optimize option. I reduced the grammar to the following minimal example, extracted out of the optimized grammar from go-thrift:

{
package issue70_cycle
}

TypeDef ← annotations:TypeAnnotations?

FieldType ← SubType

SubType <- typ:FieldType annotations:TypeAnnotations? 

TypeAnnotations ← annotations:TypeAnnotation*

TypeAnnotation ← value:(value:'X' { return value, nil })?

So the bottom line is, that the search algorithm needs to be extended with something like this:

  • start from the protected rules (entrypoints)
  • follow all the rule uses rules and mark the used rules
  • all the not marked rules can be deleted

breml avatar May 17 '19 19:05 breml

Optimizer also needs to detect that rules can be dropped, if they are only referred to by itself (closed cycles), see comment.

breml avatar May 20 '19 15:05 breml