antlr4 icon indicating copy to clipboard operation
antlr4 copied to clipboard

Antlr4 generates invalid go parser

Open Rini-Tikki opened this issue 2 years ago • 8 comments

I am trying to generate lexer, parser, listener and all standard files in Golang for the following grammar: https://github.com/trinodb/trino/blob/master/core/trino-parser/src/main/antlr4/io/trino/sql/parser/SqlBase.g4

Environment Go version: 1.19 Antlr runtime: I tried all the releases from 4.9 to 4.12, and generated parser is always in some error state like : Unresolved type 'ParserRuleContext' Duplicate method 'GetStart' Cannot use 'localctx' (type IFrameExtentContext) as the type ParserRuleContextType does not implement 'ParserRuleContext' as some methods are missing:SetStart(Token) Impossible type assertion: '*FrameExtentContext' does not implement 'IFrameExtentContext'

And we cannot modify the auto-generated files it seems to fix the issue.

Can someone take a look into this?

Many thanks in advance

Rini-Tikki avatar Mar 01 '23 12:03 Rini-Tikki

For the CSharp target, this does not work because of symbol conflict: https://github.com/trinodb/trino/blob/75bc783419482f30c103cf74510498ff480e82fc/core/trino-parser/src/main/antlr4/io/trino/sql/parser/SqlBase.g4#L522

All "base" names in the grammar must be renamed to something else.

The grammar contains the lexer symbol "EMPTY", which is a well-known symbol conflict for CSharp.

But, yes, for the Go target, the generated code is wrong. ParserRuleContext is not qualified via package.

type IPredicateContext interface {
	antlr.ParserRuleContext

	// GetParser returns the parser.
	GetParser() antlr.Parser

	// GetValue returns the value attribute.
	GetValue() ParserRuleContext

	// SetValue sets the value attribute.
	SetValue(ParserRuleContext)

	// IsPredicateContext differentiates from other interfaces.
	IsPredicateContext()
}

This is indeed wrong code. Uses should be antlr.ParserRuleContext. There are other occurrences as well in the generated code.

kaby76 avatar Mar 01 '23 13:03 kaby76

@kaby76 Thank you for looking into it. Is there any fix for this?

Rini-Tikki avatar Mar 01 '23 16:03 Rini-Tikki

I'll work on a PR fix later today.

kaby76 avatar Mar 01 '23 16:03 kaby76

Actually, this requires modifications that I'm not the best to do. The grammar has several problems:

  • a "symbol conflict" with "start" for Go;
  • a "symbol conflict" with "base" for CSharp;
  • uses a grammar symbol attribute that references ParserRuleContext.

If you can work around the problem for now, that would be best. Here are the changes to the grammar, which seems to allow a generated driver for Go (via trgen) to build.

$ !diff
diff SqlBase.g4 Generated-Go/parser/
469c469
< predicate[ParserRuleContext value]
---
> predicate[antlr.ParserRuleContext value]
718,723c718,723
<     : frameType=RANGE start=frameBound
<     | frameType=ROWS start=frameBound
<     | frameType=GROUPS start=frameBound
<     | frameType=RANGE BETWEEN start=frameBound AND end=frameBound
<     | frameType=ROWS BETWEEN start=frameBound AND end=frameBound
<     | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound
---
>     : frameType=RANGE start_=frameBound
>     | frameType=ROWS start_=frameBound
>     | frameType=GROUPS start_=frameBound
>     | frameType=RANGE BETWEEN start_=frameBound AND end=frameBound
>     | frameType=ROWS BETWEEN start_=frameBound AND end=frameBound
>     | frameType=GROUPS BETWEEN start_=frameBound AND end=frameBound

kaby76 avatar Mar 01 '23 17:03 kaby76

The workaround works for now. Thanks!

Rini-Tikki avatar Mar 01 '23 17:03 Rini-Tikki

I am happy to look at this guys, however it might take me a day or so to get some time on it. As I have never come across this error, it is almost certainly the name of the labels and not that there code generation is wrong, or I would have seen that by now. But let me take a look...

jimidle avatar Mar 02 '23 03:03 jimidle

OK, so the problem isn't that start is a keyword in Go, but that using start as a label, causes the generation of a method for the rule that is using start called GetStart. However the RuleContext defined for that rule embeds ParserRuleContext, which also has a GetStart() method, but which causes a covariant definition, which is not allowed in Go and hence the compile error.

In fact, it would be better not to use getters in Go, but that damage is now done - I have to assume that someone somewhere has used the existing GetStart() method, so I cannot make that change (I inherited this of course). I can add start (and anything else that would cause such a clash to the list of keywords that need a translation and that would indeed convert it to start_, which solves the issue, though somewhat unsatisfactorily. I will try and make time to do this as I also want to make an addition in that exact area to be able to access rule names., which is not currently possible in the generated Go code. I could do it generically, but then existing code using something like GetS(), when the label is s would have to change to use Get_S()

In general, I avoid using real words for labels for these kinds of reasons. Using labels such as s and e is idiomatic for Go, and I tend to do that. I hope that helps.

jimidle avatar Mar 02 '23 03:03 jimidle

@jimidle it is still not working for the mentioned grammar.

Rini-Tikki avatar Apr 15 '24 07:04 Rini-Tikki