llvm Ambiguity in grammar for parsing alignment attributes, string attributes

The grammar contains an ambiguity when parsing the alignment attribute of a global variable. More specifically, the alignment attribute of a global variable may be interpreted either as a GlobalAttr or as a FuncAttr. Since both the list of global attributes and the list of function attributes may be empty, this leads to a shift/reduce ambiguity in the parser.

From the ll.tm EBNF grammar:

GlobalDecl -> GlobalDecl
	: Name=GlobalIdent '=' ExternLinkage Preemptionopt Visibilityopt DLLStorageClassopt ThreadLocalopt UnnamedAddropt AddrSpaceopt ExternallyInitializedopt Immutable ContentType=Type (',' Section)? (',' Comdat)? (',' Align)? Metadata=(',' MetadataAttachment)+? FuncAttrs=(',' FuncAttribute)+?
;

FuncAttribute -> FuncAttribute
	: AttrString
	| AttrPair
	# not used in attribute groups.
	| AttrGroupID
	# used in functions.
	#| Align # NOTE: removed to resolve reduce/reduce conflict, see above.
	# used in attribute groups.
	| AlignPair
	| AlignStack
	| AlignStackPair
	| AllocSize
	| FuncAttr
;

Specifically, the end of the GlobalDecl production is of interest: (',' Align)? Metadata=(',' MetadataAttachment)+? FuncAttrs=(',' FuncAttribute)+?

Given that there are no metadata attachments, the alignment attribute (align 8) of the following LLVM IR:

@a = global i32 42, align 8

may be reduced either to a global attribute (i.e. Align before MetadataAttachment) or to a function attribute (i.e. FuncAttribute after MetadataAttachment).
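To make the two readings concrete, here is a minimal Go sketch that counts the derivations of a declaration tail under a simplified version of the rule above. It is only an illustration: the names Kind, AlignItem, MetadataItem, AttrItem and countParses are hypothetical and are not part of ll.tm or of llir/llvm, and the tail is assumed to be already split into comma-separated items.

```go
package main

import "fmt"

// Kind classifies one comma-separated item in the tail of a global
// declaration. These names are hypothetical, for illustration only.
type Kind int

const (
	AlignItem    Kind = iota // e.g. align 8
	MetadataItem             // e.g. !dbg !0
	AttrItem                 // any other function attribute
)

// countParses returns the number of distinct ways to derive items from the
// simplified tail
//
//	(',' Align)? (',' MetadataAttachment)* (',' FuncAttribute)*
//
// slot is the earliest grammar position still open: 0 = Align,
// 1 = metadata attachments, 2 = function attributes. alignIsFuncAttr
// controls whether Align is also an alternative of FuncAttribute.
func countParses(items []Kind, slot int, alignIsFuncAttr bool) int {
	if len(items) == 0 {
		return 1
	}
	head, rest := items[0], items[1:]
	n := 0
	// Reading 1: the item is the optional global Align.
	if slot == 0 && head == AlignItem {
		n += countParses(rest, 1, alignIsFuncAttr)
	}
	// The item is a metadata attachment.
	if slot <= 1 && head == MetadataItem {
		n += countParses(rest, 1, alignIsFuncAttr)
	}
	// Reading 2: the item is a function attribute.
	if head == AttrItem || (head == AlignItem && alignIsFuncAttr) {
		n += countParses(rest, 2, alignIsFuncAttr)
	}
	return n
}

func main() {
	tail := []Kind{AlignItem} // the tail of: @a = global i32 42, align 8
	fmt.Println(countParses(tail, 0, true))  // prints 2: ambiguous
	fmt.Println(countParses(tail, 0, false)) // prints 1: Align removed from FuncAttribute
}
```

With Align listed under FuncAttribute the tail has two derivations. With that alternative commented out, as in the grammar excerpt above, only one remains.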

The solution employed by the C++ parser is the opposite of maximal munch: it will try to reduce rather than shift whenever possible.
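One deterministic way to get that behaviour in a hand-written parser is to commit each item to the earliest grammar position that accepts it. The Go sketch below illustrates that policy under stated assumptions. It is not a transcription of the C++ parser: the function classify and its string-prefix checks are hypothetical, and the tail is assumed to be already split into comma-separated items.

```go
package main

import (
	"fmt"
	"strings"
)

// classify labels each tail item of a global declaration with the grammar
// position it is committed to, always choosing the earliest position that
// accepts the item. Hypothetical helper, for illustration only.
func classify(items []string) ([]string, error) {
	const (
		slotAlign = iota
		slotMetadata
		slotFuncAttr
	)
	slot := slotAlign
	var out []string
	for _, item := range items {
		item = strings.TrimSpace(item)
		switch {
		case item == "":
			return nil, fmt.Errorf("empty item")
		case strings.HasPrefix(item, "align ") && slot == slotAlign:
			// Commit to the global Align as soon as it is possible.
			out = append(out, "Align: "+item)
			slot = slotMetadata
		case strings.HasPrefix(item, "!") && slot <= slotMetadata:
			out = append(out, "MetadataAttachment: "+item)
			slot = slotMetadata
		case strings.HasPrefix(item, "!"):
			return nil, fmt.Errorf("metadata attachment %q after function attributes", item)
		default:
			// Everything else, including a late align, is a function attribute.
			out = append(out, "FuncAttribute: "+item)
			slot = slotFuncAttr
		}
	}
	return out, nil
}

func main() {
	// The tail of: @a = global i32 42, align 8
	fmt.Println(classify([]string{"align 8"}))
	// An align that follows a metadata attachment can only be a function attribute.
	fmt.Println(classify([]string{"!dbg !0", "align 8"}))
}
```

Under this policy the align 8 of @a is always read as the global Align, so the second reading never arises for that input.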

Nov 22 '18 01:11 mewmew

From https://github.com/llir/llvm/issues/111#issuecomment-562429501

Grammar related to Function String Attribute

Failing test cases that are likely related to the Function String Attribute grammar:

llvm/test/Bitcode/attributes.ll
- syntax error at line 266
llvm/test/Transforms/Inline/inline-varargs.ll
- syntax error at line 6

align attribute

align used in call instruction

llvm/test/Analysis/ValueTracking/memory-dereferenceable.ll
- syntax error at line 153

align used in return attribute

llvm/test/Transforms/InstCombine/assume-redundant.ll
- syntax error at line 50
llvm/test/Transforms/LoopSimplify/unreachable-loop-pred.ll
- syntax error at line 25

I don't know how to update the grammar to handle the ambiguities related to align used in return attributes, function attributes, etc. The same goes for string attributes and key-value attributes. The approach taken now is to simply allow the most common cases of these, and then (unfortunately) fail when we can't resolve the ambiguous grammar. I wish the grammar of LLVM IR was LR(1), but that does not seem to be the case.

If anyone knows of a clean approach to handle this, you are warmly invited to share your thoughts. We'd very much appreciate it, seeing as this annoying issue has yet to find a clean solution.

Cheers, Robin

Dec 16 '19 05:12 mewmew

Have we considered ANTLR? So many of the issues you opened just cannot be fixed, so maybe we should switch to a more stable parser generator? Anyway, it can't be more painful.

Oct 20 '22 00:10 dannypsnl

@dannypsnl, haha, yeah I know, there are some pains with using Textmapper. I do think, however, that this is true of every parser generator.

That being said, feel free to do a proof of concept :)

I think every parser generator has pros and cons. So if ANTLR turns out to solve more issues than it creates, it may be worth it. However, it should be noted that this is a ton of work, so expect to put in at least two full-time weeks before reaching feature parity. If you still feel like working on it, then definitely, go for it!

Cheers, Robin

Oct 20 '22 21:10 mewmew