Ambiguity in grammar for parsing alignment attributes, string attributes
The grammar contains an ambiguity when parsing global variable alignment attributes. More specifically, an alignment attribute of a global variable may be interpreted either as a GlobalAttr or as a FuncAttr. Since both the list of global attributes and the list of function attributes may be empty, this leads to a shift/reduce conflict in the parser.
From the ll.tm EBNF grammar:
GlobalDecl -> GlobalDecl
: Name=GlobalIdent '=' ExternLinkage Preemptionopt Visibilityopt DLLStorageClassopt ThreadLocalopt UnnamedAddropt AddrSpaceopt ExternallyInitializedopt Immutable ContentType=Type (',' Section)? (',' Comdat)? (',' Align)? Metadata=(',' MetadataAttachment)+? FuncAttrs=(',' FuncAttribute)+?
;
FuncAttribute -> FuncAttribute
: AttrString
| AttrPair
# not used in attribute groups.
| AttrGroupID
# used in functions.
#| Align # NOTE: removed to resolve reduce/reduce conflict, see above.
# used in attribute groups.
| AlignPair
| AlignStack
| AlignStackPair
| AllocSize
| FuncAttr
;
Specifically, the end of the GlobalDecl production is of interest: (',' Align)? Metadata=(',' MetadataAttachment)+? FuncAttrs=(',' FuncAttribute)+?
Given that there are no metadata attachments, the alignment attribute (align 8) of the following LLVM IR:
@a = global i32 42, align 8
may be reduced either to a global attribute (i.e. Align before MetadataAttachment) or to a function attribute (i.e. FuncAttribute after MetadataAttachment).
The solution employed by the C++ parser is the opposite of maximal munch, as it will try to reduce rather than shift when possible.
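To make the idea concrete, here is a minimal Go sketch of resolving the ambiguity by committing `align N` to the global's own alignment slot as soon as it is seen, so that it never reaches the function attribute list. This is an illustration only, not the llir/llvm parser or the LLVM C++ parser. The names `globalTail` and `parseGlobalTail` are hypothetical, and the sketch assumes the comma-separated tail from the grammar above, with no item containing a comma itself.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// globalTail holds what follows the initializer of a global definition.
// Hypothetical type for illustration; not part of llir/llvm.
type globalTail struct {
	Align     int      // 0 when no alignment was given
	Metadata  []string // items starting with '!', e.g. "!dbg !0"
	FuncAttrs []string // everything else, e.g. `"key"="value"`
}

// parseGlobalTail parses the comma-separated items after the initializer,
// e.g. the ", align 8" in "@a = global i32 42, align 8".
// Assumption: no item contains a comma (true for this sketch's inputs only).
func parseGlobalTail(tail string) (globalTail, error) {
	var g globalTail
	for _, item := range strings.Split(tail, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue // text before the first comma, or an empty tail
		}
		fields := strings.Fields(item)
		switch {
		case fields[0] == "align":
			// Commit immediately: "align N" always fills the global's
			// alignment and is never treated as a function attribute.
			if len(fields) != 2 {
				return g, fmt.Errorf("malformed alignment %q", item)
			}
			if g.Align != 0 {
				return g, fmt.Errorf("duplicate alignment %q", item)
			}
			n, err := strconv.Atoi(fields[1])
			if err != nil || n <= 0 {
				return g, fmt.Errorf("invalid alignment %q", item)
			}
			g.Align = n
		case strings.HasPrefix(item, "!"):
			g.Metadata = append(g.Metadata, item)
		default:
			g.FuncAttrs = append(g.FuncAttrs, item)
		}
	}
	return g, nil
}

func main() {
	g, err := parseGlobalTail(`, align 8, !dbg !0, "key"="value"`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("align=%d metadata=%q funcattrs=%q\n", g.Align, g.Metadata, g.FuncAttrs)
}
```

The trade-off is the same one made by commenting out Align in FuncAttribute: the conflict goes away, but an `align` that was genuinely meant as a function attribute can no longer be expressed in this position.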
From https://github.com/llir/llvm/issues/111#issuecomment-562429501
Grammar related to Function String Attribute
Test cases failing, likely related to the Function String Attribute grammar:
align attribute
align used in call instruction
llvm/test/Analysis/ValueTracking/memory-dereferenceable.ll: syntax error at line 153
align used in return attribute
llvm/test/Transforms/InstCombine/assume-redundant.ll: syntax error at line 50
llvm/test/Transforms/LoopSimplify/unreachable-loop-pred.ll: syntax error at line 25
I don't know how to update the grammar to handle ambiguities related to align used in return attributes, function attributes, etc. The same goes for string attributes and key-value attributes. The approach taken now is to simply allow the most common cases of these, and then (unfortunately) fail when we can't resolve the ambiguous grammar. I wish the grammar of LLVM IR were LR(1), but that does not seem to be the case.
If anyone knows of a clean approach to handle this, you are warmly invited to share your thoughts. We'd very much appreciate it, seeing as this annoying issue has yet to find a clean solution.
Cheers, Robin
Should we consider ANTLR? So many of the issues you opened just cannot get fixed, so maybe we should switch to a more stable parser generator? Anyway, it can't be more painful.
@dannypsnl, haha, yeah I know, there are some pains with using Textmapper. I do think, however, that this is true for every parser generator.
That being said, feel free to do a proof of concept :)
I think every parser generator has pros and cons. So if ANTLR turns out to solve more issues than it creates, it may be worth it. However, it should be noted that this is a ton of work, so expect to put in at least two full-time weeks before reaching feature parity. If you still feel like working on it, then definitely, go for it!
Cheers, Robin