pyclibrary C identifiers with 2 leading underscores fail to parse correctly

Identifiers with 2 leading underscores fail to parse correctly.

Library versions:

pyparsing 2.4.6
pyclibrary 0.1.7

Examples:

int fail(unsigned __flags);
int success(unsigned _flags);

struct success_s { struct { int aaa; } _success; };

struct fail_s { struct { int bbb; } __fail; } fail_t;

struct success2_s { int _ccc; };

struct fail2_s { int __ddd; };

Parsed results from above example (removed macros and empty sections):

============== types ==================
{ 'struct anon_struct0': Type('struct', 'anon_struct0'),
  'struct anon_struct1': Type('struct', 'anon_struct1'),
  'struct anon_struct2': Type('struct', 'anon_struct2'),
  'struct anon_struct3': Type('struct', 'anon_struct3'),
  'struct anon_struct4': Type('struct', 'anon_struct4'),
  'struct anon_struct5': Type('struct', 'anon_struct5'),
  'struct fail2_s': Type('struct', 'fail2_s'),
  'struct fail_s': Type('struct', 'fail_s'),
  'struct success2_s': Type('struct', 'success2_s'),
  'struct success_s': Type('struct', 'success_s')}
============== variables ==================
{'bbb': (None, Type('int'))}
============== structs ==================
{ 'anon_struct0': Struct(('aaa', Type('int'), None)),
  'anon_struct1': Struct(('bbb', Type('int'), None)),
  'anon_struct2': Struct(('bbb', Type('int'), None)),
  'anon_struct3': Struct(('aaa', Type('int'), None)),
  'anon_struct4': Struct(('bbb', Type('int'), None)),
  'anon_struct5': Struct(('bbb', Type('int'), None)),
  'fail2_s': Struct(),
  'fail_s': Struct(),
  'success2_s': Struct(('_ccc', Type('int'), None)),
  'success_s': Struct(('_success', Type('struct anon_struct3'), None))}
============== functions ==================
{ 'fail': Type(Type('int'), ((None, Type('unsigned', type_quals=(('__',),)), None),)),
  'success': Type(Type('int'), (('_flags', Type('unsigned'), None),))}
============== values ==================
{'bbb': None}

Failed parsing from above examples:

struct fail_s { struct { } __fail; }
struct fail2_s { int __ddd; };

Sep 09 '21 14:09 puredrivel

Looking quickly, it appears the parser believes it is dealing with a function qualifier rather than an identifier. Honestly, I won't have the bandwidth to look in more detail soon. However, if you can put together a PR, I will do my best to review it. Otherwise, feel free to ping me again in a couple of weeks; it may be better then.

Sep 09 '21 14:09 MatthieuDartiailh

I think you foresaw this issue...

# Removes '__name' from all type specs. may cause trouble.
underscore_2_ident = (WordStart(wordchars) + ~keyword + '__' +
                      Word(alphanums, alphanums+"_$") +
                      WordEnd(wordchars)).setParseAction(lambda t: t[0])
type_qualifier = ZeroOrMore((underscore_2_ident + Optional(nestedExpr())) |
                            kwl(qualifiers))

Removing this fixes the issue. I'm assuming it's here for a reason, though. If you have a second to comment on the logic behind this, I'll see if I can work on a fix based on its intended purpose.
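To illustrate why the rule above swallows the parameter name, here is a minimal stdlib-only sketch. It is not pyclibrary's code: the regex and the helper `split_param_greedy` are hypothetical stand-ins that only approximate `underscore_2_ident` (the keyword exclusion is left out), and the "type, qualifiers, name" split is a deliberate simplification of the real grammar.

```python
import re

# Hypothetical regex approximation of underscore_2_ident: "__", then one
# alphanumeric character, then any run of alphanumerics, "_" or "$".
UNDERSCORE_2_IDENT = re.compile(r"__[A-Za-z0-9][A-Za-z0-9_$]*")


def split_param_greedy(decl):
    """Split 'type [qualifiers] [name]', consuming every '__name' word
    as a type qualifier before looking for the parameter name."""
    words = decl.split()
    base, rest = words[0], words[1:]
    quals = []
    while rest and UNDERSCORE_2_IDENT.fullmatch(rest[0]):
        quals.append(rest.pop(0))
    # Whatever is left (if anything) is taken as the parameter name.
    name = rest[0] if rest else None
    return base, quals, name


print(split_param_greedy("unsigned __flags"))
print(split_param_greedy("unsigned _flags"))
```

In this sketch `__flags` ends up in the qualifier list and the name comes back as `None`, which mirrors the `(None, Type('unsigned', type_quals=...), None)` entry reported for `fail`, while `_flags` survives as the name.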

Sep 09 '21 20:09 puredrivel

Actually, the parser largely predates my taking over the maintenance of the project. However, I believe this is here to handle the kind of things discussed in https://stackoverflow.com/questions/1449181/what-does-double-underscore-const-mean-in-c, for example. Having read that, in what context did you find a double-underscore name in a header you need to use?
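If the rule exists to absorb compiler-specific spellings such as `__const`, one possible direction (purely a sketch, not something pyclibrary implements as far as this thread shows) would be to treat only a known set of double-underscore words as qualifiers. The set `KNOWN_QUALIFIERS` and the helper `split_param_whitelist` below are hypothetical; the listed spellings are GCC-style alternates and the list is illustrative, not exhaustive.

```python
# Illustrative, non-exhaustive set of GCC-style alternate qualifier spellings.
KNOWN_QUALIFIERS = {
    "__const", "__const__",
    "__restrict", "__restrict__",
    "__volatile", "__volatile__",
}


def split_param_whitelist(decl):
    """Split 'type [qualifiers] [name]', treating a '__name' word as a
    qualifier only when it appears in KNOWN_QUALIFIERS."""
    words = decl.split()
    base, rest = words[0], words[1:]
    quals = []
    while rest and rest[0] in KNOWN_QUALIFIERS:
        quals.append(rest.pop(0))
    # Any other word, including '__flags', is kept as the name.
    name = rest[0] if rest else None
    return base, quals, name


print(split_param_whitelist("unsigned __flags"))
print(split_param_whitelist("char __const c"))
```

The trade-off is that unknown vendor qualifiers would then be read as identifiers, so the set would need to be maintained or made configurable.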

Sep 10 '21 15:09 MatthieuDartiailh
