zig
zig copied to clipboard
cImport / translate-c picks the wrong C type when parsing integer literals
Zig Version
0.12.0-dev.3686+3adfaf91f
Steps to Reproduce and Observed Behavior
When importing this C header:
const loader = @cImport(@cInclude("mach-o/loader.h"));
using loader.LC_LOAD_WEAK_DYLIB
leads to this compile error:
cimport.zig:5499:51: error: type 'c_int' cannot represent integer value '2147483648'
pub const LC_LOAD_WEAK_DYLIB = @as(c_int, 0x18) | LC_REQ_DYLD;
where LC_REQ_DYLD
is originally #define LC_REQ_DYLD 0x80000000
which doesn't fit in a 32-bit signed integer.
The manual translation of this constant in macho.zig in the Zig std library is just a comptime int, and that + the other constants work fine. However, the macho.zig file is incomplete so I need to use the header directly, which exposes this issue.
translate-c issue
translate_c#parseCNumLit
just picks c_int
for unsuffixed integer literals, which leads to an error if the literal doesn't fit. It seems like the handling of suffixed literals also don't pick a larger type if necessary.
I believe this code should follow C std 6.4.4.1p5, which specifies the following:
The type of an integer constant is the first of the corresponding list in which its value can be represented.
Suffix | Decimal Constant | Octal, Hexadecimal or Binary (since C23) Constant |
---|---|---|
none | int long int long long int |
int unsigned int long int unsigned long int long long int unsigned long long int |
u or U | unsigned int unsigned long int unsigned long long int |
unsigned int unsigned long int unsigned long long int |
l or L | long int long long int |
long int unsigned long int long long int unsigned long long int |
Both u or U and l or L |
unsigned long int unsigned long long int |
unsigned long int unsigned long long int |
ll or LL | long long int | long long int unsigned long long int |
Both u or U and ll or LL |
unsigned long long int | unsigned long long int |
Note: If an integer constant cannot be represented by any type in its list, it may have an extended integer type, if the extended integer type can represent its value. If all of the types in the list for the constant are signed, the extended integer type shall be signed. If all of the types in the list for the constant are unsigned, the extended integer type shall be unsigned. If the list contains both signed and unsigned types, the extended integer type may be signed or unsigned.
I implemented this in cParseCNumLit
, and LC_REQ_DYLD
now becomes a c_uint
instead of c_int
. However, this isn't enough since the constants that derive their value from LC_REQ_DYLD
are still c_int
- thus you get the same compiler error. The type conversion must somehow propagate.
Related c_translation.zig issue and problematic fix?
See #11409 for a similar problem. This refers to a comment where yet another similar problem received a fix in #12695
This fix adds c_ulonglong
to the list of signed candidate types. This seems to be at odds with C std 6.4.4.1p5 which, for instance, requires the extension type to be signed for unsuffixed signed decimal constants, per the table note above. Could i128 be the implementation-defined extension type in this case? This is also the type used when parsing literals.
Expected Behavior
Correct type usage per the C standard and no Zig compiler error on importing valid C headers
This fix adds c_ulonglong to the list of signed candidate types. This seems to be at odds with C std 6.4.4.1p5 which, for instance, requires the extension type to be signed for unsuffixed signed decimal constants, per the table note above.
clang and gcc differ here - GCC follows the standard and promotes to __int128; clang converts it to unsigned long long. Since translate-c is based on clang we follow the clang behavior: https://godbolt.org/z/W75c6WGjY
Currently aro follows the clang behavior but we should update it to be able to do either depending on what compiler is being emulated.
clang and gcc differ here - GCC follows the standard and promotes to __int128; clang converts it to unsigned long long. Since translate-c is based on clang we follow the clang behavior: https://godbolt.org/z/W75c6WGjY
Currently aro follows the clang behavior but we should update it to be able to do either depending on what compiler is being emulated.
Thanks for the insights. IMO, Zig should definitely not perpetuate behavior like converting unsuffixed decimal constants to ULL. Moreover, it seems that translate-c doesn't promote to a larger type at all in parseCNumLit
(though I'm not that familiar with the code base so could be missing something), while c_translate goes the non-compliant clang route in promoteIntLiteral
.