
How to avoid bindgen being out of sync with cc?

Open Sympatron opened this issue 1 year ago • 4 comments

My understanding is that it is very common to wrap a C library in a Rust *-sys crate by using bindgen to automatically generate Rust bindings and cc to compile the C code.

As far as I understand it, bindgen uses libclang to parse the C headers and cc uses whatever compiler the user provides.

This can lead to problems, because the two toolchains don't have to agree on everything. While cross-compiling for thumbv6m-none-eabi on Windows, bindgen generated u32 for enums, whereas cc used u8 where possible (-fshort-enums).
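To make the consequence concrete, here is a minimal sketch of how the two enum representations change the layout of a struct that contains the enum. The type names (`TestAsBindgen`, `TestAsShort`, `MessageAsBindgen`, `MessageAsShort`) are hypothetical and only model the two views of the same C declaration; they are not bindgen output.

```rust
use std::mem::size_of;

// Rust-side view when the enum is assumed to be 4 bytes (what bindgen emitted).
#[repr(u32)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
enum TestAsBindgen {
    Hello = 1,
}

// C-side view when the compiler uses -fshort-enums and picks 1 byte.
#[repr(u8)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
enum TestAsShort {
    Hello = 1,
}

// The same hypothetical C struct, laid out under each assumption.
#[repr(C)]
#[allow(dead_code)]
struct MessageAsBindgen {
    kind: TestAsBindgen,
    flag: u8,
}

#[repr(C)]
#[allow(dead_code)]
struct MessageAsShort {
    kind: TestAsShort,
    flag: u8,
}

fn main() {
    // The two sides disagree on the total size and on where `flag` lives,
    // so reads and writes across the FFI boundary hit the wrong bytes.
    println!("bindgen view: {} bytes", size_of::<MessageAsBindgen>());
    println!("short-enum view: {} bytes", size_of::<MessageAsShort>());
}
```

Nothing in either toolchain reports this disagreement on its own, which is why it is hard to catch.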

This is quite unfortunate and hard to catch. Is there a common workaround for this?

PS: If there is a better place for this issue please tell me.

Sympatron avatar Oct 25 '24 15:10 Sympatron

The easiest thing I can think of is to set the relevant environment variables for clang-sys and use the same paths when calling Build::compiler.

The docs for Build::compiler also say that the compiler is automatically detected from a number of environment variables, so setting CC or similar to be consistent with the clang-sys environment variables might work as well.
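A related approach, sketched here under assumptions, is to keep the layout-affecting flags in one place and hand the same list to both tools. The helper `layout_flags` and the `LAYOUT_FLAGS` list are hypothetical and deliberately incomplete. The sketch assumes a `build.rs` in which `cc` reads `CFLAGS` itself and the filtered flags are forwarded to bindgen via `bindgen::Builder::clang_args`.

```rust
/// Compiler flags that change C type layout or signedness.
/// Illustrative only: this list is an assumption and is not exhaustive.
const LAYOUT_FLAGS: &[&str] = &[
    "-fshort-enums",
    "-fno-short-enums",
    "-fshort-wchar",
    "-funsigned-char",
    "-fsigned-char",
];

/// Hypothetical helper: pick the layout-affecting flags out of a
/// CFLAGS-style, whitespace-separated string.
fn layout_flags(cflags: &str) -> Vec<String> {
    cflags
        .split_whitespace()
        .filter(|flag| LAYOUT_FLAGS.contains(flag))
        .map(str::to_string)
        .collect()
}

fn main() {
    // An unset CFLAGS is treated as empty.
    let cflags = std::env::var("CFLAGS").unwrap_or_default();
    let shared = layout_flags(&cflags);

    // Assumption: in a real build.rs with `bindgen` as a build-dependency,
    // `shared` would be passed to `bindgen::Builder::clang_args(&shared)`,
    // so libclang parses the headers with the same layout flags the C
    // compiler receives.
    println!("flags to forward to bindgen: {:?}", shared);
}
```

This only helps when the flags are given explicitly. A default that differs between compilers, such as the enum size on bare-metal ARM, still has to be spelled out by hand.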

pvdrz avatar Oct 25 '24 16:10 pvdrz

I also ran into the problem recently that I had generated bindings that were incorrect because of enum size differences; similarly on a thumbX-none-eabi project, although in my case I wasn't using cc but instead linking with separately compiled code as a separate step from the cargo build.

I had this idea for a "this would've prevented my mistake" feature: when bindgen encounters an enum, if it detects that the target platform is one where common C compilers disagree on the size of enums[^1] and you haven't manually specified -fshort-enums or -fno-short-enums, it could issue a warning to double-check that the bindings are correct.

A warning seemed like the most useful UX here because, at least for my use case, bindgen couldn't know that I was planning to link with code compiled by gcc, but the warning would've prompted me to figure out that the bindings were wrong before I learned it the hard way.
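The proposed check could look roughly like the sketch below. It is not an existing bindgen feature: the function name `should_warn_about_enum_size` and the target test (a substring match on `-none-eabi`) are assumptions made for illustration.

```rust
/// Hypothetical check: returns true when the target is one where C compilers
/// are known to disagree on enum size and the caller has not made an
/// explicit choice via clang arguments.
fn should_warn_about_enum_size(target: &str, clang_args: &[&str]) -> bool {
    // Assumption: bare-metal ARM EABI triples are the ambiguous case.
    let ambiguous_target = target.contains("-none-eabi");

    // Either flag counts as a deliberate decision by the user.
    let explicit_choice = clang_args
        .iter()
        .any(|arg| *arg == "-fshort-enums" || *arg == "-fno-short-enums");

    ambiguous_target && !explicit_choice
}

fn main() {
    let target = "thumbv6m-none-eabi";
    if should_warn_about_enum_size(target, &[]) {
        eprintln!(
            "warning: enum sizes on {target} depend on the C compiler; \
             pass -fshort-enums or -fno-short-enums explicitly"
        );
    }
}
```

Treating either flag as an explicit choice keeps the warning silent for users who have already decided, whichever way they decided.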

[^1]: likely most critically: -none-eabi arm platforms

geeklint avatar Jan 16 '25 02:01 geeklint

> likely most critically: -none-eabi arm platforms

ARM defers the enum ABI choice to the platform ABI, but of course *-none-* has no platform ABI. As @geeklint discovered (see: godbolt & Rust GameDev Discord Discussion), GCC chooses a 1-byte repr and Clang chooses a 4-byte repr for a simple:

typedef enum { Hello = 1 } Test;

Solutions that have run through my mind include:

  1. Making bindgen (optionally?) generate a bindings.cpp to feed to cc full of static_asserts validating alignment/size/signedness/???. Probably the best option for catching differences between bindgen and cc, although it doesn't necessarily do anything to fix caught issues. I've done similar for manually generated FFI and it's helped.

  2. For enums specifically, making a miserable pile of cc-driven types and using that to implement bindings.rs instead of core::ffi::c_int etc. - I implemented the first 90% of this as abienum, although the build.rs probably needs to export metadata to support #[repr(u32)] enum Test { ... } style enums. The second 90%, actually modifying bindgen to use something like abienum, is left as an exercise to the reader. The third 90% has yet to be identified. This also doesn't help if you don't specify ${CC} to tell cc to use GCC when linking prebuilt libs that were built with GCC.

  3. Trying to upstream core::ffi::c_enum_* somehow. This would require making rustc aware of ${CC} / ${CFLAGS} / ???, since compilers for the "same" target disagree on layout, which seems like something that would have a lot of pushback (although I could be wrong.)

  4. Taint generated enums such that they're improper_ctypes somehow, at least on unknown/none style platforms.
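The first option can be approximated by hand today. The sketch below generates the assertion file from a `build.rs`-style helper; it emits C11 `_Static_assert` rather than the C++ `static_assert` mentioned in the list, and the names `Expected` and `render_static_asserts` are hypothetical. It assumes the expected sizes come from the Rust bindings and that the generated file is then compiled by the same C compiler as the library (for example via `cc::Build::file`).

```rust
/// What the Rust bindings believe about one C type (hypothetical struct).
struct Expected<'a> {
    c_type: &'a str,
    size: usize,
    align: usize,
}

/// Render a C source file that fails to compile if the C compiler's layout
/// differs from what the bindings assume.
fn render_static_asserts(header: &str, types: &[Expected<'_>]) -> String {
    let mut out = format!("#include \"{}\"\n", header);
    for t in types {
        out.push_str(&format!(
            "_Static_assert(sizeof({0}) == {1}, \"size of {0} differs from bindings\");\n",
            t.c_type, t.size
        ));
        out.push_str(&format!(
            "_Static_assert(_Alignof({0}) == {1}, \"alignment of {0} differs from bindings\");\n",
            t.c_type, t.align
        ));
    }
    out
}

fn main() {
    // Assumption: in practice these numbers would be taken from the generated
    // bindings (e.g. std::mem::size_of on the bound type), not hard-coded.
    let checks = [Expected { c_type: "Test", size: 4, align: 4 }];
    print!("{}", render_static_asserts("wrapper.h", &checks));
}
```

As the list item notes, this catches the mismatch at build time but does not fix it; a failure still has to be resolved by aligning the flags or the bindings.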

MaulingMonkey avatar Jan 19 '25 15:01 MaulingMonkey

I just finished debugging an extremely naughty issue caused by GCC generating an enum of size 2 while bindgen generated one of size 4. The behavior isn't consistent either: in the same build, GCC picked size 4 for relatively similar enums. -fno-short-enums is a solution too.
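One possible explanation for the differing sizes, assuming -fshort-enums was in effect, is that GCC documents the flag as choosing the smallest integer type with enough room for the declared range of values, so the size depends on each enum's enumerators. The comment does not say which values were involved, so the following is only a simplified model; `short_enum_size` is a hypothetical helper, not a compiler API.

```rust
/// Simplified model of -fshort-enums: the size in bytes of the smallest
/// integer type that can hold every value in `min..=max`.
/// Assumption: unsigned types are used when no enumerator is negative.
fn short_enum_size(min: i64, max: i64) -> usize {
    if min >= 0 {
        if max <= u8::MAX as i64 {
            1
        } else if max <= u16::MAX as i64 {
            2
        } else if max <= u32::MAX as i64 {
            4
        } else {
            8
        }
    } else if min >= i8::MIN as i64 && max <= i8::MAX as i64 {
        1
    } else if min >= i16::MIN as i64 && max <= i16::MAX as i64 {
        2
    } else if min >= i32::MIN as i64 && max <= i32::MAX as i64 {
        4
    } else {
        8
    }
}

fn main() {
    // Two enums that look alike can land in different size classes purely
    // because of their largest enumerator.
    println!("range 0..=300   -> {} bytes", short_enum_size(0, 300));
    println!("range 0..=70000 -> {} bytes", short_enum_size(0, 70_000));
}
```

Under this model, adding or changing a single enumerator can move an enum between size classes, whereas bindings generated with a fixed 4-byte assumption stay the same.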

That suggestion with static asserts compiled with GCC would've helped and prevented the issue.

Ddystopia avatar Nov 04 '25 14:11 Ddystopia