rust-bindgen icon indicating copy to clipboard operation
rust-bindgen copied to clipboard

Replacing native C types (Windows & Mac vs Linux)

Open Wodann opened this issue 5 years ago • 2 comments

Bindgen nicely replaces occurrences of native C types with Rust types. For example:

#include <stdint.h>

typedef struct
{
    const uint8_t b[16];
} MunGuid;

results in:

#[repr(C)]
#[derive(Clone, Copy)]
pub struct Guid {
    pub b: [u8; 16usize],
}

It works. Lovely! One problem, once we run CI, it turns out that Ubuntu also generates this additional line

pub type __uint8_t = ::std::os::raw::c_uchar;

Even though bindgen is able to deduce that uint8_t is supposed to be u8, it still adds the __uint8_t type to the generated bindings.

One possible solution that I came up with to resolved this is to blacklist the __uint8_t type, but then bindgen stops automatically generating the #[derive(Clone, Copy)] attributes.

Is there a way to prevent the generation of "replaced" types?

Wodann avatar Nov 01 '19 15:11 Wodann

Other code in standard library (or user code, really) may actually be using __uint8_t type.

I would have expected this to work if you run this with --whitelist-type MunGuid, but it doesn't and it also outputs the __uint8_t type.

So I think the main issue is that the edge from uint8_t to __uint8_t is accounted for, even though bindgen manually avoids that edge during code generation.

So basically something like this line:

https://github.com/rust-lang/rust-bindgen/blob/79b2b1042069ca6673afc706fca2207d715f6894/src/ir/ty.rs#L1233

May be able to check its name and avoid tracing the inner type in that case. I don't know how that works perf-wise, would need some measuring.

emilio avatar Nov 03 '19 10:11 emilio

I got most of the way through the new issue template before noticing this seems to be the same issue I was experiencing. So I'm pasting that here. Hopefully the reduced testcase is useful to someone.

Input C/C++ Header

example.h

typedef unsigned int __uint32_t;
typedef __uint32_t uint32_t;

typedef struct Foo { uint32_t id; } Foo;

Bindgen Invocation

$ bindgen example.h --blocklist-type '^__.*'

Actual Results

/* automatically generated by rust-bindgen 0.59.2 */

#[repr(C)]
pub struct Foo {
    pub id: u32,
}
// Layout test elided

Expected Results

I expected the struct to have Clone, and Copy derived. If I run bindgen example.h I get the following:

/* automatically generated by rust-bindgen 0.59.2 */

pub type __uint32_t = ::std::os::raw::c_uint;
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Foo {
    pub id: u32,
}

These typedefs are based on what was in stdint.h on a Linux system

typedef unsigned int __uint32_t;
typedef __uint32_t uint32_t;

This is related to https://github.com/rust-lang/rust-bindgen/issues/1215 but following one of the suggestions there to blocklist the undesired types causes Copy to not be derived, so I need to choose between having non-Copy types or an extra unneeded type.

Ryan1729 avatar Mar 08 '22 06:03 Ryan1729

#2287 made me realize about all this and I wonder if we could fix this from the native types side.

What I mean is that we could remove all this logic about special-casing stdint.h names from codegen and use the size reported by clang for each native integer type to match them to their rust counterpart. If c_int has size 32 according to clang, then we use i32. I would put this behind a flag because people interested in maintaining a certain level of portability would still want to keep the c_* types in their generated bindings.

We could also remove redundant aliases from inside the codegen::postprocessing module based on some special cases like u32 being aliased as uint32_t or any of its mangled versions.

so at the end of the day, bindgen would process these headers

typedef unsigned char uint64_t;
typedef unsigned int uint32_t;

uint64_t foo();
uint32_t bar();

and, with the flag disabled, emit this:

type uint64_t = c_uchar;
type uint32_t = c_uint;

extern "C" {
    pub fn foo() -> uint64_t;
    pub fn bar() -> uint32_t;
}

with the flag enabled, emit this:

type uint64_t = u8;

extern "C" {
    pub fn foo() -> uint64_t;
    pub fn bar() -> u32;
}

instead of the current behaviour which is this:

extern "C" {
    pub fn foo() -> u64;
    pub fn bar() -> u32;
}

Maybe I'm being to naive by assuming that the size reported by clang is correct in this context 😅 but we could add extra layout tests for that just in case.

pvdrz avatar Sep 28 '22 18:09 pvdrz