rust-bindgen

enum: add support for hash-based names for unnamed enums

Open qinghon opened this issue 4 months ago • 9 comments

Occasionally, conflicts between header files make it necessary to run bindgen separately on each header and then merge the resulting bindings. The bindgen_ty_{number} identifiers currently generated for unnamed enums then cause naming conflicts.

This PR introduces support for hash-based naming for unnamed enums.

qinghon avatar Aug 03 '25 04:08 qinghon

Example input:

enum {
        A = 1,
        B = 2
};
enum {
        C = -3,
        D = 4,
};

Generated bindings:

/* automatically generated by rust-bindgen 0.72.0 */

pub const A: bindgen_enum_98a4e4a38f13856d = 1;
pub const B: bindgen_enum_98a4e4a38f13856d = 2;
pub type bindgen_enum_98a4e4a38f13856d = ::std::os::raw::c_uint;
pub const C: bindgen_enum_bb1a5d36cda6e84c = -3;
pub const D: bindgen_enum_bb1a5d36cda6e84c = 4;
pub type bindgen_enum_bb1a5d36cda6e84c = ::std::os::raw::c_int;
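
To make the idea concrete, here is a minimal sketch of how a content-derived name could be computed. It is not the PR's actual algorithm: the `Variant` struct, the `hashed_enum_name` helper, and the choice of FNV-1a over the enumerator names and values are all assumptions made for illustration. The point is only that the name depends on the enum's contents rather than on the order in which bindgen encounters it.

```rust
/// One enumerator: its name and its value (hypothetical representation).
struct Variant<'a> {
    name: &'a str,
    value: i64,
}

/// 64-bit FNV-1a, written out by hand so the result does not depend on
/// the standard library's default hasher, whose algorithm is unspecified.
fn fnv1a64(bytes: &[u8], mut hash: u64) -> u64 {
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical helper: derive a name from the enum's contents only.
fn hashed_enum_name(variants: &[Variant]) -> String {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for v in variants {
        hash = fnv1a64(v.name.as_bytes(), hash);
        hash = fnv1a64(&[0], hash); // separator between name and value
        hash = fnv1a64(&v.value.to_le_bytes(), hash);
    }
    format!("bindgen_enum_{hash:016x}")
}

fn main() {
    let first = [Variant { name: "A", value: 1 }, Variant { name: "B", value: 2 }];
    let second = [Variant { name: "C", value: -3 }, Variant { name: "D", value: 4 }];
    // The same enum body always yields the same name, whichever header
    // bindgen starts from; different bodies yield different names.
    println!("{}", hashed_enum_name(&first));
    println!("{}", hashed_enum_name(&second));
}
```

The hashes printed by this sketch will not match the ones in the output above, since the real implementation may hash different inputs with a different function.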

qinghon avatar Aug 03 '25 04:08 qinghon

> Occasionally, conflicts between header files necessitate running Bindgen separately on each and then merging the resulting bindings.

Could you please expand on the use case?

ojeda avatar Aug 03 '25 06:08 ojeda

@ojeda Thanks for the reminder.

qinghon avatar Aug 03 '25 16:08 qinghon

I wasn't looking at the changes (but, of course, I imagine adding tests and documentation is important for bindgen) -- what I meant by my question above is that it isn't very clear what the use case is.

In other words, why do you need this? e.g. is this about including different generated bindgen files into a single Rust module or similar? Or something else? i.e. what is the problem being solved?

Thanks!

ojeda avatar Aug 03 '25 16:08 ojeda

@ojeda I'm working on a library (dlibc) to generate bindings for all system libc header files into a single crate.

The major problems I've encountered are:

  1. If all are included in a single wrapper.h file, symbol conflicts between header files occur, leading Clang to complain and making it impossible to proceed.
  2. If bindgen is run separately for each header, the bindgen_ty_{number} identifiers generated by bindgen for unnamed enums cause conflicts during merging.

This problem also extends to common use cases: similar issues arise when running bindgen on two header files that have slight overlaps (if their generated bindings need to be merged).
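
The merge conflict in point 2 can be sketched as follows. The two binding snippets are hand-written for illustration and follow the bindgen_ty_{number} pattern quoted above; they are not real bindgen output. The helpers `type_alias_names` and `colliding_names` are hypothetical and only scan for `pub type` lines.

```rust
use std::collections::BTreeSet;

/// Collect the names introduced by `pub type NAME = ...;` lines.
fn type_alias_names(bindings: &str) -> BTreeSet<&str> {
    bindings
        .lines()
        .filter_map(|line| line.trim().strip_prefix("pub type "))
        .filter_map(|rest| rest.split_whitespace().next())
        .collect()
}

/// Names defined by both inputs, which would clash if the two files
/// were merged into a single Rust module.
fn colliding_names<'a>(a: &'a str, b: &'a str) -> Vec<&'a str> {
    let (a, b) = (type_alias_names(a), type_alias_names(b));
    a.intersection(&b).copied().collect()
}

fn main() {
    // Two separate runs each number their first unnamed enum "1",
    // even though the enums themselves are unrelated.
    let first = "pub const A: bindgen_ty_1 = 1;\n\
                 pub type bindgen_ty_1 = ::std::os::raw::c_uint;";
    let second = "pub const C: bindgen_ty_1 = -3;\n\
                  pub type bindgen_ty_1 = ::std::os::raw::c_int;";
    println!("{:?}", colliding_names(first, second));
}
```

With sequential numbering, the same identifier ends up bound to two different underlying types, so the merged module cannot compile. A content-derived name avoids the clash because unrelated enums get unrelated names.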

Edit: anonymous unions and structs would also need hash-based names, but I am not very familiar with bindgen and have not run into that case yet, so this PR only covers enums.

qinghon avatar Aug 04 '25 03:08 qinghon

> If all are included in a single wrapper.h file, symbol conflicts between header files occur, leading Clang to complain and making it impossible to proceed.

Do you mean those C headers cannot be included all together? That sounds a bit strange (it can happen, of course, but normally one would design public headers to avoid that) -- do you have an example? Thanks!

ojeda avatar Aug 04 '25 16:08 ojeda

Just a simple case:

#include <time.h>
#include <linux/time.h>

qinghon avatar Aug 04 '25 16:08 qinghon

Thanks -- so it is indeed about the case where the C compiler would complain, i.e. this is before bindgen, so I would say that is not a use case for the feature here, which leaves us with number 2.

For number 2, I assume you are including the output of bindgen in a single Rust module? If so, is there a particular reason why you cannot put it in different ones, or it is simply that it is convenient to have it in a single namespace? An example for that use case would also be great.

ojeda avatar Aug 04 '25 17:08 ojeda

@ojeda

When bindgen is run separately on each header and the resulting bindings are used directly, the following issues may arise:

1. A large number of duplicate symbols (this isn’t too serious, just a matter of time).
2. The same struct defined in different files actually becomes a different type.

The second issue is particularly severe, as it leads to a large amount of conversion code, making it essentially unusable for humans.

To solve this problem, the shared parts need to be extracted into a separate module and imported. This requires that item names remain stable when running bindgen on different header file entry points.

PS: In fact, the best approach is to keep the structure consistent with the C header files. Therefore, keeping enum names stable in this PR is only part of the series; we also need to split the generated bindings into paths corresponding to the C files when running bindgen. I’ll submit a separate PR for that.

// bindgen /usr/include/stdio.h > stdio.rs
mod stdio {
    include!("stdio.rs");
}
// bindgen /usr/include/wchar.h > wchar.rs
mod wchar {
    include!("wchar.rs");
}

fn main() {
    let fname = std::ffi::CString::new("tmp.txt").unwrap();
    let mode = std::ffi::CString::new("w").unwrap();
    // `f` has the FILE type generated from stdio.h.
    let f: *mut stdio::FILE = unsafe {
        stdio::fopen(fname.as_ptr(), mode.as_ptr())
    };

    // Wide string; fputws expects it to be NUL-terminated.
    let ws: [wchar::wchar_t; 6] = [0x4E2D, 0x6587, 0x0020, 0x0066, 0x0069, 0];

    // Intentionally does not compile: each module has its own `_IO_FILE`.
    let n = unsafe {
        wchar::fputws(ws.as_ptr(), f) // ❌
        //                         ^ expected `wchar::_IO_FILE`, found `stdio::_IO_FILE`
        // arguments to this function are incorrect
    };
}
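
As a sketch of the extraction described above, the shared type can live in one module that both header modules re-export, so that `stdio::FILE` and `wchar::FILE` are the same Rust type. The module layout, the opaque `_IO_FILE` definition, and the `is_null_stream` function are assumptions for illustration only; they stand in for generated items and are not what bindgen emits today.

```rust
#![allow(non_camel_case_types, dead_code)]

// Hypothetical shared module holding items common to both headers.
mod shared {
    /// Opaque stand-in for the C `struct _IO_FILE`.
    #[repr(C)]
    pub struct _IO_FILE {
        _private: [u8; 0],
    }
    pub type FILE = _IO_FILE;
}

mod stdio {
    // Re-export instead of redefining, so the type identity is shared.
    pub use super::shared::FILE;
}

mod wchar {
    pub use super::shared::FILE;

    /// Stand-in for a generated function that takes a `FILE*`.
    pub fn is_null_stream(stream: *mut FILE) -> bool {
        stream.is_null()
    }
}

fn main() {
    // A pointer typed via `stdio` is accepted where `wchar` expects one.
    let f: *mut stdio::FILE = std::ptr::null_mut();
    println!("{}", wchar::is_null_stream(f));
}
```

This only works if the shared items get the same name no matter which header bindgen starts from, which is why stable names for unnamed enums matter for this workflow.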

qinghon avatar Aug 05 '25 03:08 qinghon