rust-bindgen icon indicating copy to clipboard operation
rust-bindgen copied to clipboard

Feature request: use DST to represent structures with flexible array members

Open jsgf opened this issue 1 year ago • 7 comments

Input C/C++ Header

struct msg {
    unsigned int tag;
    unsigned int len;
    char payload[];
};

Bindgen Invocation

$ bindgen input.h

Actual Results

This produces an output using __IncompleteArrayField as expected:

#[repr(C)]
#[derive(Debug)]
pub struct msg {
    pub tag: ::std::os::raw::c_uint,
    pub len: ::std::os::raw::c_uint,
    pub payload: __IncompleteArrayField<::std::os::raw::c_char>,
}

Expected Results

It would be nice to have an option to model this by using a structure with a dynamic tail:

#[repr(C)]
#[derive(Debug)]
pub struct msg {
    pub tag: ::std::os::raw::c_uint,
    pub len: ::std::os::raw::c_uint,
    pub payload: [:std::os::raw::c_char],
}

These types of structures are awkward to use on their own, so it would need some helper methods:

impl msg {
    // Size and alignment so memory can be allocated
    fn layout(len: usize) -> ::std::alloc::Layout {
        unsafe {
            let p: *const Self = ::std::ptr::from_raw_parts(ptr::null(), len);
            ::std::alloc::Layout::for_value_raw(p)
        }
    }
    // Convert a raw pointer into an immutable view for a given number of payload elements.
    // Danger: Return lifetime unconstrained
    unsafe fn from_ptr<'a>(ptr: *const (), len: usize) -> &'a Self {
        let ptr: *const Self = ::std::ptr::from_raw_parts(ptr, len);
        &*ptr
    }
    // Same but mutable, and maybe uninitialized, to to allow Rust to initialize/mutate it.
    // Danger: Return lifetime unconstrained
    unsafe fn from_ptr_mut<'a>(
        ptr: *mut (),
        len: usize,
    ) -> ::std::mem::MaybeUninit<&'a mut Self> {
        let ptr: *mut Self = ::std::ptr::from_raw_parts_mut(ptr, len);
        ::std::mem::MaybeUninit::new(&mut *ptr)
    }
}

These methods would currently depend on the unstable ptr_metadata and layout_for_ptr features.

jsgf avatar Feb 26 '24 00:02 jsgf

Prototype implementation in #2772

jsgf avatar Feb 26 '24 00:02 jsgf

Hmm, this seems fine, but maybe we should just make the implementation of __IncompleteArrayField use them or so? Would that not be feasible?

emilio avatar Feb 26 '24 12:02 emilio

That's a possibility but there's a few downsides:

  • It currently relies on nightly features, and I don't know what their stabilization path is
  • It's a pretty large back-compat break

Specifically for the second, since these become DST types, it means it's no longer possible to directly reference them by value, so you wouldn't be able to do something like:

let hdr = msg { ... set up fields ...};
// (something to set up payload)
*ptr = hdr;

That's not the end of the world of course - that's why from_ptr_mut exists.

The other problem is that all pointers and references to these objects will be fat, and so not C-ABI compatible. So something like:

struct controlblock {
    struct msg *messagequeue;
};

could not be directly converted to:

#[repr(C)]
struct controlblock {
    messagequeue: *mut msg,
}

Currently this proposed API is using *mut/const () as the thin placeholder, but we could define associated types for typed thin variants, eg:

#[repr(C)]
struct controlblock {
    messagequeue: *mut msg::Thin,
}

Likewise you wouldn't be able to directly pass or return the fat pointers to/from C-ABI functions.

I don't think any of these are show-stoppers; structures with flex array have always been a bit awkward to deal with, and I think this API is an accurate reflection of that (ie, it would make it harder to get things wrong). If we could wave a magic wand to stabilize the features and update all existing code to the new API, then I think we could replace __IncompleteArrayField entirely.

jsgf avatar Feb 26 '24 17:02 jsgf

Yeah to be clear I meant to make __IncompleteArrayField optionally a DST, with the same-ish API to consumers, and we'd keep the current version as default at least for a bit. But yeah I guess you'd need from_ptr and such for the new version anyways so maybe it's not such a great idea.

emilio avatar Feb 26 '24 17:02 emilio

BTW I also experimented with this idea using a trait (https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=114e0ca2cbc10523037946a7d9c61a06) , since it's a fairly self-contained abstraction that you might want to generalize over.

But as far as I know bindgen doesn't have a support library crate where this could live, and it doesn't seem like it makes sense to emit a separate variant of this trait for each bindgenned module - that would be no better than the current proposal just using intrinsic methods.

jsgf avatar Feb 26 '24 17:02 jsgf

I've published PR #2772 with a fairly complete implementation of this idea.

jsgf avatar Mar 07 '24 00:03 jsgf

I think #2772 is ready for review.

jsgf avatar Mar 14 '24 02:03 jsgf