uefi-raw: Add explicit padding field for MemoryDescriptor
~~IMHO, having implicit padding bytes in Low-Level types (used for Casting etc) is a bad idea. When you cast a &[MemoryDescriptor] to &[u8], Miri complains about uninitialized bytes.~~
~~Exporting one (or multiple) public _pad0 fields however is also not nice. So, we should identify all structs where this is relevant and come up with a solution. Perhaps, lets move to constructor-based types with private fields? This is breaking, how-ever.~~
By convention, structs having a C representation should be explicit with their in-between padding fields. This allows users memory-safe trivial serialization and deserialization by accessing the underlying memory as byte slice. Without this fields being explicit, Miri complains about uninitialized memory accesses, which is UB.
Checklist
- [ ] Sensible git history (for example, squash "typo" or "fix" commits). See the Rewriting History guide for help.
- [ ] Update the changelog (if necessary)
When you cast a &[MemoryDescriptor] to &[u8], Miri complains about uninitialized bytes.
I'd like to understand this better -- why is this cast being performed? In general we can't make the problems of uninit bytes go away by adding explicit padding fields, because our types often go over FFI boundaries and we don't control the C definition.
If you do need a "raw bytes" view of a type that may contain uninit bytes, use &[MaybeUnint].
I've updated the PR description. Please re-check and let me know what you think.
I don't get this check. What is its purpose? Why do we have it? @nicholasbishop
I don't get this check. What is its purpose?
The check-raw code assumes nothing is allowed by default, then adds various exceptions for code we do want to allow. Rust is a big language, and we should be very intentional about any new types of code being added to uefi-raw since it's intended to hew closely to the C API. For this particular use case std isn't needed, anything that reads each byte of raw_bytes will work, e.g. let _ = raw_bytes.iter().fold(0, |acc, v| acc + v).
By convention, structs having a C representation should be explicit with their in-between padding fields.
This is debatable I think. For example, bindgen has an option to enable explicit padding but its not enabled by default.
MemoryDescriptor actually strikes me as a place where explicit padding is less useful, because the actual size of MemoryDescriptor can in general only be known at runtime by checking the desc_size returned from get_memory_map. In effect, there may be implicit padding between each element.
In general, adding padding bytes can indeed make serialization and deserialization easier, but there are a couple cons to consider:
- The type's fields can become cluttered. Perhaps not such a big deal in this one case, but if we apply it everywhere for UEFI-ABI-compat types, some structs are going to have many padding fields.
- If the type is constructed externally (e.g. by UEFI firmware), the padding bytes may not be initialized anyway, and so we're just kind of tricking Miri by making the padding bytes explicit in tests, but not helping with real-world usage.
I think it would really help to have an example of the code where you want this so we can discuss whether alternatives might work acceptably well. For example, perhaps there's something we can do with MaybeUninit<u8> that would work.
The type's fields can become cluttered. Perhaps not such a big deal in this one case, but if we apply it everywhere for UEFI-ABI-compat types, some structs are going to have many padding fields.
agree
I think it would really help to have an example of the code where you want this
Something like that
trait ToRawBytes {
fn raw_bytes(&self) -> &[u8] {
let ptr = core::ptr::addr_of(*self);
let len = core::mem::size_of::<Self>();
unsafe { core::slice::from_raw_parts(ptr.cast(), len) };
}
}
How do we want to continue here? I don't have a strong opinion
Thanks, the example is helpful. I think the answer is that this just isn't possible to do soundly if the struct has padding bytes. Explicitly defining fields to replace those padding bytes only helps if the type is initialized by Rust, but for most UEFI types we have the possibility that it's created by the firmware. We cannot assume that padding bytes are initialized or preserved across the FFI boundary.
The safety analysis section of the zerocopy crate's IntoBytes trait is a helpful guide on what can be done soundly. I've occasionally wondered if perhaps we should make use of zerocopy, since they have done extensive work on carefully proving the safety of their API, and their derive macros can automatically check the types and derive only the conversion traits that are sound. I haven't actually prototyped adding zerocopy, but I think it could be a good direction to investigate for these low-level conversions to/from raw bytes.