miri icon indicating copy to clipboard operation
miri copied to clipboard

Miri preserves padding and partial initialization on copies

Open RalfJung opened this issue 5 years ago • 4 comments

Miri should be able to detect that the following is UB because it prints uninitialized memory:

use std::mem;

#[repr(C)]
struct Pair(u8, u16);

fn main() { unsafe {
    let p: Pair = mem::transmute(0u32); // The copy when `Pair` is returned from `transmute` should destroy padding.
    let c = &p as *const _ as *const u8;
    println!("{}", *c.add(1)); // Print the padding byte.
} }

However, currently assignment is just implemented as an untyped memcpy, so we incorrectly preserve padding.

RalfJung avatar Jul 17 '19 06:07 RalfJung

A related issue is that when copying, e.g., [usize; 3], Miri will not perform a "typed copy": a typed copy would make the entire array element "undefined" if any of its bytes is undefined, whereas Miri will just copy the underlying bytes. Also see this stackoverflow question.

RalfJung avatar Apr 15 '20 10:04 RalfJung

No GH I did not want you to close this issue...

RalfJung avatar Jun 06 '22 16:06 RalfJung

Another example where Miri may be missing UB. I was attempting to use Miri to remind myself of how to safely work with libc style socket unions, and mistakenly (?) convinced myself that that this is sound:

#[repr(C)]
#[derive(Debug, Copy, Clone)]
struct Foo {
    val16: u16,
    // Padding bytes go here!
    val32: u32,
}

#[repr(C)]
#[derive(Debug, Copy, Clone)]
struct Bar {
    bytes: [u8; 8],
}

union FooBar {
    foo: Foo,
    bar: Bar,
}

pub fn main() {
    // Initialize as u8 to ensure padding bytes are zeroed.
    let mut foobar = FooBar { bar: Bar { bytes: [0u8; 8] } };
    // Reading either field is ok.
    println!("{:?} {:?}", unsafe { foobar.foo }, unsafe { foobar.bar });
    // Does this assignment copy the uninitialized padding bytes
    // over the initialized padding bytes? miri doesn't seem to think so.
    foobar.foo = Foo { val16: 1, val32: 2 };
    // Still OK to read.
    println!("{:?} {:?}", unsafe { foobar.foo }, unsafe { foobar.bar });
}

IIUC the line foobar.foo = Foo { val16: 1, val32: 2 }; is allowed to overwrite the padding bytes in foobar.foo with undefined values, but no UB is detected when those bytes are read on the following line.

Unsurprisingly (?) miri does detect UB if we assign via a bytewise copy:

    let mut foobar = FooBar { bar: Bar { bytes: [0u8; 8] } };
    // OTOH if we do the Rust equivalent of a memcpy instead of assignment,
    // the undefined padding bytes from the source definitely copied to the destination.
    unsafe { std::ptr::copy_nonoverlapping(&Foo { val16: 1, val32: 2 }, &mut foobar.foo, 1) };
    // UB reading the uninitd padding bytes
    println!("{:?} {:?}", unsafe { foobar.foo }, unsafe { foobar.bar });

sporksmith avatar Nov 17 '22 22:11 sporksmith

IIUC the line foobar.foo = Foo { val16: 1, val32: 2 }; is allowed to overwrite the padding bytes in foobar.foo with undefined values, but no UB is detected when those bytes are read on the following line.

Yes that sounds correct. This will be hard to do even after the basics of this got implemented, since AFAIK MIR does not even preserve this as a regular assignment... so even if regular assignments reset padding, this would still fail to report UB. @oli-obk would know better what the status is of deaggregation and having a Finalize statement that we could use for erasing padding (which seems better than the alternative of running Miri on non-deaggregated MIR).

RalfJung avatar Nov 18 '22 14:11 RalfJung