c2rust icon indicating copy to clipboard operation
c2rust copied to clipboard

`c2rust` introduces UB in `&a[i]` by adding a `*` deref and `&` ref when there should only be an `.offset`

Open jrmuizel opened this issue 5 years ago • 2 comments

Another issue I found when running miri on the generated code.

#include <stdlib.h>
struct Foo {
    int *data_ptr;
    int data[];
};

void f() {
    struct Foo *foo = malloc(sizeof(struct Foo));
    foo->data_ptr = &foo->data[0];
}

int main() {
    f();
}

translates to

extern "C" {
    #[no_mangle]
    fn malloc(_: libc::c_ulong) -> *mut libc::c_void;
}
#[derive(Copy, Clone)]
#[repr(C)]
pub struct Foo {
    pub data_ptr: *mut libc::c_int,
    pub data: [libc::c_int; 0],
}
#[no_mangle]
pub unsafe extern "C" fn f() {
    let mut foo: *mut Foo =
        malloc(::std::mem::size_of::<Foo>() as libc::c_ulong) as *mut Foo;
    (*foo).data_ptr =
        &mut *(*foo).data.as_mut_ptr().offset(0 as libc::c_int as isize) as
            *mut libc::c_int;
}
unsafe fn main_0() -> libc::c_int { f(); return 0; }
#[main]
pub fn main() { unsafe { ::std::process::exit(main_0() as i32) } }
error: Undefined Behavior: memory access failed: pointer must be in-bounds at offset 12, but is outside bounds of alloc2017 which has size 8
  --> src/main.rs:20:9
   |
20 |         &mut *(*foo).data.as_mut_ptr().offset(0 as libc::c_int as isize) as
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ memory access failed: pointer must be in-bounds at offset 12, but is outside bounds of alloc2017 which has size 8
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
           

The &mut *() dereferences the data member which allowed in C but not in Rust. Removing/folding the &mut * fixes the problem.

jrmuizel avatar Sep 18 '20 17:09 jrmuizel

The variable-length data field is also mistranslated. It should be [libc::c_int] in Rust, not a zero-sized array, [libc::c_int; 0]. Moreover, we can't derive Copy and Clone for dynamically sized types in Rust, and we can't use size_of to get the size of the struct since it's dynamically sized. In C, sizeof assumes the dynamically sized fields are zero-sized, so we should use that behavior.

kkysen avatar Jun 02 '22 17:06 kkysen

This bug is due to c2rust transpile mistranslating &a[i]. This should be purely an a.offset(i) without any * derefs (which cause UB on the out-of-bounds deref) or &/&mut refs. That is, c2rust transpile should know the [] and & cancel out, as that's how they're treated in C.

kkysen avatar Jun 29 '22 03:06 kkysen

This is a little more complicated than that. Two major C compilers (Clang & MSVC) consider this undefined behaviour as &a[i] is equivalent to &*(a + i) and therefore dereferences before it gets the address. Only GCC considers it equivalent to (a + i). The problem is neither compiler catches this, especially by default. The only tool I've seen that catches it for malloced memory is the MVSC static analyzer: https://godbolt.org/z/fEo6877Es .

Having said that, Clang will give a warning for the same problem if the array in question has known bounds. It'll even trip the undefined behavior sanitizer: https://godbolt.org/z/bv9dGK4vx .

C++ standards' wording on this is more clear, and would make this definitely undefined behavior. The real solution is to use (a + i) to get an address to one past the end of an array in C. It's more portable.

Nobody1707 avatar Oct 20 '22 17:10 Nobody1707