c2rust
Transpiling a copy of an unspecified value introduces UB
To the best of my knowledge, the following code, if it compiles, does not have UB:
#include <stdint.h>
#include <stdlib.h>
int main() {
    int32_t *p = malloc(sizeof(int32_t));
    *p;
}
Reasoning (I'm using https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf as the reference standard):
- [7.22.3.4] malloc allocates an object whose value is indeterminate
- [6.5] When using an object without a declared type for anything other than memcpy/memmove/typed copy, the effective type is the lvalue type
- [3.19.2] An indeterminate value is either an unspecified value or a trap representation
- [6.2.6.1] A trap representation is an object representation that doesn't represent a value of the object type
- [6.2.6.2] An integer object representation is split into value bits and padding bits, where only the latter can affect the representation being trapping
- [7.20.1.1] int32_t designates a signed integer type with width 32 and no padding bits
- As a consequence, int32_t has no trap representations, and *p is an unspecified value
- [3.4.4] Unspecified behavior (not UB!) includes, among other things, the use of an unspecified value
- [J.2] specifically mentions that reading a trap representation from a non-char lvalue is UB, not an indeterminate value in general
c2rust transpiles this to:
#![allow(dead_code, mutable_transmutes, non_camel_case_types, non_snake_case, non_upper_case_globals, unused_assignments, unused_mut)]
extern "C" {
    fn malloc(_: libc::c_ulong) -> *mut libc::c_void;
}
pub type __int32_t = libc::c_int;
pub type int32_t = __int32_t;
unsafe fn main_0() -> libc::c_int {
    let mut p: *mut int32_t = malloc(::core::mem::size_of::<int32_t>() as libc::c_ulong)
        as *mut int32_t;
    *p;
    return 0;
}
pub fn main() {
    unsafe { ::std::process::exit(main_0() as i32) }
}
which does have UB, because in Rust, reading an uninitialized value is undefined behavior (Miri, but it's also kinda common sense).
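To make the Rust side of this concrete, here is a minimal sketch (not c2rust output) of where the line sits. It uses `std::alloc` rather than `malloc` so it needs no external crate, and `read_fresh_allocation` is a hypothetical name chosen for illustration. Copying the fresh allocation out at type `MaybeUninit<i32>` is allowed; producing an `i32` value from the same bytes is the UB described above.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::MaybeUninit;

/// Hypothetical helper: allocates room for one `i32`, copies the
/// (uninitialized) contents out as `MaybeUninit<i32>`, and frees the block.
fn read_fresh_allocation() -> MaybeUninit<i32> {
    let layout = Layout::new::<i32>();
    unsafe {
        let p = alloc(layout) as *mut MaybeUninit<i32>;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        // Allowed: a copy at type `MaybeUninit<i32>` does not require the
        // bytes to be initialized.
        let v = *p;
        // Not allowed: using `*(p as *mut i32)` as a value would produce an
        // `i32` from uninitialized memory, which is undefined behavior.
        dealloc(p as *mut u8, layout);
        v
    }
}
```

The returned value still may not be passed to `assume_init()`, since nothing ever wrote to the allocation.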
A narrower version of this problem is this memcpy implementation, which is perfectly legal in C:
#include <stddef.h>
void my_memcpy(unsigned char* dst, unsigned char* src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}
but not when transpiled with c2rust:
#![allow(dead_code, mutable_transmutes, non_camel_case_types, non_snake_case, non_upper_case_globals, unused_assignments, unused_mut)]
pub type size_t = libc::c_ulong;
#[no_mangle]
pub unsafe extern "C" fn my_memcpy(
    mut dst: *mut libc::c_uchar,
    mut src: *mut libc::c_uchar,
    mut n: size_t,
) {
    let mut i: size_t = 0 as libc::c_int as size_t;
    while i < n {
        *dst.offset(i as isize) = *src.offset(i as isize);
        i = i.wrapping_add(1);
        i;
    }
}
I've demonstrated the problem with int32_t first to show that this is a wide problem, not specific to character types.
I'm not sure how to solve this. Wrapping everything in MaybeUninit would work, but that's quite unwieldy.
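For reference, a rough sketch of what the MaybeUninit approach could look like for the memcpy case. This is hand-written, not something c2rust emits today. The name `my_memcpy_uninit` and the use of `usize` instead of `size_t` are assumptions made to keep the example free of the `libc` crate.

```rust
use std::mem::MaybeUninit;

/// Hypothetical MaybeUninit-based variant of the transpiled `my_memcpy`.
///
/// Safety: `src` must be readable and `dst` writable for `n` bytes.
pub unsafe fn my_memcpy_uninit(
    dst: *mut MaybeUninit<u8>,
    src: *const MaybeUninit<u8>,
    n: usize,
) {
    let mut i = 0;
    while i < n {
        // `MaybeUninit<u8>` is `Copy`, and copying it never asserts that the
        // byte is initialized, so this stays defined for uninitialized input.
        unsafe { *dst.add(i) = *src.add(i) };
        i += 1;
    }
}
```

The unwieldy part is visible even here: every pointer type in the signature changes, and any caller that later wants a plain `u8` has to go through `assume_init()` at a point where initialization is actually known.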