
`size_t` is translated to `libc::c_ulong` instead of `libc::size_t`

Open Dante-Broggi opened this issue 2 years ago • 3 comments

Currently, c2rust resolves the size_t typedef from <stdlib.h> to unsigned long and thus bakes the platform-specific size into the translated type. It could instead translate size_t to libc::size_t or usize, which would more directly reflect the source while preserving semantics.

As mentioned in #20, many C integer types cannot be soundly mapped to usize/isize on every platform, but this does not apply to size_t:

> The big issue with changing libc types to usize/isize by default is that there are corner cases where the change is not sound. For example, C long generally matches isize on Linux but not on Windows, where long is always i32. There are a few C types that we can always convert, like size_t and intptr_t.
>
> Originally posted by @ahomescu in https://github.com/immunant/c2rust/issues/20#issuecomment-551343704

Presumably any integer typedef named size_t is intended to be libc::size_t, but an option could be provided.
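
The corner case in the quote can be checked directly on any target. The following is a minimal sketch, not c2rust output. It assumes the `core::ffi` aliases as stand-ins for the `libc` ones so that it builds without dependencies, and the two helper function names are made up for illustration.

```rust
use core::ffi::{c_long, c_ulong};
use core::mem::size_of;

/// True when C `unsigned long` is as wide as `usize` on the current target.
/// This holds on LP64 targets (for example 64-bit Linux) but not on 64-bit
/// Windows, where `unsigned long` stays 32 bits wide.
fn c_ulong_matches_usize() -> bool {
    size_of::<c_ulong>() == size_of::<usize>()
}

/// The same check for the signed pair `long` / `isize`.
fn c_long_matches_isize() -> bool {
    size_of::<c_long>() == size_of::<isize>()
}

fn main() {
    println!(
        "c_ulong: {} bytes, usize: {} bytes",
        size_of::<c_ulong>(),
        size_of::<usize>()
    );
    println!("c_ulong matches usize: {}", c_ulong_matches_usize());
    println!("c_long matches isize: {}", c_long_matches_isize());
}
```

A translation that hard-codes `c_ulong` for `size_t` is only correct on targets where the first check returns true, whereas `usize` (which `libc::size_t` aliases) follows the target's pointer width.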

For example:

#include <stdlib.h>
size_t x = 0;
intptr_t y = 0;

Translates to:

#![allow(dead_code, mutable_transmutes, non_camel_case_types, non_snake_case, non_upper_case_globals, unused_assignments, unused_mut)]
#![register_tool(c2rust)]
#![feature(register_tool)]
pub type size_t = libc::c_ulong;
pub type __intptr_t = libc::c_long;
#[no_mangle]
pub static mut x: size_t = 0 as libc::c_int as size_t;
#[no_mangle]
pub static mut y: __intptr_t = 0 as libc::c_int as __intptr_t;

Instead of:

#![allow(dead_code, mutable_transmutes, non_camel_case_types, non_snake_case, non_upper_case_globals, unused_assignments, unused_mut)]
#![register_tool(c2rust)]
#![feature(register_tool)]
pub type size_t = libc::size_t; // or use libc::size_t;
pub type __intptr_t = libc::intptr_t;
#[no_mangle]
pub static mut x: size_t = 0 as libc::c_int as size_t;
#[no_mangle]
pub static mut y: __intptr_t = 0 as libc::c_int as __intptr_t;

Dante-Broggi • Mar 19 '23 20:03

I've seen this apply to inline casts as well, where you end up with:

::core::mem::size_of::<libc::c_uint>() as libc::c_ulong
// or
::core::mem::size_of::<libc::c_uint>() as libc::c_ulong as libc::c_ulonglong

Instead of:

::core::mem::size_of::<libc::c_uint>() as libc::size_t
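
For context, `core::mem::size_of` already returns `usize`, and the `libc` crate defines `size_t` as an alias for `usize`, so the desired form needs no widening cast at all. A small sketch, assuming a local `size_t` alias in place of `libc::size_t` so it compiles without the `libc` crate; the function name `sizeof_c_uint` is hypothetical:

```rust
use core::ffi::c_uint;
use core::mem::size_of;

/// Stand-in for `libc::size_t`, which the `libc` crate defines as `usize`.
#[allow(non_camel_case_types)]
type size_t = usize;

/// Size of a C `unsigned int`, typed as `size_t` with no intermediate cast.
fn sizeof_c_uint() -> size_t {
    size_of::<c_uint>()
}

fn main() {
    let n: size_t = sizeof_c_uint();
    println!("sizeof(unsigned int) = {n}");
}
```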

This seems to be an issue for all integer typedefs, though: uint64_t and similar types are likewise replaced with libc::c_ulong and so on.

pitaj • May 07 '23 00:05

I'm trying to figure this issue out. What puzzles me is why the transpiler is "evaluating" the typedef instead of using its name as-is. For standard C library functions that use size_t, such as malloc, strlen, and memset, the transpiler emits declarations that take c_ulong even though the C declarations themselves use size_t. Shouldn't the transpiler preserve the types as declared in the headers?
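
To illustrate what a declaration that keeps the header's type could look like, here is a hand-written sketch (not c2rust output) that declares `strlen` with a `size_t`-like return type. It assumes `usize` as a stand-in for `libc::size_t` to avoid depending on the `libc` crate, and the wrapper `c_strlen` is a hypothetical helper.

```rust
use core::ffi::c_char;
use std::ffi::CStr;

// C declaration: size_t strlen(const char *s);
// `usize` stands in for `libc::size_t` here.
// On toolchains older than Rust 1.82, drop the `unsafe` before `extern`.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: a `CStr` is guaranteed to be NUL-terminated.
fn c_strlen(s: &CStr) -> usize {
    // SAFETY: `s.as_ptr()` points to a valid NUL-terminated string.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let s = CStr::from_bytes_with_nul(b"hello\0").expect("valid C string");
    println!("strlen = {}", c_strlen(s));
}
```

With the return type written as `usize`, the declaration stays correct on targets where `unsigned long` is narrower than a pointer, which a `c_ulong` declaration would not.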

Rua • May 02 '24 20:05

> What puzzles me is why the transpiler is "evaluating" the typedef instead of using its name as-is.

I haven't looked at this in a while, but it's probably coming from clang. size_t is special: clang defines it in terms of the predefined __SIZE_TYPE__ macro. The clang headers contain typedef __SIZE_TYPE__ size_t; but it is possible that the transpiler isn't picking that typedef up correctly.

ahomescu • May 02 '24 21:05