Incorrect translation of `alloca` in block scope
Tested with c2rust 0.16.0
#include <stdio.h>
#include <string.h>
int one = 1;
int main(void) {
char *p;
if (one) {
p = __builtin_alloca(100);
}
strcpy(p, "Hello, world!");
puts(p);
}
This is a program that has no memory errors. The if (one) is guaranteed to be entered, though the compiler cannot see that, and the result of __builtin_alloca persists until the function ends, even after the block has ended.
It is translated to the following Rust code:
#![allow(dead_code, mutable_transmutes, non_camel_case_types, non_snake_case, non_upper_case_globals, unused_assignments, unused_mut)]
#![register_tool(c2rust)]
#![feature(register_tool)]
use ::c2rust_out::*;
extern "C" {
fn puts(__s: *const libc::c_char) -> libc::c_int;
fn strcpy(_: *mut libc::c_char, _: *const libc::c_char) -> *mut libc::c_char;
}
#[no_mangle]
pub static mut one: libc::c_int = 1 as libc::c_int;
unsafe fn main_0() -> libc::c_int {
let mut p: *mut libc::c_char = 0 as *mut libc::c_char;
if one != 0 {
let mut fresh0 = ::std::vec::from_elem(
0,
100 as libc::c_int as libc::c_uint as usize,
);
p = fresh0.as_mut_ptr() as *mut libc::c_char;
}
strcpy(p, b"Hello, world!\0" as *const u8 as *const libc::c_char);
puts(p);
return 0;
}
pub fn main() {
unsafe { ::std::process::exit(main_0() as i32) }
}
Note here the let mut fresh0 = ::std::vec::from_elem(0, 100 as libc::c_int as libc::c_uint as usize); in block scope. Here, unlike in the C program, the vec will be dropped when the block is exited, and p is left a dangling pointer.
A possible corrected translation would be to make fresh0 a function-scope variable.
Also, Vec is heap-allocated and alloca is stack-allocated. Shouldn't we try to preserve this?
Also,
Vecis heap-allocated andallocais stack-allocated. Shouldn't we try to preserve this?
Ideally, yes, but that may not be possible as Rust does not have alloca. There was an RFC to add it, https://github.com/rust-lang/rfcs/issues/618, but it was closed in favour of https://doc.rust-lang.org/beta/unstable-book/language-features/unsized-locals.html. The unsized-locals feature, I believe, will be a suitable replacement for many uses of alloca, but not all of them, and not the one in the example I gave here. However, as the unsized-locals feature is not yet finalised, it is possible that it will be extended and become a full replacement.
https://github.com/playXE/alloca-rs
https://github.com/playXE/alloca-rs
That only handles a subset of the possible uses of alloca as well, and I suspect it may be a smaller subset than what will be handled by that unsized-locals feature once that includes VLAs.
Besides, after c2rust translates alloca to std::vec::from_elem, c2rust-analyze will raise the error:
[ERROR @ c2rust-analyze/src/dataflow/type_check.rs:501 @ c2rust_analyze::dataflow::type_check]: TODO: visit Callee::UnknownDef(Direct { ty: fn(i32, usize) -> std::vec::Vec<i32> {std::vec::from_elem::<i32>}, def_id: DefId(5:7923 ~ alloc[3757]::vec::from_elem), substs: [i32], is_foreign: false })
it seems that c2rust-analyze can't identify this function.
I had a look through some of the code involved, to see if I could give this a try. Most of the translation functions return a WithStmts, which seems to be used to add helper statements that are needed for the translation of an expression. I wasn't able to identify any way by which WithStmts can hold variable declaration statements that are meant to "bubble up" to the start of the function block. I think adding something like that would be necessary to fix this issue, but WithStmts is so widely used, I fear it might break something.
I wasn't able to identify any way by which WithStmts can hold variable declaration statements that are meant to "bubble up" to the start of the function block.
In specific cases, alloca may be able to "bubble up", but in general, it cannot because the argument to alloca may be variable and not known at the start of the function. I think that any approach using that, although it may improve the situation, will not be a complete fix.
In specific cases,
allocamay be able to "bubble up", but in general, it cannot because the argument toallocamay be variable and not known at the start of the function. I think that any approach using that, although it may improve the situation, will not be a complete fix.
Only the declaration of the variable needs to be at the start of the function. So let mut freshN: Vec<u8>; or whatever. It can be left uninitialised, or given a default empty value.
Only the declaration of the variable needs to be at the start of the function. So
let mut freshN: Vec<u8>;or whatever. It can be left uninitialised, or given a default empty value.
Ah, thanks, I see what you're going for. In that case there is another complication: not even the number of alloca calls can be known in advance, at least in the general case. It is valid to call alloca in a loop, so it would be difficult to have a separate variable for each alloca.
With a Vec<Vec<u8>>, it might be possible to get something that works. This would just be one single variable. For a proof of concept you could possibly declare it ahead of time at the start of the function with #[allow(unused_variables)], without knowing whether the function contains any alloca. If that approach results in correct code, we could then look at how to avoid emitting that variable in functions that don't use alloca, and/or possibly ways to further improve the generated code.
Good point, I hadn't thought of that. I think that may actually make things a bit easier though, I'll see if I can give it a go soonish.