ASM wrappers produce code that violates borrow rules
With the following ARM assembly function in test.c (found in the ARM CMSIS standard library as used in RIOT-OS; disclaimer: I have no clue what this actually does, it looks like an inline wrapper around some CPU instruction), the resulting Rust code is not buildable:
#include <stdint.h>
uint64_t __SMLALDX (uint32_t op1, uint32_t op2, uint64_t acc)
{
union llreg_u{
uint32_t w32[2];
uint64_t w64;
} llr;
llr.w64 = acc;
asm volatile ("smlaldx %0, %1, %2, %3" : "=r" (llr.w32[1]), "=r" (llr.w32[0]): "r" (op1), "r" (op2) , "0" (llr.w32[1]), "1" (llr.w32[0]) );
return llr.w64;
}
After manually fixing #306 by s/asm/llvm_asm/ on c2rust-lib.rs and test.rs, the error can be shown with:
$ c2rust transpile compile_commands.json --emit-build-file
Transpiling test.c
$ cargo +nightly build
Compiling c2rust_out v0.0.0 (/tmp/asm)
warning: unused `#[macro_use]` import
--> c2rust-lib.rs:13:1
|
13 | #[macro_use]
| ^^^^^^^^^^^^
|
= note: `#[warn(unused_imports)]` on by default
error[E0499]: cannot borrow `llr.w32[_]` as mutable more than once at a time
--> test.rs:19:18
|
17 | let fresh0 = &mut llr.w32[1 as libc::c_int as usize];
| --------------------------------------- first mutable borrow occurs here
18 | let fresh1;
19 | let fresh2 = &mut llr.w32[0 as libc::c_int as usize];
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ second mutable borrow occurs here
...
24 | "r" (op2), "0" (c2rust_asm_casts::AsmCast::cast_in(fresh0, fresh4)),
| ------ first borrow later used here
error[E0503]: cannot use `llr.w32[_]` because it was mutably borrowed
--> test.rs:21:18
|
17 | let fresh0 = &mut llr.w32[1 as libc::c_int as usize];
| --------------------------------------- borrow of `llr.w32[_]` occurs here
...
21 | let fresh4 = llr.w32[1 as libc::c_int as usize];
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ use of borrowed `llr.w32[_]`
...
24 | "r" (op2), "0" (c2rust_asm_casts::AsmCast::cast_in(fresh0, fresh4)),
| ------ borrow later used here
error[E0503]: cannot use `llr.w32[_]` because it was mutably borrowed
--> test.rs:22:18
|
17 | let fresh0 = &mut llr.w32[1 as libc::c_int as usize];
| --------------------------------------- borrow of `llr.w32[_]` occurs here
...
22 | let fresh5 = llr.w32[0 as libc::c_int as usize];
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ use of borrowed `llr.w32[_]`
23 | llvm_asm!("smlaldx $0, $1, $2, $3" : "=r" (fresh1), "=r" (fresh3) : "r" (op1),
24 | "r" (op2), "0" (c2rust_asm_casts::AsmCast::cast_in(fresh0, fresh4)),
| ------ borrow later used here
error: aborting due to 3 previous errors; 1 warning emitted
Some errors have detailed explanations: E0499, E0503.
For more information about an error, try `rustc --explain E0499`.
error: could not compile `c2rust_out`
To learn more, run the command again with --verbose.
I've deliberately not passed the --target thumbv7em-none-eabihf in to cargo as that'd necessitate several other workarounds to get things running, among them #297 and #307 as well as adding panic stuff.
To explore possibilities, I've altered the fresh variables to be *mut T rather than &mut T, and only reborrow for any use in the cast_in / cast_out functions. That eventually (ie. with all the no_std workarounds and LTO) the same four instructions as plain compiling clang (cf. godbolt output).
Might it be suitable to generally go for raw pointers rather than references when building code for ASM in the first place? While C2Rust might manage to recognize some patterns where references can work, I figure (from my very limited knowledge of assembly) it's hard to tell in general when borrow semantics can be applied and when not.
I've put together a demo workaround at https://github.com/chrysn-pull-requests/c2rust/commit/a5d7880bf486b3d57f96cc5da7b71f134bb5290f -- not even PR'ing it because it's clearly not merge-ready by far.
For one, it's going for Rust code in strings wherever the AST builder lets it. Is that OK? Should this rather be done by using the builder to exhaustion? (And is this done for tangible benefits, or more a matter of style?)
Then, there's the question of whether it's really necessary to go with raw pointers. In a slightly rewritten example C file (see below under the expander), where the variables accessed in the assembly reside in independent locals, it is sufficient to move the resulting Rust code around a bit -- once the let fresh[45] lines are above the borrows, all is fine as the underlying types are Copy anyway. The fact that this is insufficient in the original example (which is, after all, what's out there in widespread C code) helps a bit to affirm my suspicion that raw pointers might be the way to go here.
Finally, the question remains whether by also changing the c2rust_asm_casts interface, anything can be simplified -- and whether it's worth it.
test2.c
#include <stdint.h>
uint64_t __SMLALDX (uint32_t op1, uint32_t op2, uint64_t acc)
{
union llreg_u{
uint32_t w32[2];
uint64_t w64;
} llr;
llr.w64 = acc;
uint32_t l0 = llr.w32[0];
uint32_t l1 = llr.w32[1];
asm volatile ("smlaldx %0, %1, %2, %3" : "=r" (l1), "=r" (l0): "r" (op1), "r" (op2) , "0" (l1), "1" (l0) );
llr.w32[0] = l0;
llr.w32[1] = l1;
return llr.w64;
}
Finally, the question remains whether by also changing the c2rust_asm_casts interface, anything can be simplified -- and whether it's worth it.
c2rust-asm-casts already operates on raw pointers and not references, so I'm not sure what could be changed there.
Other than exploring c2rust-asm-casts further, does the proposed approach sound viable?
If so, I can clean it up into a PR -- it's been working (albeit without hard testing) well for a while, and it doesn't seem like there are other solutions coming up.
(Going from the recently deprecated llvm_asm to asm might make this obsolete, but from afar that looks like a lot of work).