Assertion error in function pointer fmt::Debug implementation - LLVM ERROR: Cannot select: t2: i16 = addrspacecast[1 -> 0]
Raised from discussion on #137.
LLVM ERROR: Cannot select: t2: i16 = addrspacecast[1 -> 0] undef:i16
t1: i16 = undef
In function: _ZN4core3ptr87_$LT$impl$u20$core..fmt..Debug$u20$for$u20$unsafe$u20$fn$LP$A$RP$$u20$.$GT$$u20$Ret$GT$3fmt17h0fdf8ca7140def9bE
The bug is exposed by the std::fmt::Debug implementation for functions and function pointers. Rust emits a pointer dereference with a target pointer type of addrspace(0)* i16, using LLVM routines that will implicitly insert a bitcast (iN -> iM bits) or an address space cast if necessary. In the case of function pointers, Rust should codegen a addrspace(1)* i16 pointer, so that LLVM doesn't have to map (addrspacecast) PROGMEM pointers to RAM pointers an impossible task because there is no memory mapping and the address spaces do not overlap.
Here is the full LLVM IR for libcore from the #137 branch libcore.ll.
Here is the IR of the failing function
; <&T as core::fmt::Debug>::fmt
; Function Attrs: uwtable
define internal zeroext i1 @"_ZN42_$LT$$RF$T$u20$as$u20$core..fmt..Debug$GT$3fmt17hbd9cadb576049f93E"({ i8*, i8* } ({}*) addrspace(1)*** noalias nocapture readonly align 1 dereferenceable(2) %self, %"fmt::Formatter"* align 1 dereferenceable(27) %f) unnamed_addr addrspace(1) #6 {
start:
%buf.i.i.i.i = alloca [128 x i8], align 1
%0 = bitcast { i8*, i8* } ({}*) addrspace(1)*** %self to {} addrspace(1)***
%1 = load {} addrspace(1)**, {} addrspace(1)*** %0, align 1, !nonnull !456
%2 = load {} addrspace(1)*, {} addrspace(1)** %1, align 1, !alias.scope !5154, !nonnull !456
%3 = addrspacecast {} addrspace(1)* %2 to {}*
%4 = ptrtoint {}* %3 to i16
%5 = getelementptr inbounds %"fmt::Formatter", %"fmt::Formatter"* %f, i16 0, i32 7, i32 0
%6 = load i8, i8* %5, align 1, !range !2, !noalias !5157
%7 = getelementptr inbounds %"fmt::Formatter", %"fmt::Formatter"* %f, i16 0, i32 7, i32 1
%8 = load i16, i16* %7, align 1, !noalias !5157
%9 = bitcast %"fmt::Formatter"* %f to i32*
%10 = load i32, i32* %9, align 1, !noalias !5157
%11 = and i32 %10, 4
%12 = icmp eq i32 %11, 0
br i1 %12, label %bb6.i.i, label %bb2.i.i
bb2.i.i: ; preds = %start
%13 = or i32 %10, 8
store i32 %13, i32* %9, align 1, !noalias !5157
%14 = icmp eq i8 %6, 0
br i1 %14, label %bb3.i.i, label %bb6.i.i
bb3.i.i: ; preds = %bb2.i.i
store i16 6, i16* %7, align 1, !noalias !5157
store i8 1, i8* %5, align 1, !noalias !5157
br label %bb6.i.i
bb6.i.i: ; preds = %bb3.i.i, %bb2.i.i, %start
%15 = phi i32 [ %10, %start ], [ %13, %bb3.i.i ], [ %13, %bb2.i.i ]
%16 = or i32 %15, 4
store i32 %16, i32* %9, align 1, !noalias !5157
%17 = getelementptr inbounds [128 x i8], [128 x i8]* %buf.i.i.i.i, i16 0, i16 0
call addrspace(1) void @llvm.lifetime.start.p0i8(i64 128, i8* nonnull %17), !noalias !5160
%18 = getelementptr inbounds [128 x i8], [128 x i8]* %buf.i.i.i.i, i16 0, i16 128
br label %bb15.i.i.i.i
bb15.i.i.i.i: ; preds = %bb15.i.i.i.i, %bb6.i.i
%iter.sroa.4.0.i.i.i.i = phi i8* [ %18, %bb6.i.i ], [ %19, %bb15.i.i.i.i ]
%x.0.i.i.i.i = phi i16 [ %4, %bb6.i.i ], [ %20, %bb15.i.i.i.i ]
%curr.0.i.i.i.i = phi i16 [ 128, %bb6.i.i ], [ %26, %bb15.i.i.i.i ]
%19 = getelementptr inbounds i8, i8* %iter.sroa.4.0.i.i.i.i, i16 -1
%20 = lshr i16 %x.0.i.i.i.i, 4
%21 = trunc i16 %x.0.i.i.i.i to i8
%22 = and i8 %21, 15
%23 = icmp ult i8 %22, 10
%24 = or i8 %22, 48
%25 = add nuw nsw i8 %22, 87
%_0.0.i15.i.i.i.i = select i1 %23, i8 %24, i8 %25
store i8 %_0.0.i15.i.i.i.i, i8* %19, align 1, !noalias !5160
%26 = add nsw i16 %curr.0.i.i.i.i, -1
%27 = icmp eq i16 %20, 0
br i1 %27, label %bb43.i.i.i.i, label %bb15.i.i.i.i
bb43.i.i.i.i: ; preds = %bb15.i.i.i.i
%28 = icmp ugt i16 %26, 128
br i1 %28, label %bb1.i.i.i.i.i.i.i, label %"_ZN4core3ptr87_$LT$impl$u20$core..fmt..Debug$u20$for$u20$unsafe$u20$fn$LP$A$RP$$u20$.$GT$$u20$Ret$GT$3fmt17ha5a3565824dd93e2E.exit"
bb1.i.i.i.i.i.i.i: ; preds = %bb43.i.i.i.i
; call core::slice::slice_index_order_fail
tail call addrspace(1) void @_ZN4core5slice22slice_index_order_fail17h1a521a1329c8fcf9E(i16 %26, i16 128) #17, !noalias !5160
unreachable
"_ZN4core3ptr87_$LT$impl$u20$core..fmt..Debug$u20$for$u20$unsafe$u20$fn$LP$A$RP$$u20$.$GT$$u20$Ret$GT$3fmt17ha5a3565824dd93e2E.exit": ; preds = %bb43.i.i.i.i
%29 = getelementptr inbounds [128 x i8], [128 x i8]* %buf.i.i.i.i, i16 0, i16 %26
%30 = sub i16 129, %curr.0.i.i.i.i
%_3.sroa.0.0._3.sroa.0.0..cast.i.i.i.i.i = bitcast i8* %29 to [0 x i8]*
; call core::fmt::Formatter::pad_integral
%31 = call zeroext addrspace(1) i1 @_ZN4core3fmt9Formatter12pad_integral17h8cbb788acd2ae784E(%"fmt::Formatter"* nonnull align 1 dereferenceable(27) %f, i1 zeroext true, [0 x i8]* noalias nonnull readonly align 1 bitcast (<{ [2 x i8] }>* @anon.1eadd1440c2e72f10a9c27e27f9e4574.191 to [0 x i8]*), i16 2, [0 x i8]* noalias nonnull readonly align 1 %_3.sroa.0.0._3.sroa.0.0..cast.i.i.i.i.i, i16 %30), !noalias !5160
call addrspace(1) void @llvm.lifetime.end.p0i8(i64 128, i8* nonnull %17), !noalias !5160
store i8 %6, i8* %5, align 1, !noalias !5157
store i16 %8, i16* %7, align 1, !noalias !5157
store i32 %10, i32* %9, align 1, !noalias !5157
ret i1 %31
}
It seems like LLVM/Rust is choking because when debug printing function pointers.
This is the shoddy IR
%3 = addrspacecast {} addrspace(1)* %2 to {}*
%4 = ptrtoint {}* %3 to i16
In order to convert an addrspace(1) pointer to an integer for debug printing, it first attempts to convert it to a pointer in the default address space zero, and then convert that pointer to an integer for printing.
Think of it like this PROGMEM function pointer -> RAM pointer -> integer representation.
On AVR, you cannot convert a PROGMEM function pointer into a RAM pointer (the physical memory spaces are disjoint). Also, it probably makes more sense to print the PROGMEM function pointer when debug printing functions.
We should just go PROGMEM function pointer -> integer representation. The address space cast is unnecessary, and in fact, harmful.
I am not sure if it is possible to implement the addrspacecast instruction in the AVR backend.
We need to modify Rust so that it does not attempt this address space conversion and instead use ptrtoint on the addrspace(1) pointer.
https://github.com/avr-rust/rust/pull/137#issuecomment-500090606
The underlying issue is that
*const Talways hasaddrspace(0). I think explicit casts (ptr as *const T) should preserve the address space of their input here.@ecstatic-morse
Have we yet figured out which function pointer is being printed?
Not sure, but I've managed to work around it by replacing the implementation of the function pointer Debug::fmt implementations with Ok(()).
As of 20 seconds ago, I successfully compiled the blink program to an ELF under release mode with this branch (current tip 8b7240e60bc9609e1efb0552d847dd2fa6af17d9).
We do have a history of adding hacks like that (especially to the formatting infrastructure) to get things working again. I'd be cool with adding it, changed to hard-code the string "disabled due to #143" so it's not just completely opaque ;-)
Good point, I've updated the commit and pushed 8b7240e60bc9609e1efb0552d847dd2fa6af17d9.
The current patch is this
diff --git a/src/libcore/ptr/mod.rs b/src/libcore/ptr/mod.rs
index a04bb11311..caef99ca47 100644
--- a/src/libcore/ptr/mod.rs
+++ b/src/libcore/ptr/mod.rs
@@ -2631,14 +2631,14 @@ macro_rules! fnptr_impls_safety_abi {
#[stable(feature = "fnptr_impls", since = "1.4.0")]
impl<Ret, $($Arg),*> fmt::Pointer for $FnTy {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
- "disabled due to avr-rust/rust#143".fmt(f)
+ fmt::Pointer::fmt(&(*self as usize as *const ()), f)
}
}
#[stable(feature = "fnptr_impls", since = "1.4.0")]
impl<Ret, $($Arg),*> fmt::Debug for $FnTy {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
- "disabled due to avr-rust/rust#143".fmt(f)
+ fmt::Pointer::fmt(&(*self as usize as *const ()), f)
}
}
}
Preserving @shepmaster's comment from #137.
From an end goal standpoint, will the numeric values of the addresses ever overlap? That is, will we ever print 0x1000/space0 and 0x1000/space1 as the same value? If so, that would rather suck from a debugging persepective.
This occurs outside of Debug formatting. For example, the implementations of async/await executors do function pointer conversions. This could be really annoying.
Tiny Rust example:
fn alpha(i: i32) -> i32 { i + 1 }
static FOO: fn(i32) -> i32 = alpha;
The arguments to ConstantExpr::getBitCast:
(lldb) p C->dump()
; Function Attrs: minsize optsize uwtable
define internal i32 @_ZN5blink4main5alpha17he7cdc130812da1f7E(i32) unnamed_addr addrspace(1) #4 !dbg !256 {
%2 = alloca i32, align 1
store i32 %0, i32* %2, align 1
call addrspace(1) void @llvm.dbg.declare(metadata i32* %2, metadata !262, metadata !DIExpression()), !dbg !263
%3 = load i32, i32* %2, align 1, !dbg !264
%4 = add i32 %3, 1, !dbg !264
ret i32 %4, !dbg !265
}
(lldb) p DstTy->dump()
i8*
This fails at
if (SrcPtrTy->getAddressSpace() != DstPtrTy->getAddressSpace())
Ok, I pinged some smart compiler devs:
Nikita Popov
From a quick search, we have https://github.com/rust-lang/rust/blob/49d139c64b69ec5289f9f81db885ecfc2c7a8366/src/librustc_codegen_llvm/abi.rs#L365 and https://github.com/rust-lang/rust/blob/49d139c64b69ec5289f9f81db885ecfc2c7a8366/src/librustc_codegen_llvm/type_.rs#L308.
Possibly the code in https://github.com/rust-lang/rust/blob/da9ebc828c982d2ed49396886da85011e1b0a6c0/src/librustc_codegen_ssa/mir/rvalue.rs#L167 is relevant, it deals a lot with casting around function types.
eddyb
the i8* is coming from us lowering miri allocations to LLVM
it wouldn't be that hard to change the addrspace when lowering a pointer to a function, as opposed to a data pointer (cc oli) https://github.com/rust-lang/rust/blob/c4797fa4f4a696b183b3aa1517ee22c78d0f5d7a/src/librustc_codegen_llvm/common.rs#L322
so there's a cast here, where type_i8p shouldn't always be the same addrspace: https://github.com/rust-lang/rust/blob/c4797fa4f4a696b183b3aa1517ee22c78d0f5d7a/src/librustc_codegen_llvm/common.rs#L331
and then here, either check self.tcx.alloc_map.lock().get(alloc_id) for GlobalAlloc::Function or avoid the cast in scalar_to_backend by not passing in a type to scalar_to_backend (and so move the cast at the end of scalar_to_backend to its other 2 callers) https://github.com/rust-lang/rust/blob/c4797fa4f4a696b183b3aa1517ee22c78d0f5d7a/src/librustc_codegen_llvm/consts.rs#L50
oli
yea, in https://github.com/rust-lang/rust/blob/c4797fa4f4a696b183b3aa1517ee22c78d0f5d7a/src/librustc_codegen_llvm/common.rs#L322 we could just return early instead of falling through to the cast
eddyb walked me through something that seems to at least compile; I've hit other issues with async/await so my program as a whole doesn't link, but might be worth someone else trying this patch out and seeing if the weird cast we added for Debug can be avoided with it:
commit f7d396073a3adb5ab8ddcd203cd31abeb8ee7d21
Author: Jake Goulding <[email protected]>
Date: Thu Jun 13 15:42:36 2019 -0400
Special-case this function pointer thing
diff --git a/src/librustc_codegen_llvm/common.rs b/src/librustc_codegen_llvm/common.rs
index 0b23aac522..5cab9f47db 100644
--- a/src/librustc_codegen_llvm/common.rs
+++ b/src/librustc_codegen_llvm/common.rs
@@ -308,6 +308,7 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
}
},
Scalar::Ptr(ptr) => {
+ let mut do_something_special = false;
let alloc_kind = self.tcx.alloc_map.lock().get(ptr.alloc_id);
let base_addr = match alloc_kind {
Some(GlobalAlloc::Memory(alloc)) => {
@@ -319,6 +320,7 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
}
}
Some(GlobalAlloc::Function(fn_instance)) => {
+ do_something_special = true;
self.get_fn(fn_instance)
}
Some(GlobalAlloc::Static(def_id)) => {
@@ -327,15 +329,19 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
}
None => bug!("missing allocation {:?}", ptr.alloc_id),
};
- let llval = unsafe { llvm::LLVMConstInBoundsGEP(
- self.const_bitcast(base_addr, self.type_i8p()),
- &self.const_usize(ptr.offset.bytes()),
- 1,
- ) };
- if layout.value != layout::Pointer {
- unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
+ if do_something_special {
+ return base_addr; // move to match?
} else {
- self.const_bitcast(llval, llty)
+ let llval = unsafe { llvm::LLVMConstInBoundsGEP(
+ self.const_bitcast(base_addr, self.type_i8p()),
+ &self.const_usize(ptr.offset.bytes()),
+ 1,
+ ) };
+ if layout.value != layout::Pointer {
+ unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
+ } else {
+ self.const_bitcast(llval, llty)
+ }
}
}
}
An updated version of the previous patch
diff --git a/src/librustc_codegen_llvm/common.rs b/src/librustc_codegen_llvm/common.rs
index a5cda5949ee..fdbf222549e 100644
--- a/src/librustc_codegen_llvm/common.rs
+++ b/src/librustc_codegen_llvm/common.rs
@@ -244,6 +244,8 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
}
}
Scalar::Ptr(ptr) => {
+ let mut do_something_special = false;
+
let base_addr = match self.tcx.global_alloc(ptr.alloc_id) {
GlobalAlloc::Memory(alloc) => {
let init = const_alloc_to_llvm(self, alloc);
@@ -256,24 +258,31 @@ impl ConstMethods<'tcx> for CodegenCx<'ll, 'tcx> {
}
value
}
- GlobalAlloc::Function(fn_instance) => self.get_fn_addr(fn_instance),
+ GlobalAlloc::Function(fn_instance) => {
+ do_something_special = true;
+ self.get_fn_addr(fn_instance)
+ },
GlobalAlloc::Static(def_id) => {
assert!(self.tcx.is_static(def_id));
assert!(!self.tcx.is_thread_local_static(def_id));
self.get_static(def_id)
}
};
- let llval = unsafe {
- llvm::LLVMConstInBoundsGEP(
- self.const_bitcast(base_addr, self.type_i8p()),
- &self.const_usize(ptr.offset.bytes()),
- 1,
- )
- };
- if layout.value != Pointer {
- unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
+ if do_something_special {
+ return base_addr; // move to match?
} else {
- self.const_bitcast(llval, llty)
+ let llval = unsafe {
+ llvm::LLVMConstInBoundsGEP(
+ self.const_bitcast(base_addr, self.type_i8p()),
+ &self.const_usize(ptr.offset.bytes()),
+ 1,
+ )
+ };
+ if layout.value != Pointer {
+ unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
+ } else {
+ self.const_bitcast(llval, llty)
+ }
}
}
}
Applying this does seem to prevent some of the newly-added function pointer casts I see, but I have other failures down the line.
Thanks for re-pining this @shepmaster, I lost track of this. I've been working on address space cast fixes in the same area here: https://github.com/rust-lang/rust/pull/73270 I will try and merge the patch into the PR.
I actually expect that your changes, being more thorough, may have the possibility of making this diff obsolete (fingers crossed)
That makes sense - the "other failures down the line" - were they related to function pointer casts? I'm curious because it would be an interesting data point if that patch fixed all/majority of them.
I'm still wrestling with rustc (in particular, I believe a PlaceRef containing an LLVM Value with the wrong address space is constructed early on for a function pointer (specifically the one in https://github.com/avr-rust/rust/issues/169) but the LLVM assertion error only hits later on when the bitcast is done during operand codegen.
other failures down the line" - were they related to function pointer casts?
I believe so, but it was a year ago, so my memory is fuzzy. IIRC, I was trying to get some async executors to work on AVR. Those do make use of function pointers to create a manual vtable.
FWIW, if I combine
- the diff in this issue
- your address space PR
- most of the LLVM cherry picks
Then my error when compiling libcore changes:
Assertion failed: (ExtractValueInst::getIndexedType(Agg->getType(), Idxs) == Val->getType() && "Inserted value must match indexed type!"), function init, file /Users/shep/Projects/rust/src/llvm-project/llvm/lib/IR/Instructions.cpp, line 2110.