stdarch
rustc crashes when trying to bench upcoming neon-support in RustFFT with latest stdarch.
I'm working on adding neon support to RustFFT, and wanted to try the vld* and vst* intrinsics added here: https://github.com/rust-lang/stdarch/pull/1224
First results were promising, but now I'm having a hard time running benchmarks because rustc crashes when building the benches. It crashes quite hard, without giving any useful error message. I'm using rust commit d14731c (simply master from yesterday; I have also tried a version from a couple of days ago with the same result), with stdarch updated to commit 931cdfb.
I would like to investigate this and try to at least help solve it, but I have no idea where to start. Any advice?
I'm trying to bench this branch: https://github.com/HEnquist/RustFFT/tree/vldx
I have tried on both a raspberry pi, and on an Oracle Ampere VM, with the same results.
Error:
pi@raspberrypi:~/RustFFT $ cargo bench --features neon neon_
Compiling rustfft v6.0.1 (/home/pi/RustFFT)
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x826e48)[0x7f919b2e48]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7f96f1a788]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x11b83a4)[0x7f923443a4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x16b1810)[0x7f9283d810]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e1f4)[0x7f935fa1f4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e350)[0x7f935fa350]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246f570)[0x7f935fb570]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb5d674)[0x7f91ce9674]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb48ba4)[0x7f91cd4ba4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb4cfe8)[0x7f91cd8fe8]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa24380)[0x7f91bb0380]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa1eb14)[0x7f91baab14]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa95048)[0x7f91c21048]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xaf9320)[0x7f91c85320]
/mount/ssd/rustc/lib/libstd-ba7780e7efc39bcb.so(rust_metadata_std_2d103c436cd22770+0x8c380)[0x7f9108b380]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x77e4)[0x7f90e3d7e4]
/lib/aarch64-linux-gnu/libc.so.6(+0xcfadc)[0x7f90f48adc]
error: could not compile `rustfft`
Caused by:
process didn't exit successfully: `rustc --crate-name rustfft --edition=2018 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi --emit=dep-info,link -C opt-level=3 -C embed-bitcode=no --test --cfg 'feature="avx"' --cfg 'feature="default"' --cfg 'feature="neon"' --cfg 'feature="sse"' -C metadata=541f3979c27e3f0a -C extra-filename=-541f3979c27e3f0a --out-dir /home/pi/RustFFT/target/release/deps -L dependency=/home/pi/RustFFT/target/release/deps --extern num_complex=/home/pi/RustFFT/target/release/deps/libnum_complex-d7ededc7dd339a27.rlib --extern num_integer=/home/pi/RustFFT/target/release/deps/libnum_integer-0edf0ad8b3f42ac1.rlib --extern num_traits=/home/pi/RustFFT/target/release/deps/libnum_traits-7542682cf91f65c6.rlib --extern paste=/home/pi/RustFFT/target/release/deps/libpaste-49b243a423c645cd.so --extern primal_check=/home/pi/RustFFT/target/release/deps/libprimal_check-d5a43e363432ab49.rlib --extern rand=/home/pi/RustFFT/target/release/deps/librand-5cd04db47872812e.rlib --extern strength_reduce=/home/pi/RustFFT/target/release/deps/libstrength_reduce-1c2f1a65415e918e.rlib --extern transpose=/home/pi/RustFFT/target/release/deps/libtranspose-226cfc662b715315.rlib` (signal: 11, SIGSEGV: invalid memory reference)
warning: build failed, waiting for other jobs to finish...
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x826e48)[0x7f7f990e48]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7f84ef8788]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x11b83a4)[0x7f803223a4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x16b1810)[0x7f8081b810]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e1f4)[0x7f815d81f4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e350)[0x7f815d8350]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246f570)[0x7f815d9570]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb5d674)[0x7f7fcc7674]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb48ba4)[0x7f7fcb2ba4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb4cfe8)[0x7f7fcb6fe8]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa24380)[0x7f7fb8e380]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa1eb14)[0x7f7fb88b14]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa95048)[0x7f7fbff048]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xaf9320)[0x7f7fc63320]
/mount/ssd/rustc/lib/libstd-ba7780e7efc39bcb.so(rust_metadata_std_2d103c436cd22770+0x8c380)[0x7f7f069380]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x77e4)[0x7f7ee1b7e4]
/lib/aarch64-linux-gnu/libc.so.6(+0xcfadc)[0x7f7ef26adc]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x826e48)[0x7fafd07e48]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7fb526f788]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x11b83a4)[0x7fb06993a4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x16b1810)[0x7fb0b92810]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e1f4)[0x7fb194f1f4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e350)[0x7fb194f350]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246f570)[0x7fb1950570]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb5d674)[0x7fb003e674]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb48ba4)[0x7fb0029ba4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb4cfe8)[0x7fb002dfe8]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa24380)[0x7faff05380]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa1eb14)[0x7fafeffb14]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa95048)[0x7faff76048]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xaf9320)[0x7faffda320]
/mount/ssd/rustc/lib/libstd-ba7780e7efc39bcb.so(rust_metadata_std_2d103c436cd22770+0x8c380)[0x7faf3e0380]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x77e4)[0x7faf1927e4]
/lib/aarch64-linux-gnu/libc.so.6(+0xcfadc)[0x7faf29dadc]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x826e48)[0x7faa5c3e48]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7fafb2b788]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x11b83a4)[0x7faaf553a4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x16b1810)[0x7fab44e810]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e1f4)[0x7fac20b1f4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246e350)[0x7fac20b350]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0x246f570)[0x7fac20c570]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb5d674)[0x7faa8fa674]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb48ba4)[0x7faa8e5ba4]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xb4cfe8)[0x7faa8e9fe8]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa24380)[0x7faa7c1380]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa1eb14)[0x7faa7bbb14]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xa95048)[0x7faa832048]
/mount/ssd/rustc/lib/librustc_driver-b200c9ef89c7aa6a.so(+0xaf9320)[0x7faa896320]
/mount/ssd/rustc/lib/libstd-ba7780e7efc39bcb.so(rust_metadata_std_2d103c436cd22770+0x8c380)[0x7fa9c9c380]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x77e4)[0x7fa9a4e7e4]
/lib/aarch64-linux-gnu/libc.so.6(+0xcfadc)[0x7fa9b59adc]
error: build failed
Please include rustc --version --verbose.
Sure! It doesn't say that much unfortunately.
pi@raspberrypi:~ $ /mount/ssd/rustc/bin/rustc --version --verbose
rustc 1.57.0-dev
binary: rustc
commit-hash: unknown
commit-date: unknown
host: aarch64-unknown-linux-gnu
release: 1.57.0-dev
LLVM version: 13.0.0
Can you find out which lines of code in the bench caused the crash? This might help to find the root cause.
LLVM version: 13.0.0
This is what I wanted to make sure of. The last "official" LLVM 13.0.0 (Rust was merging the "release candidate" versions to be able to get a head start on testing) was pulled in shortly after you posted this, so it may be a good idea to try today's rustc.
It doesn't seem to matter much what my benches contain, it fails no matter what. I just started building rustc from today, will try it as soon as it's ready (tomorrow probably, takes some time on a Raspberry Pi..)
aarch64-unknown-linux-gnu is a tier 1 target: it should be possible to download the latest nightly via rustup, no? No need to recompile it.
I need a newer stdarch than in the latest nightly, with all the vld* and vst* intrinsics.
The updated LLVM unfortunately made no difference. If I go back to the RustFFT version just before I started using the vld* and vst* intrinsics, it builds and benches fine. I'll try to figure out exactly what change triggers the crash. Unfortunately I'm a bit short on time these days, so it may take a while.
The crash seems to come when I compile a benchmark if my FFTs use this function: https://github.com/HEnquist/RustFFT/blob/vldx/src/neon/neon_vector.rs#L156
It only fails with cargo bench; with cargo test it's all good.
We can replace vld2q_f64 with a fn with equivalent behavior and see if the crash still happens:
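use core::arch::aarch64::float64x2x2_t;
use core::mem::transmute;

// Stand-in for vld2q_f64: plain unaligned read, then de-interleave by hand.
pub unsafe fn vld2q_f64_fake(a: *const f64) -> float64x2x2_t {
    let x: [f64; 4] = core::ptr::read_unaligned(a.cast());
    transmute([x[0], x[2], x[1], x[3]])
}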
Using vld2q_f64_fake instead of vld2q_f64 makes the benches build and run fine!
By the way, vld3q_f64 and vld4q_f64 cause no problems. No fake versions of those are needed to make the benches work.
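For reference, the vld2q_f64, vld3q_f64 and vld4q_f64 intrinsics are all de-interleaving loads: they read two consecutive structures of N f64 fields each and return N two-lane vectors, one per field. A minimal scalar sketch of that layout follows; the name deinterleave is hypothetical and plain arrays stand in for the NEON vector types, so it builds on any target.

```rust
/// Scalar model of a de-interleaving load (hypothetical helper, not part of stdarch).
/// `src` holds two consecutive structures with N fields each, as they sit in memory.
/// The result holds N two-lane "vectors": out[field] = [src[0][field], src[1][field]].
pub fn deinterleave<const N: usize>(src: &[[f64; N]; 2]) -> [[f64; 2]; N] {
    let mut out = [[0.0f64; 2]; N];
    for lane in 0..2 {
        for field in 0..N {
            out[field][lane] = src[lane][field];
        }
    }
    out
}
```

With N = 2 this gives the same element order as vld2q_f64_fake above: the input 1, 2, 3, 4 becomes the vectors [1, 3] and [2, 4].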
That is interesting. I think vld2q_f64 may have special alignment requirements. This requires specific analysis of LLVM's implementation of vld2. Unfortunately I am not good at this part.
Can you show the assembly emitted for each of those intrinsics, as it looks in the final bench binary, @HEnquist? This will likely require a disassembly tool rather than relying on --emit=asm or similar. It also likely requires some surrounding assembly for context, hopefully not everything, just each bench test.
I have looked at this a bit and can reproduce this on a Mac M1.
Rustc crashes in LLVM codegen:
Process 26362 stopped
* thread #7, name = 'LTO bench_rustfft_neon.f5e027c6-cgu.1', stop reason = EXC_BAD_ACCESS (code=1, address=0x400010a125b10)
frame #0: 0x0000000100530534 librustc_driver-69ff7149a4f34321.dylib`llvm::CallInst::Create(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&, llvm::Instruction*) + 320
librustc_driver-69ff7149a4f34321.dylib`llvm::CallInst::Create:
-> 0x100530534 <+320>: ldr x8, [x25, #0x10]
0x100530538 <+324>: ldr x1, [x8]
0x10053053c <+328>: cbz x20, 0x100530560 ; <+364>
0x100530540 <+332>: mov w8, #0x30
Full backtrace:
(lldb) bt
* thread #7, name = 'LTO bench_rustfft_neon.f5e027c6-cgu.1', stop reason = EXC_BAD_ACCESS (code=1, address=0x400010a125b10)
* frame #0: 0x0000000100530534 librustc_driver-69ff7149a4f34321.dylib`llvm::CallInst::Create(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&, llvm::Instruction*) + 320
frame #1: 0x0000000100530294 librustc_driver-69ff7149a4f34321.dylib`llvm::IRBuilderBase::CreateCall(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::Twine const&, llvm::MDNode*) + 80
frame #2: 0x0000000101186b50 librustc_driver-69ff7149a4f34321.dylib`llvm::AArch64TargetLowering::lowerInterleavedLoad(llvm::LoadInst*, llvm::ArrayRef<llvm::ShuffleVectorInst*>, llvm::ArrayRef<unsigned int>, unsigned int) const + 852
frame #3: 0x0000000101538c6c librustc_driver-69ff7149a4f34321.dylib`(anonymous namespace)::InterleavedAccess::runOnFunction(llvm::Function&) + 4868
frame #4: 0x0000000101f42030 librustc_driver-69ff7149a4f34321.dylib`llvm::FPPassManager::runOnFunction(llvm::Function&) + 672
frame #5: 0x0000000101f477c0 librustc_driver-69ff7149a4f34321.dylib`llvm::FPPassManager::runOnModule(llvm::Module&) + 52
frame #6: 0x0000000101f42528 librustc_driver-69ff7149a4f34321.dylib`llvm::legacy::PassManagerImpl::run(llvm::Module&) + 856
frame #7: 0x00000001003d8a40 librustc_driver-69ff7149a4f34321.dylib`LLVMRustWriteOutputFile + 692
frame #8: 0x00000001002e16a4 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_llvm::back::write::write_output_file::h8c4897ade22bc53c + 204
frame #9: 0x000000010034fff0 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_llvm::back::write::codegen::with_codegen::ha82c7a362395cd34 + 116
frame #10: 0x00000001002e4bd4 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_llvm::back::write::codegen::h8d756782e432dc6c + 2524
frame #11: 0x00000001003140e0 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_ssa::back::write::finish_intra_module_work::h079cbdb2f84c889e + 184
frame #12: 0x000000010030f890 librustc_driver-69ff7149a4f34321.dylib`rustc_codegen_ssa::back::write::execute_work_item::hfb8dd85525a92ee7 + 780
frame #13: 0x00000001003bf5f4 librustc_driver-69ff7149a4f34321.dylib`std::sys_common::backtrace::__rust_begin_short_backtrace::h086be9b8ac7cc110 + 176
frame #14: 0x000000010032eea0 librustc_driver-69ff7149a4f34321.dylib`std::panicking::try::hb23e946ef2c82654 + 52
frame #15: 0x000000010039114c librustc_driver-69ff7149a4f34321.dylib`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h7660a4fef4f4ca66 + 128
frame #16: 0x0000000107747fb0 libstd-5be8030cf9a973ad.dylib`_$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h6f4298f91d78694f + 36
frame #17: 0x000000010775b14c libstd-5be8030cf9a973ad.dylib`std::sys::unix::thread::Thread::new::thread_start::h947b820fbfb10caa + 36
frame #18: 0x00000001884a7878 libsystem_pthread.dylib`_pthread_start + 320
Trying to narrow it down:
- It can also be reproduced by trying to build the tests with cargo +stage1 build --features neon --release --tests
- It also happens when trying to build the asmtest.rs example with that switched to neon.
- If the modified asmtest.rs is added to a module in the library sources itself, forcing monomorphization, building the library in release mode fails as well.
With --emit=llvm-ir I can already get a file that reproduces the LLVM codegen crash with llc, but it is too big, and it might still be bad IR that rustc emits. I will look into it further when I have time.
I didn't have any time to continue on this (and I think that I probably know too little about this stuff to be useful anyway). Did anyone else make any progress?
Rust recently upgraded to LLVM 14, can you try this on the latest nightly to see if it is still an issue?
I'll try asap and report back!
Rust recently upgraded to LLVM 14, can you try this on the latest nightly to see if it is still an issue?
I just tried this, and unfortunately the newer LLVM doesn't seem to make any difference.
I think rustc generates correct instructions.
https://developer.arm.com/architectures/instruction-sets/intrinsics/vld2q_f64
use core::arch::aarch64::*;

#[inline(never)]
pub unsafe fn vld2q_f64_real(p: *const f64) -> float64x2x2_t {
    vld2q_f64(p)
}

#[inline(never)]
pub unsafe fn vld2q_f64_fake(a: *const f64) -> float64x2x2_t {
    let x: [float64x1_t; 4] = core::ptr::read_unaligned(a.cast());
    core::mem::transmute([x[0], x[2], x[1], x[3]])
}
example::vld2q_f64_real:
        ld2     { v0.2d, v1.2d }, [x0]
        stp     q0, q1, [x8]
        ret

example::vld2q_f64_fake:
        ldp     d0, d2, [x0]
        ldp     d1, d3, [x0, #16]
        str     d0, [x8]
        str     d2, [x8, #16]
        str     d1, [x8, #8]
        str     d3, [x8, #24]
        ret
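Besides reading the assembly, the equivalence of the two functions can also be checked at run time. The following is only a sketch under some assumptions: the helper names deinterleave2_ref and vld2q_matches_reference are made up for illustration, the NEON path is compiled only on aarch64 (where NEON is assumed to be a baseline feature), and other targets get None back instead of a comparison result.

```rust
/// Portable reference for the element order vld2q_f64 is expected to produce:
/// the two vectors [x0, x2] and [x1, x3], laid out back to back.
pub fn deinterleave2_ref(x: [f64; 4]) -> [f64; 4] {
    [x[0], x[2], x[1], x[3]]
}

/// On aarch64, load `x` with the real intrinsic and compare against the reference.
#[cfg(target_arch = "aarch64")]
pub fn vld2q_matches_reference(x: [f64; 4]) -> Option<bool> {
    use core::arch::aarch64::{float64x2x2_t, vld2q_f64};
    // SAFETY: `x` provides four readable f64 values, and float64x2x2_t has the
    // same size as [f64; 4] (two 16-byte vectors), so the transmute is sound.
    let got: [f64; 4] = unsafe {
        let v: float64x2x2_t = vld2q_f64(x.as_ptr());
        core::mem::transmute(v)
    };
    Some(got == deinterleave2_ref(x))
}

/// On other targets there is no NEON intrinsic to compare against.
#[cfg(not(target_arch = "aarch64"))]
pub fn vld2q_matches_reference(_x: [f64; 4]) -> Option<bool> {
    None
}
```

Calling vld2q_matches_reference([1.0, 2.0, 3.0, 4.0]) on an aarch64 machine should return Some(true) if the intrinsic and the hand-written version agree. That would support the conclusion that the generated instructions are fine and the problem lies in the compiler crash itself.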
It may be related to a recent issue. The latest nightly has upgraded to LLVM 15.0.4.
- https://github.com/rust-lang/rust/issues/102738
I just got back to this after a small break :) Things are working just fine on recent rustc versions.