candle
[Cuda] u32 tensor div by f64 leads to zeroed out tensor
Description
Summary
It looks like I've stumbled upon an obscure bug. I tried reducing it to as small an MRP as possible, but the cause seems to be a combination of several issues, with my code happening to be the perfect storm for them. What I do know is that the backtrace points at the to_device() call at line 92 (reported as main.rs:91 in the backtrace below), and that the program has to run through a few loop iterations before the crash occurs (downstream data corruption, perhaps?).
Environment
- OS: Windows 11 x64
- candle-core: 0.9.1
- GPU: RTX 4060
- CUDA toolkit: 12.4
Backtrace
C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-kernels-0.9.1\src\indexing.cu:134: block: [0,0,0], thread: [0,0,0] Assertion `idx < dst_dim_size` failed.
C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-kernels-0.9.1\src\indexing.cu:134: block: [0,0,0], thread: [1,0,0] Assertion `idx < dst_dim_size` failed.
C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-kernels-0.9.1\src\indexing.cu:134: block: [0,0,0], thread: [2,0,0] Assertion `idx < dst_dim_size` failed.
Error: DriverError(CUDA_ERROR_ASSERT, "device-side assert triggered")
0: std::backtrace_rs::backtrace::win64::trace
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\..\..\backtrace\src\backtrace\win64.rs:85
1: std::backtrace_rs::backtrace::trace_unsynchronized
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\..\..\backtrace\src\backtrace\mod.rs:66
2: std::backtrace::Backtrace::create
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\backtrace.rs:331
3: std::backtrace::Backtrace::capture
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\backtrace.rs:296
4: enum2$<candle_core::error::Error>::bt
at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\error.rs:255
5: candle_core::cuda_backend::error::impl$1::w::closure$0<alloc::vec::Vec<f64,alloc::alloc::Global>,cudarc::driver::result::DriverError>
at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\cuda_backend\error.rs:60
6: enum2$<core::result::Result<alloc::vec::Vec<f64,alloc::alloc::Global>,cudarc::driver::result::DriverError> >::map_err<alloc::vec::Vec<f64,alloc::alloc::Global>,cudarc::driver::result::DriverError,enum2$<candle_core::error::Error>,candle_core::cuda_backend:
at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\result.rs:856
7: candle_core::cuda_backend::error::impl$1::w<alloc::vec::Vec<f64,alloc::alloc::Global>,cudarc::driver::result::DriverError>
at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\cuda_backend\error.rs:60
8: candle_core::cuda_backend::impl$29::to_cpu_storage
at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\cuda_backend\mod.rs:1526
9: candle_core::tensor::Tensor::to_device
at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\tensor.rs:2130
10: to_device_bug::topk_event_indices
at .\src\main.rs:91
11: to_device_bug::SensorVideo::poll
at .\src\main.rs:59
12: to_device_bug::main
at .\src\main.rs:185
13: core::ops::function::FnOnce::call_once<enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > (*)(),tuple$<> >
at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:250
14: std::sys::backtrace::__rust_begin_short_backtrace<enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > (*)(),enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > >
at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\sys\backtrace.rs:152
15: std::rt::lang_start::closure$0<enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > >
at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\rt.rs:199
16: std::rt::lang_start_internal::closure$0
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\rt.rs:168
17: std::panicking::try::do_call
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\panicking.rs:589
18: std::panicking::try
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\panicking.rs:552
19: std::panic::catch_unwind
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\panic.rs:359
20: std::rt::lang_start_internal
at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\rt.rs:164
21: std::rt::lang_start<enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > >
at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\rt.rs:198
22: main
23: invoke_main
at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
24: __scrt_common_main_seh
at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
25: BaseThreadInitThunk
26: RtlUserThreadStart
error: process didn't exit successfully: `target\debug\to_device_bug.exe` (exit code: 1)
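The failing assertion is `idx < dst_dim_size`, and CUDA reports device-side asserts asynchronously, so the to_device() call is probably only where the error surfaces (it forces a device-to-host copy) rather than where the bad index is produced. One way to narrow this down is to copy the index values to the host before the indexing op and check them against the destination dimension. The following is a minimal sketch using only the standard library; the function name `first_out_of_range` is hypothetical and not part of candle, and it assumes the indices have already been read back as `u32` values (for example via `Tensor::to_vec1::<u32>()`).

```rust
/// Returns the position and value of the first index that is not strictly
/// less than `dim_size`, or `None` if every index is in range.
///
/// This mirrors the `idx < dst_dim_size` condition from the assertion
/// message. It is a host-side debugging aid (hypothetical helper), not
/// something candle provides.
fn first_out_of_range(indices: &[u32], dim_size: usize) -> Option<(usize, u32)> {
    indices
        .iter()
        .copied()
        .enumerate()
        // An index is invalid as soon as it reaches or exceeds the dimension size.
        .find(|&(_, idx)| idx as usize >= dim_size)
}
```

Calling this on the index values on each loop iteration, right before they are passed to the indexing op, should show which iteration first produces an out-of-range value. Running the program with the environment variable `CUDA_LAUNCH_BLOCKING=1` makes kernel launches synchronous, which may also move the reported error closer to the kernel that actually triggers the assert.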