
[Cuda] u32 tensor div by f64 leads to zeroed out tensor

Open LunaticWyrm467 opened this issue 3 weeks ago • 9 comments

Description

Summary

Okay, so it looks like I stumbled upon an obscure bug here. I tried reducing it to as small an MRP as possible, but the cause seems to be a combination of issues for which my code happens to be the perfect storm. What I do know is that the backtrace points to the to_device() call at line 92, and that the program has to run for a few loops before the crash occurs (downstream data corruption, perhaps?).

Environment

  • OS: Windows 11 x64
  • candle-core: 0.9.1
  • GPU: RTX 4060
  • CUDA toolkit: 12.4

Backtrace

C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-kernels-0.9.1\src\indexing.cu:134: block: [0,0,0], thread: [0,0,0] Assertion `idx < dst_dim_size` failed.
C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-kernels-0.9.1\src\indexing.cu:134: block: [0,0,0], thread: [1,0,0] Assertion `idx < dst_dim_size` failed.
C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-kernels-0.9.1\src\indexing.cu:134: block: [0,0,0], thread: [2,0,0] Assertion `idx < dst_dim_size` failed.
Error: DriverError(CUDA_ERROR_ASSERT, "device-side assert triggered")
   0: std::backtrace_rs::backtrace::win64::trace
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\..\..\backtrace\src\backtrace\win64.rs:85
   1: std::backtrace_rs::backtrace::trace_unsynchronized
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\..\..\backtrace\src\backtrace\mod.rs:66
   2: std::backtrace::Backtrace::create
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\backtrace.rs:331
   3: std::backtrace::Backtrace::capture
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\backtrace.rs:296
   4: enum2$<candle_core::error::Error>::bt
             at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\error.rs:255
   5: candle_core::cuda_backend::error::impl$1::w::closure$0<alloc::vec::Vec<f64,alloc::alloc::Global>,cudarc::driver::result::DriverError>
             at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\cuda_backend\error.rs:60
   6: enum2$<core::result::Result<alloc::vec::Vec<f64,alloc::alloc::Global>,cudarc::driver::result::DriverError> >::map_err<alloc::vec::Vec<f64,alloc::alloc::Global>,cudarc::driver::result::DriverError,enum2$<candle_core::error::Error>,candle_core::cuda_backend:
             at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\result.rs:856
   7: candle_core::cuda_backend::error::impl$1::w<alloc::vec::Vec<f64,alloc::alloc::Global>,cudarc::driver::result::DriverError>
             at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\cuda_backend\error.rs:60
   8: candle_core::cuda_backend::impl$29::to_cpu_storage
             at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\cuda_backend\mod.rs:1526
   9: candle_core::tensor::Tensor::to_device
             at C:\Users\[USER]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\candle-core-0.9.1\src\tensor.rs:2130
  10: to_device_bug::topk_event_indices
             at .\src\main.rs:91
  11: to_device_bug::SensorVideo::poll
             at .\src\main.rs:59
  12: to_device_bug::main
             at .\src\main.rs:185
  13: core::ops::function::FnOnce::call_once<enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > (*)(),tuple$<> >
             at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:250
  14: std::sys::backtrace::__rust_begin_short_backtrace<enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > (*)(),enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > >
             at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\sys\backtrace.rs:152
  15: std::rt::lang_start::closure$0<enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > >
             at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\rt.rs:199
  16: std::rt::lang_start_internal::closure$0
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\rt.rs:168
  17: std::panicking::try::do_call
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\panicking.rs:589
  18: std::panicking::try
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\panicking.rs:552
  19: std::panic::catch_unwind
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\panic.rs:359
  20: std::rt::lang_start_internal
             at /rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library\std\src\rt.rs:164
  21: std::rt::lang_start<enum2$<core::result::Result<tuple$<>,enum2$<candle_core::error::Error> > > >
             at C:\Users\[USER]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\rt.rs:198
  22: main
  23: invoke_main
             at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
  24: __scrt_common_main_seh
             at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
  25: BaseThreadInitThunk
  26: RtlUserThreadStart

error: process didn't exit successfully: `target\debug\to_device_bug.exe` (exit code: 1)
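
A note for anyone trying to narrow this down: the failing assertion `idx < dst_dim_size` suggests that an out-of-range index reached an indexing kernel. CUDA kernels run asynchronously, so a device-side assert is usually reported at the next synchronizing call (here the device-to-host copy inside `to_device`) rather than at the op that caused it. Below is a minimal host-side sketch for checking index values before they are handed to an indexing op. It assumes the indices can be copied to the CPU as `u32` values, and `first_out_of_range` is a hypothetical helper, not part of candle.

```rust
/// Hypothetical debugging helper (not a candle API).
/// Returns the position and value of the first index that is not
/// strictly less than `dim_size`, mirroring the kernel's
/// `idx < dst_dim_size` assertion.
fn first_out_of_range(indices: &[u32], dim_size: usize) -> Option<(usize, u32)> {
    indices
        .iter()
        .copied()
        .enumerate()
        // Keep the first (position, value) pair that would trip the assert.
        .find(|&(_, idx)| idx as usize >= dim_size)
}
```

Calling `first_out_of_range(&indices, dim_size)` on each loop iteration, just before the indexing op, should show which iteration first produces a bad index and what its value is. That would help tell corrupted index data apart from a genuine kernel bug.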

MRP Download

mrp.zip

LunaticWyrm467, Nov 10 '25 11:11