Valentin Churavy

Results 1413 comments of Valentin Churavy

`dl_iterate_phdr` is a libc call. Can you also post the screenshot you showed me that was 60% in a syscall, since we hit the slow path in libunwind?

To be precise, the remote sleuthing I did pointed towards https://github.com/libunwind/libunwind/blob/3be832395426b72248969247a4a66e3c3623578d/src/dwarf/Gfind_proc_info-lsb.c#L806-L808. The profile had ~60% of its time spent in the syscall `sigprocmask`, and the function that it was called from was...

> This profile? Had no dl_iterate_phdr... but yes, sigprocmask.

Yeah, I think the frames were inlined.

Yeah, static shared memory should be fine; dynamic shared memory is problematic since its size is a launch parameter.

Yeah, this sounds like an LLVM codegen bug where we synthesize an instruction that is not valid in this context. Can you post the minimized MWE?

Ok I finally managed to reproduce... This didn't occur on the A100 I was trying earlier.

```
julia> CUDA.run_compute_sanitizer()
Re-starting your active Julia session...
========= COMPUTE-SANITIZER
julia> include("enz.jl")
Kernel worked
signal (11): Segmentation fault
in expression starting at /home/vchuravy/enz.jl:46
Allocations: 70190705 (Pool: 70129654; Big: 61051); GC:...
```

Ah, without `-g2`:
```
========= Invalid Address Space
========= at 0x678 in /home/vchuravy/.julia/packages/LLVM/WjSQG/src/interop/base.jl:40:julia_grad_kernel__4354(CuDeviceArray, CuDeviceArray)
========= by thread (96,0,0) in block (0,0,0)
```

Today I learned about:
```
CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1
```
Which gives a device core dump:
```
(cuda-gdb) target cudacore core_1665152103_cyclops_165521.nvcudmp
Opening GPU coredump: core_1665152103_cyclops_165521.nvcudmp
CUDA Exception: Warp Invalid Address Space
The exception...
```