futhark-pycffi
segfault using opencl
OS: NixOS unstable
Machine: M2 Max
OpenCL driver: Asahi graphics driver + rusticl
Consider this Futhark module, saved as segfault.fut:
-- ==
-- random input { [6400][3200]u8 }
entry main [n] [m]
           (luminance: [n][m]u8): [m][n]u8
  = map (\i -> map (\j -> luminance[n-1-j, i]) (iota n)) (iota m)
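For context, the entry point is just a 90-degree clockwise rotation of its input. A minimal NumPy sketch of the same transform (the helper name is illustrative):

import numpy as np

def rot90_cw(luminance: np.ndarray) -> np.ndarray:
    # out[i, j] = luminance[n - 1 - j, i], as in the entry point above
    return luminance[::-1].T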
The module benches without issue via futhark bench --backend=opencl segfault.fut. However, when I create a Python interface from it and run it with random inputs, it often segfaults; the multicore backend, at least, has no such problem. I build the Python interface with:
futhark opencl --library -o segfault segfault.fut
build_futhark_ffi segfault
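For the multicore comparison mentioned above, the analogous build would presumably be:

futhark multicore --library -o segfault segfault.fut
build_futhark_ffi segfault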
The test script, segfault.py:
from futhark_ffi import Futhark
import numpy as np

import _segfault

_ffi = Futhark(_segfault)

if __name__ == "__main__":
    luminance = np.random.randint(0, 256, (6400, 3200))
    pc = _ffi.from_futhark(_ffi.main(luminance))
    print(pc.shape)
Running it results in a segfault:
python segfault.py
Segmentation fault (core dumped)
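One variable worth isolating: np.random.randint returns int64 by default, while the entry point declares u8, so the wrapper presumably converts the array on the way in. A sketch of the one-line change that passes the declared dtype explicitly, to check whether that conversion path is involved:

import numpy as np

# same shape and range as before, but with the dtype the entry point declares
luminance = np.random.randint(0, 256, (6400, 3200), dtype=np.uint8)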
I suspect this is a futhark-pycffi issue, because the segfault does not present itself under the bench utility. Valgrind places the crash inside the rusticl OpenCL code:
valgrind python segfault.py
==118477== Memcheck, a memory error detector
==118477== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==118477== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
==118477== Command: python segfault.py
==118477==
==118477== Thread 3 rusticl queue t:
==118477== Invalid read of size 8
==118477== at 0x48925C8: __GI_memcpy (in /nix/store/jp9c2qj2dmii4c1sqrpmr2qp592nvsli-valgrind-3.24.0/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==118477== by 0x219BEA5B: u_default_buffer_subdata (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x21750E7F: mesa_rust::pipe::context::PipeContext::buffer_subdata (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216CECFB: rusticl::core::memory::Buffer::write (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216B48F7: rusticl::api::memory::enqueue_write_buffer::{{closure}} (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x217328CF: core::ops::function::FnOnce::call_once{{vtable.shim}} (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216F3DB7: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216FDBB7: rusticl::core::event::Event::call::{{closure}} (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216ECB43: core::option::Option<T>::map_or (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216FD91F: rusticl::core::event::Event::call (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x21704B23: rusticl::core::queue::Queue::new::{{closure}} (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216A15AF: std::sys::backtrace::__rust_begin_short_backtrace (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== Address 0x400b8040 is not stack'd, malloc'd or (recently) free'd
==118477==
==118477==
==118477== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==118477== Access not within mapped region at address 0x400B8040
==118477== at 0x48925C8: __GI_memcpy (in /nix/store/jp9c2qj2dmii4c1sqrpmr2qp592nvsli-valgrind-3.24.0/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==118477== by 0x219BEA5B: u_default_buffer_subdata (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x21750E7F: mesa_rust::pipe::context::PipeContext::buffer_subdata (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216CECFB: rusticl::core::memory::Buffer::write (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216B48F7: rusticl::api::memory::enqueue_write_buffer::{{closure}} (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x217328CF: core::ops::function::FnOnce::call_once{{vtable.shim}} (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216F3DB7: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216FDBB7: rusticl::core::event::Event::call::{{closure}} (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216ECB43: core::option::Option<T>::map_or (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216FD91F: rusticl::core::event::Event::call (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x21704B23: rusticl::core::queue::Queue::new::{{closure}} (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== by 0x216A15AF: std::sys::backtrace::__rust_begin_short_backtrace (in /nix/store/j67nja68wk0932kpcc2gir95zgvn4gix-mesa-25.0.0-asahi-opencl/lib/libRusticlOpenCL.so.1.0.0)
==118477== If you believe this happened as a result of a stack
==118477== overflow in your program's main thread (unlikely but
==118477== possible), you can try to increase the size of the
==118477== main thread stack using the --main-stacksize= flag.
==118477== The main thread stack size used in this run was 8388608.
==118477==
==118477== HEAP SUMMARY:
==118477== in use at exit: 200,911,965 bytes in 142,087 blocks
==118477== total heap usage: 425,014 allocs, 282,927 frees, 322,863,975 bytes allocated
==118477==
==118477== LEAK SUMMARY:
==118477== definitely lost: 170,726 bytes in 43 blocks
==118477== indirectly lost: 0 bytes in 0 blocks
==118477== possibly lost: 30,644,137 bytes in 113,615 blocks
==118477== still reachable: 170,097,102 bytes in 28,429 blocks
==118477== suppressed: 0 bytes in 0 blocks
==118477== Rerun with --leak-check=full to see details of leaked memory
==118477==
==118477== For lists of detected and suppressed errors, rerun with: -s
==118477== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)