Android: clpeak segmentation fault
Hi, it seems that the latest clvk causes a segmentation fault on Android when running clpeak, at float8.
GDB
Starting program: /data/data/com.termux/files/usr-arm/bin/clpeak
[New LWP 10448]
[New LWP 10449]
[New LWP 10450]
[New LWP 10451]
[New LWP 10452]
[New LWP 10453]
[New LWP 10454]
[New LWP 10455]
[New LWP 10456]
[New LWP 10457]
[New LWP 10458]
[LWP 10458 exited]
[New LWP 10459]
Thread 1 "clpeak" received signal SIGSEGV, Segmentation fault.
0xf25d5198 in cmpbep_loop_get_max_iter ()
from /vendor/lib/egl/libGLES_mali.so
#0 0xf25d5198 in cmpbep_loop_get_max_iter ()
from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#1 0xf25b3738 in loop_unroll () from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#2 0xf25c292e in cmpbep_run_pass () from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#3 0xf25c2a14 in cmpbep_run_pass_sequence ()
from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#4 0xf251ff7a in cmpbe_compile_gles_shader ()
from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#5 0xf25379a6 in cmpbe_v2_compile_multiple_shaders ()
from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#6 0xf24a1f54 in gfx::compiler::compile_shaders(gfx::shader_set const&, gfx::shader_set&, hal::shader_language, gfx::shader_state const&, gfx::pipeline_cache*, gfx::mem_allocator&) () from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#7 0xf232202e in vulkan::compute_pipeline::init(gfx::device*, VkComputePipelineCreateInfo const&, gfx::host_mem_allocator const&, gfx::host_mem_allocator&) () from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#8 0xf2321db6 in vkCreateComputePipelines ()
from /vendor/lib/egl/libGLES_mali.so
No symbol table info available.
#9 0xf3f20df2 in vulkan::api::(anonymous namespace)::CreateComputePipelines(VkDevice_T*, unsigned long long, unsigned int, VkComputePipelineCreateInfo const*, VkAllocationCallbacks const*, unsigned long long*) ()
from /system/lib/libvulkan.so
No symbol table info available.
#10 0xf4788e4e in ?? ()
from /data/data/com.termux/files/usr-arm/lib/clvk/libOpenCL.so
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Log with CLVK_LOG=4: https://gist.github.com/truboxl/900af658448056aa083ae2a4cbecd47a
In the meantime, I will try to bisect the commits between the latest (cf96214811598681ad9080f92b777d0feec2b5ff) and the last known working one, which is also the version I last packaged (bc9398b486).
Hi, thanks for the report. This looks like a crash in Arm's Mali driver. A bisection would be very useful.
cf96214811598681ad9080f92b777d0feec2b5ff slow and segfault
bdb875c4cf42d181152de1740670c99a0f7a95bc fast and no segfault
75155fca9574e68ef97728e066e07dd6188040a4 fast and no segfault
1d7afd407bdb9ce5e6e8db2b296db62274e267c7 fast and no segfault
4a7c9563931a33803eab12a38868947a63c86e15 fast and no segfault
f5e50bf7690218d74bd4cd5bad35d7559ba8fbaa fast and no segfault
83858a157986bf20d00343298fd6895cbccea1d6 fast and no segfault
65fef8216d53ef692ddc21f01fa4d11b37439325 fast and no segfault
fe59fb62338ae50f75e78792d2f40e960452314d slow and segfault
c4b145f63b719e92128e109e3075754f1222a4f4 slow and segfault
e9f2421bafb211cc1e1c105f2dfaa82dd8985091 slow and segfault
cf852aa2ba0c73be3b8bdf5a92eee338a5180513 slow and segfault
5e9d766d0720975c7e134b184e2b52e04a799265 slow and segfault
514c6cbf32b4ca21c9090a6b8f90c2e69c85d342 slow and segfault
47b1c9bd789e9873bdc2eb5cb1293f8631ba2557 slow and segfault
6cc5e368528598f68664e0c93d2db5d1ff5cc7be slow and segfault
e5123dd6b870a1c45f1bdd59302dcd17869cb449 slow and segfault
32bfeabf2bd0feb99538733ad6cd4929c787b75e slow and segfault
d746cea70e45b5fd56fbf60009400aabf7596747 slow and segfault
03b1630b8f1b69642f0ffa7e086636d8059203a4 slow and segfault
bd606e480bbb2f2ace60d63790a94b82c5bc41e7 slow and segfault
8ae5fb1a8988ba395707996f4321af1ce0984f96 slow and segfault
83855a27dc927be559653f450ddd71d9f2096298 slow and segfault
170955f1c04561c9962b542981a4844052ebaf4b slow and segfault
5a4b3e3423688705b4d609f75540f2e89c7ebea9 slow and segfault
6455065da0f3368bd0f34b3c85c11ff74f5cc2be slow and segfault
2c382f90af83ecbe4095c2b55acd8505863a6890 slow and segfault
fd610c6ced4fa79ad2f001acd7b273f525dc4751 slow and no segfault
bc9398b486243517c8a4a75dfd517c4146a831c1 slow and no segfault
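For anyone who wants to spot where the behaviour flips without reading the whole list, here is a rough Python sketch. It is not part of clvk or clpeak; the helper names (`parse_result`, `find_transitions`) are made up for illustration, and it assumes the results are listed in commit order as posted. The excerpt only contains rows copied from the list above.

```python
# Hypothetical helpers for reading bisect results of the form
# "<commit> <slow|fast> and <segfault|no segfault>".

def parse_result(line):
    """Split one result line into (commit, speed, segfaulted)."""
    commit, rest = line.strip().split(maxsplit=1)
    speed, crash = rest.split(" and ")
    return commit, speed, crash.strip() == "segfault"

def find_transitions(lines):
    """Return pairs of neighbouring commits whose result differs."""
    rows = [parse_result(line) for line in lines if line.strip()]
    pairs = []
    for (c1, s1, x1), (c2, s2, x2) in zip(rows, rows[1:]):
        if (s1, x1) != (s2, x2):
            # Abbreviate hashes to 10 characters for readability.
            pairs.append((c1[:10], c2[:10]))
    return pairs

# Excerpt of the results above (rows around each change only).
EXCERPT = [
    "cf96214811598681ad9080f92b777d0feec2b5ff slow and segfault",
    "bdb875c4cf42d181152de1740670c99a0f7a95bc fast and no segfault",
    "65fef8216d53ef692ddc21f01fa4d11b37439325 fast and no segfault",
    "fe59fb62338ae50f75e78792d2f40e960452314d slow and segfault",
    "2c382f90af83ecbe4095c2b55acd8505863a6890 slow and segfault",
    "fd610c6ced4fa79ad2f001acd7b273f525dc4751 slow and no segfault",
    "bc9398b486243517c8a4a75dfd517c4146a831c1 slow and no segfault",
]

for newer, older in find_transitions(EXCERPT):
    print(f"result changes between {newer} and {older}")
```

Feeding it the full list instead of the excerpt should report the same three boundaries, since every omitted row has the same result as its neighbours.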
clpeak has these tests that show the performance:
global memory bandwidth
single precision compute
half precision compute
integer compute
integer compute fast 24bit
slow = single and double digits
fast = four digits
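The slow/fast rule of thumb can be written down as a tiny Python helper. This is only a sketch of the labelling used in this thread: the function name `classify_score` is hypothetical, the units are whatever clpeak prints for the test in question, and three-digit results (or anything above four digits) are left unclassified because the rule above does not cover them.

```python
def classify_score(score):
    """Label a clpeak result by the number of digits in its integer part."""
    digits = len(str(int(abs(score))))
    if digits <= 2:
        return "slow"          # single and double digits
    if digits == 4:
        return "fast"          # four digits
    return "unclassified"      # not covered by the rule in this thread

# Illustrative numbers only, not measurements from any device.
for value in (7.5, 42, 512, 1234.5):
    print(value, classify_score(value))
```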
By the way, I have removed the patch that reverts the behaviour. This means low-end devices (phones!) still break, though. The workaround is probably not to run the clpeak global memory bandwidth test on those devices.
So the one I packaged is as close to upstream as possible: https://github.com/termux/termux-packages/tree/master/packages/clvk
I will close this issue now.