ROCm-OpenCL-Runtime
clinfo segfault
Hello,
I'm trying to use ROCm-OpenCL-Runtime on my PC, but I can't make it work so far. I'm using an up-to-date Arch Linux, a 2nd generation Threadripper, and an RX570 GPU. I'm using X with the amdgpu driver.
I get a segfault just by running clinfo (see the info below). Any ideas on how I could fix that? Thanks
LD_LIBRARY_PATH=$PWD/lib LOG_LEVEL=100 ./bin/clinfo
:3:../runtime/device/rocm/rocdevice.cpp:393: Initializing HSA stack.
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/clang not found
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/llvm-link not found
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/ld.lld not found
:2:../runtime/device/devprogram.cpp:183: Could not find the Clang binary in /home/pierre/apps/rocm2/opencl/build/bin
:2:../runtime/device/devprogram.cpp:183: Could not find the Clang binary in /home/pierre/apps/rocm2/opencl/build/bin
:3:../runtime/device/rocm/rocdevice.cpp:1825: number of allocated hardware queues: 0, maximum: 4
:3:../runtime/device/rocm/rocdevice.cpp:1860: created hardware queue 0x103b000 with size 1024
:1:../runtime/device/rocm/rocdevice.cpp:1578: Failed creating memory
:1:../runtime/platform/memory.cpp:307: Video memory allocation failed!
:1:../runtime/platform/memory.cpp:271: Can't allocate memory size - 0x00001000 bytes!
:1:../runtime/device/rocm/rocvirtual.cpp:671: Could not create BlitManager!
:3:../runtime/device/rocm/rocdevice.cpp:1880: deleting hardware queue 0x103b000 with refCount 0
:1:../runtime/device/rocm/rocdevice.cpp:1812: Couldn't create the device transfer manager!
[1] 1405987 segmentation fault (core dumped) LD_LIBRARY_PATH=$PWD/lib LOG_LEVEL=100 ./bin/clinfo
% uname -a
Linux powa 5.3.13-arch1-1 #1 SMP PREEMPT Sun, 24 Nov 2019 10:15:50 +0000 x86_64 GNU/Linux
% lspci | grep VGA
41:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev ef)
% dmesg | grep kfd
[ 13.205226] kfd kfd: Allocated 3969056 bytes on gart
[ 13.206545] kfd kfd: added device 1002:67df
% ls /etc/OpenCL/vendors/
amdocl64.icd
% cat /etc/OpenCL/vendors/amdocl64.icd
libamdocl64.so
I'm using rocr-runtime and roct-thunk-interface 2.10.0
gdb
% gdb ./bin/clinfo
GNU gdb (GDB) 8.3.1
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./bin/clinfo...
(gdb) r
Starting program: /home/pierre/apps/rocm2/opencl/build/bin/clinfo
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
/usr/lib/../share/gcc-9.2.0/python/libstdcxx/v6/xmethods.py:731: SyntaxWarning: list indices must be integers or slices, not str; perhaps you missed a comma?
refcounts = ['_M_refcount']['_M_pi']
:3:../runtime/device/rocm/rocdevice.cpp:393: Initializing HSA stack.
[New Thread 0x7fffeca48700 (LWP 1407177)]
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/clang not found
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/llvm-link not found
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/ld.lld not found
:2:../runtime/device/devprogram.cpp:183: Could not find the Clang binary in /home/pierre/apps/rocm2/opencl/build/bin
:2:../runtime/device/devprogram.cpp:183: Could not find the Clang binary in /home/pierre/apps/rocm2/opencl/build/bin
[New Thread 0x7fffe7fff700 (LWP 1407182)]
[New Thread 0x7fffe77fe700 (LWP 1407183)]
[New Thread 0x7fffe6ffd700 (LWP 1407184)]
[New Thread 0x7fffe67fc700 (LWP 1407185)]
[New Thread 0x7fffe5ffb700 (LWP 1407186)]
[New Thread 0x7fffe57fa700 (LWP 1407187)]
[New Thread 0x7fffe4ff9700 (LWP 1407188)]
[New Thread 0x7fffcffff700 (LWP 1407189)]
[New Thread 0x7fffcf7fe700 (LWP 1407190)]
[New Thread 0x7fffceffd700 (LWP 1407191)]
[New Thread 0x7fffce7fc700 (LWP 1407192)]
[New Thread 0x7fffcdffb700 (LWP 1407193)]
[New Thread 0x7fffcd7fa700 (LWP 1407194)]
[New Thread 0x7fffccff9700 (LWP 1407195)]
[New Thread 0x7fffc7fff700 (LWP 1407196)]
[New Thread 0x7fffc77fe700 (LWP 1407197)]
[New Thread 0x7fffc6ffd700 (LWP 1407198)]
[New Thread 0x7fffc67fc700 (LWP 1407199)]
[New Thread 0x7fffc5ffb700 (LWP 1407200)]
[New Thread 0x7fffc57fa700 (LWP 1407201)]
[New Thread 0x7fffc4ff9700 (LWP 1407202)]
[New Thread 0x7fffc47f8700 (LWP 1407203)]
[New Thread 0x7fffc3ff7700 (LWP 1407204)]
[New Thread 0x7fffc37f6700 (LWP 1407205)]
[New Thread 0x7fffc2ff5700 (LWP 1407206)]
[New Thread 0x7fffc27f4700 (LWP 1407207)]
[New Thread 0x7fffc1ff3700 (LWP 1407208)]
[New Thread 0x7fffc17f2700 (LWP 1407209)]
[New Thread 0x7fffc0ff1700 (LWP 1407210)]
[New Thread 0x7fffc07f0700 (LWP 1407211)]
[New Thread 0x7fffbffef700 (LWP 1407212)]
[New Thread 0x7fffbf7ee700 (LWP 1407213)]
:3:../runtime/device/rocm/rocdevice.cpp:1825: number of allocated hardware queues: 0, maximum: 4
:3:../runtime/device/rocm/rocdevice.cpp:1860: created hardware queue 0x103b000 with size 1024
:1:../runtime/device/rocm/rocdevice.cpp:1578: Failed creating memory
:1:../runtime/platform/memory.cpp:307: Video memory allocation failed!
:1:../runtime/platform/memory.cpp:271: Can't allocate memory size - 0x00001000 bytes!
:1:../runtime/device/rocm/rocvirtual.cpp:671: Could not create BlitManager!
:3:../runtime/device/rocm/rocdevice.cpp:1880: deleting hardware queue 0x103b000 with refCount 0
:1:../runtime/device/rocm/rocdevice.cpp:1812: Couldn't create the device transfer manager!
Thread 1 "clinfo" received signal SIGSEGV, Segmentation fault.
0x00007fffee21c44c in roc::VirtualGPU::enableSyncBlit (this=0x0) at ../runtime/device/rocm/rocvirtual.cpp:2309
2309 void VirtualGPU::enableSyncBlit() const { blitMgr_->enableSynchronization(); }
(gdb) bt
#0 0x00007fffee21c44c in roc::VirtualGPU::enableSyncBlit (this=0x0) at ../runtime/device/rocm/rocvirtual.cpp:2309
#1 0x00007fffee2013b6 in roc::Device::xferQueue (this=0x555555618480) at ../runtime/device/rocm/rocdevice.cpp:1815
#2 0x00007fffee1fd680 in roc::Device::create (this=0x555555618480, sramEccEnabled=false) at ../runtime/device/rocm/rocdevice.cpp:768
#3 0x00007fffee1fc503 in roc::Device::init () at ../runtime/device/rocm/rocdevice.cpp:546
#4 0x00007fffee1853a0 in amd::Device::init () at ../runtime/device/device.cpp:162
#5 0x00007fffee1c08eb in amd::Runtime::init () at ../runtime/platform/runtime.cpp:58
#6 0x00007fffee25a221 in ShouldLoadPlatform () at ../api/opencl/amdocl/cl_icd.cpp:205
#7 0x00007fffee25a331 in <lambda()>::operator()(void) const (__closure=0x7fffffffd220) at ../api/opencl/amdocl/cl_icd.cpp:255
#8 0x00007fffee25a7a0 in std::__invoke_impl<void, clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(std::__invoke_other, <lambda()> &&) (__f=...) at /usr/include/c++/9.2.0/bits/invoke.h:60
#9 0x00007fffee25a75b in std::__invoke<clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(<lambda()> &&) (__fn=...) at /usr/include/c++/9.2.0/bits/invoke.h:95
#10 0x00007fffee25a61f in std::<lambda()>::operator()(void) const (this=0x7fffffffd1d0) at /usr/include/c++/9.2.0/mutex:671
#11 0x00007fffee25a649 in std::<lambda()>::operator()(void) const (this=0x0) at /usr/include/c++/9.2.0/mutex:676
#12 0x00007fffee25a65a in std::<lambda()>::_FUN(void) () at /usr/include/c++/9.2.0/mutex:676
#13 0x00007ffff7f4987f in __pthread_once_slow () from /usr/lib/libpthread.so.0
#14 0x00007fffee25a0e8 in __gthread_once (__once=0x7ffff79ce6f0 <clIcdGetPlatformIDsKHR::initOnce>, __func=0x7ffff7e17ff0 <std::__once_proxy()>) at /usr/include/c++/9.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h:700
#15 0x00007fffee25a6ef in std::call_once<clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(std::once_flag &, <lambda()> &&) (__once=..., __f=...) at /usr/include/c++/9.2.0/mutex:683
#16 0x00007fffee25a39a in clIcdGetPlatformIDsKHR (num_entries=0, platforms=0x0, num_platforms=0x7fffffffd254) at ../api/opencl/amdocl/cl_icd.cpp:255
#17 0x00007ffff7fc436c in khrIcdVendorAdd (libraryName=0x555555598120 "libamdocl64.so") at ../api/opencl/khronos/icd/loader/icd.c:87
#18 0x00007ffff7fc7d98 in khrIcdOsVendorsEnumerate () at ../api/opencl/khronos/icd/loader/linux/icd_linux.c:125
#19 0x00007ffff7f4987f in __pthread_once_slow () from /usr/lib/libpthread.so.0
#20 0x00007ffff7fc7e26 in khrIcdOsVendorsEnumerateOnce () at ../api/opencl/khronos/icd/loader/linux/icd_linux.c:149
#21 0x00007ffff7fc4262 in khrIcdInitialize () at ../api/opencl/khronos/icd/loader/icd.c:31
#22 0x00007ffff7fc47d6 in clGetPlatformIDs (num_entries=0, platforms=0x0, num_platforms=0x7fffffffd3f8) at ../api/opencl/khronos/icd/loader/icd_dispatch.c:34
#23 0x000055555555cbe0 in cl::Platform::get (platforms=0x7fffffffd4f0) at ../api/opencl/khronos/headers/opencl2.2/CL/cl2.hpp:2480
#24 0x00005555555577f5 in main (argc=1, argv=0x7fffffffd8f8) at ../tools/clinfo/clinfo.cpp:146
(gdb) up
#1 0x00007fffee2013b6 in roc::Device::xferQueue (this=0x555555618480) at ../runtime/device/rocm/rocdevice.cpp:1815
1815 xferQueue_->enableSyncBlit();
(gdb) list
1810 thisDevice->xferQueue_ = reinterpret_cast<VirtualGPU*>(thisDevice->createVirtualDevice());
1811 if (!xferQueue_) {
1812 LogError("Couldn't create the device transfer manager!");
1813 }
1814 }
1815 xferQueue_->enableSyncBlit();
1816 return xferQueue_;
1817 }
1818 bool Device::SetClockMode(const cl_set_device_clock_mode_input_amd setClockModeInput, cl_set_device_clock_mode_output_amd* pSetClockModeOutput) {
1819 bool result = true;
(gdb) p xferQueue_
$1 = (roc::VirtualGPU *) 0x0
(gdb)
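The backtrace and listing above point at a plain null dereference: when createVirtualDevice() fails, xferQueue() logs the error but still falls through to xferQueue_->enableSyncBlit(). Below is a minimal sketch of a guard that would turn the segfault into a reported failure. FakeDevice, FakeVirtualGPU and demoXferQueueGuard are hypothetical stand-ins, not the real runtime classes, and this does not address why the memory allocation fails in the first place.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical stand-in for roc::VirtualGPU; not the real class.
struct FakeVirtualGPU {
  bool syncBlitEnabled = false;
  void enableSyncBlit() { syncBlitEnabled = true; }
};

// Hypothetical stand-in for roc::Device, reduced to the transfer-queue logic.
class FakeDevice {
 public:
  explicit FakeDevice(bool creationSucceeds) : creationSucceeds_(creationSucceeds) {}

  // Returns the transfer queue, or nullptr if it could not be created.
  FakeVirtualGPU* xferQueue() {
    if (!xferQueue_) {
      xferQueue_ = createVirtualDevice();
      if (!xferQueue_) {
        std::fprintf(stderr, "Couldn't create the device transfer manager!\n");
        return nullptr;  // bail out instead of dereferencing a null pointer
      }
    }
    xferQueue_->enableSyncBlit();
    return xferQueue_.get();
  }

 private:
  // Simulates queue creation, which fails in the reported setup.
  std::unique_ptr<FakeVirtualGPU> createVirtualDevice() {
    if (!creationSucceeds_) {
      return nullptr;
    }
    return std::make_unique<FakeVirtualGPU>();
  }

  bool creationSucceeds_;
  std::unique_ptr<FakeVirtualGPU> xferQueue_;
};

// Usage sketch: exercises both the failing and the succeeding path.
inline void demoXferQueueGuard() {
  FakeDevice broken(false);
  assert(broken.xferQueue() == nullptr);  // error is reported, no crash

  FakeDevice working(true);
  FakeVirtualGPU* queue = working.xferQueue();
  assert(queue != nullptr && queue->syncBlitEnabled);
}
```

With a guard like this, callers of xferQueue() would also have to handle a null return, so device initialisation would fail cleanly rather than crash.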
gdb Device::createMemory fails
:3:../runtime/device/rocm/rocdevice.cpp:1825: number of allocated hardware queues: 0, maximum: 4
:3:../runtime/device/rocm/rocdevice.cpp:1860: created hardware queue 0x103b000 with size 1024
Thread 1 "clinfo" hit Breakpoint 1, roc::Device::createMemory (this=0x555555618480, owner=...) at ../runtime/device/rocm/rocdevice.cpp:1561
1561 device::Memory* Device::createMemory(amd::Memory& owner) const {
(gdb) n
1562 roc::Memory* memory = nullptr;
(gdb)
1563 if (owner.asBuffer()) {
(gdb)
1564 memory = new roc::Buffer(*this, owner);
(gdb)
1571 if (memory == nullptr) {
(gdb)
1575 bool result = memory->create();
(gdb) p memory
$1 = (roc::Memory *) 0x555555bfd770
(gdb) s
roc::Buffer::create (this=0x555555bfd770) at ../runtime/device/rocm/rocmemory.cpp:677
677 bool Buffer::create() {
(gdb) n
678 if (owner() == nullptr) {
(gdb) list
673 }
674 }
675 }
676
677 bool Buffer::create() {
678 if (owner() == nullptr) {
679 deviceMemory_ = dev().hostAlloc(size(), 1, false);
680 if (deviceMemory_ != nullptr) {
681 flags_ |= HostMemoryDirectAccess;
682 return true;
(gdb) n
688 cl_mem_flags memFlags = owner()->getMemFlags();
(gdb)
690 if (owner()->getSvmPtr() != nullptr) {
(gdb)
729 if (owner()->isInterop()) return createInteropBuffer(GL_ARRAY_BUFFER, 0);
(gdb)
731 if (nullptr != owner()->parent()) {
(gdb) n
770 if (!(memFlags & (CL_MEM_USE_HOST_PTR | CL_MEM_ALLOC_HOST_PTR))) {
(gdb)
823 assert(owner()->getHostMem() != nullptr);
(gdb) n
825 flags_ |= HostMemoryDirectAccess;
(gdb) p flags_
$2 = 0
(gdb) n
827 if (dev().agent_profile() == HSA_PROFILE_FULL) {
(gdb) n
837 if (owner()->getSvmPtr() != owner()->getHostMem()) {
(gdb)
838 if (memFlags & (CL_MEM_USE_HOST_PTR | CL_MEM_ALLOC_HOST_PTR)) {
(gdb) n
839 hsa_amd_memory_pool_t pool = (memFlags & CL_MEM_SVM_ATOMICS)? dev().SystemSegment() : dev().SystemCoarseSegment();
(gdb) n
840 hsa_status_t status = hsa_amd_memory_lock_to_pool(owner()->getHostMem(), owner()->getSize(), nullptr,
(gdb) p pool
$3 = {handle = 0}
(gdb) n
842 if (status != HSA_STATUS_SUCCESS) {
(gdb) p status
$4 = 40
(gdb) n
843 deviceMemory_ = nullptr;
(gdb)
852 return deviceMemory_ != nullptr;
(gdb)
853 }
(gdb)
roc::Device::createMemory (this=0x555555618480, owner=...) at ../runtime/device/rocm/rocdevice.cpp:1577
1577 if (!result) {
(gdb) p restult
No symbol "restult" in current context.
(gdb) Quit
(gdb) p result
$5 = false
(gdb)
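In the session above, pool prints as {handle = 0} and hsa_amd_memory_lock_to_pool returns 40. In the HSA AMD extension header, 40 appears to correspond to HSA_STATUS_ERROR_INVALID_MEMORY_POOL, which would fit a zero pool handle; treat that mapping as an assumption rather than a confirmed diagnosis. The sketch below shows the kind of check and status decoding that would make this failure explicit. FakePool, the k* constants and all function names are hypothetical stand-ins and do not call the real HSA API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for hsa_amd_memory_pool_t (a struct wrapping a 64-bit handle).
struct FakePool {
  std::uint64_t handle;
};

constexpr int kStatusSuccess = 0;       // HSA_STATUS_SUCCESS
constexpr int kStatusInvalidPool = 40;  // assumed: HSA_STATUS_ERROR_INVALID_MEMORY_POOL

// A zero handle, as printed by gdb for `pool`, is treated as "no pool selected".
inline bool poolLooksValid(const FakePool& pool) { return pool.handle != 0; }

// Turns the raw status value printed by gdb into a readable message.
inline std::string describeLockStatus(int status) {
  switch (status) {
    case kStatusSuccess:
      return "success";
    case kStatusInvalidPool:
      return "invalid memory pool (assumed meaning of status 40)";
    default:
      return "unrecognised status " + std::to_string(status);
  }
}

// Mimics the failing call: refuses to lock when the pool handle is zero.
inline int fakeLockToPool(const FakePool& pool) {
  return poolLooksValid(pool) ? kStatusSuccess : kStatusInvalidPool;
}

// Usage sketch: reproduces the values seen in the gdb session.
inline void demoLockStatus() {
  FakePool pool{0};                   // matches `p pool` -> {handle = 0}
  int status = fakeLockToPool(pool);  // matches `p status` -> 40
  assert(status == kStatusInvalidPool);
  assert(describeLockStatus(status) != "success");
}
```

If the mapping holds, the question becomes why dev().SystemCoarseSegment() yields an empty pool on this machine, rather than why the lock call itself fails.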
repo status
% repo info
Manifest branch: master
Manifest merge branch: refs/heads/master
Manifest groups: all,-notdefault
----------------------------
Project: ROCm-OpenCL-Runtime
Mount path: /home/pierre/apps/rocm2/opencl
Current revision: 87a4db447593f7563f0dbbfcab3b4d4150e03cf6
Local Branches: 0
----------------------------
Project: OpenCL-ICD-Loader
Mount path: /home/pierre/apps/rocm2/opencl/api/opencl/khronos/icd
Current revision: 978b4b3a29a3aebc86ce9315d5c5963e88722d03
Local Branches: 0
----------------------------
Project: ROCm-OpenCL-Driver
Mount path: /home/pierre/apps/rocm2/opencl/compiler/driver
Current revision: ac9457a6be56b8368e95ce55f65e193a77166181
Local Branches: 0
----------------------------
Project: llvm
Mount path: /home/pierre/apps/rocm2/opencl/compiler/llvm
Current revision: 7b3a23a98c2e869965f64657ee11a7cb13feffa5
Local Branches: 0
----------------------------
Project: clang
Mount path: /home/pierre/apps/rocm2/opencl/compiler/llvm/tools/clang
Current revision: a09d37e345861d68f9768939e485d265f4fcb0ce
Local Branches: 0
----------------------------
Project: lld
Mount path: /home/pierre/apps/rocm2/opencl/compiler/llvm/tools/lld
Current revision: e5162a691f6596aa1f165305ebeeffce93597968
Local Branches: 0
----------------------------
Project: ROCm-Device-Libs
Mount path: /home/pierre/apps/rocm2/opencl/library/amdgcn
Current revision: c3967062378a1a33b66d8ff10455f4d72d567939
Local Branches: 0
----------------------------