
clinfo segfault

Open piec opened this issue 6 years ago • 0 comments

Hello,

I'm trying to use ROCm-OpenCL-Runtime on my PC, but I can't get it to work so far. I'm running an up-to-date Arch Linux on a 2nd-generation Threadripper with an RX 570 GPU. I'm using X with the amdgpu driver.

I get a segfault just by running clinfo (see the info below). Any ideas on how I could fix that? Thanks

LD_LIBRARY_PATH=$PWD/lib LOG_LEVEL=100 ./bin/clinfo
:3:../runtime/device/rocm/rocdevice.cpp:393: Initializing HSA stack.
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/clang not found
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/llvm-link not found
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/ld.lld not found
:2:../runtime/device/devprogram.cpp:183: Could not find the Clang binary in /home/pierre/apps/rocm2/opencl/build/bin
:2:../runtime/device/devprogram.cpp:183: Could not find the Clang binary in /home/pierre/apps/rocm2/opencl/build/bin
:3:../runtime/device/rocm/rocdevice.cpp:1825: number of allocated hardware queues: 0, maximum: 4
:3:../runtime/device/rocm/rocdevice.cpp:1860: created hardware queue 0x103b000 with size 1024
:1:../runtime/device/rocm/rocdevice.cpp:1578: Failed creating memory
:1:../runtime/platform/memory.cpp:307: Video memory allocation failed!
:1:../runtime/platform/memory.cpp:271: Can't allocate memory size - 0x00001000 bytes!
:1:../runtime/device/rocm/rocvirtual.cpp:671: Could not create BlitManager!
:3:../runtime/device/rocm/rocdevice.cpp:1880: deleting hardware queue 0x103b000 with refCount 0
:1:../runtime/device/rocm/rocdevice.cpp:1812: Couldn't create the device transfer manager!
[1]    1405987 segmentation fault (core dumped)  LD_LIBRARY_PATH=$PWD/lib LOG_LEVEL=100 ./bin/clinfo
% uname -a
Linux powa 5.3.13-arch1-1 #1 SMP PREEMPT Sun, 24 Nov 2019 10:15:50 +0000 x86_64 GNU/Linux
% lspci | grep VGA
41:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev ef)
% dmesg | grep kfd
[   13.205226] kfd kfd: Allocated 3969056 bytes on gart
[   13.206545] kfd kfd: added device 1002:67df
% ls /etc/OpenCL/vendors/
amdocl64.icd
% cat /etc/OpenCL/vendors/amdocl64.icd
libamdocl64.so

I'm using rocr-runtime and roct-thunk-interface 2.10.0

gdb
% gdb ./bin/clinfo
GNU gdb (GDB) 8.3.1
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./bin/clinfo...
(gdb) r
Starting program: /home/pierre/apps/rocm2/opencl/build/bin/clinfo
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
/usr/lib/../share/gcc-9.2.0/python/libstdcxx/v6/xmethods.py:731: SyntaxWarning: list indices must be integers or slices, not str; perhaps you missed a comma?
  refcounts = ['_M_refcount']['_M_pi']
:3:../runtime/device/rocm/rocdevice.cpp:393: Initializing HSA stack.
[New Thread 0x7fffeca48700 (LWP 1407177)]
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/clang not found
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/llvm-link not found
:2:../runtime/device/devprogram.cpp:162: /home/pierre/apps/rocm2/opencl/build/bin/ld.lld not found
:2:../runtime/device/devprogram.cpp:183: Could not find the Clang binary in /home/pierre/apps/rocm2/opencl/build/bin
:2:../runtime/device/devprogram.cpp:183: Could not find the Clang binary in /home/pierre/apps/rocm2/opencl/build/bin
[New Thread 0x7fffe7fff700 (LWP 1407182)]
[New Thread 0x7fffe77fe700 (LWP 1407183)]
[New Thread 0x7fffe6ffd700 (LWP 1407184)]
[New Thread 0x7fffe67fc700 (LWP 1407185)]
[New Thread 0x7fffe5ffb700 (LWP 1407186)]
[New Thread 0x7fffe57fa700 (LWP 1407187)]
[New Thread 0x7fffe4ff9700 (LWP 1407188)]
[New Thread 0x7fffcffff700 (LWP 1407189)]
[New Thread 0x7fffcf7fe700 (LWP 1407190)]
[New Thread 0x7fffceffd700 (LWP 1407191)]
[New Thread 0x7fffce7fc700 (LWP 1407192)]
[New Thread 0x7fffcdffb700 (LWP 1407193)]
[New Thread 0x7fffcd7fa700 (LWP 1407194)]
[New Thread 0x7fffccff9700 (LWP 1407195)]
[New Thread 0x7fffc7fff700 (LWP 1407196)]
[New Thread 0x7fffc77fe700 (LWP 1407197)]
[New Thread 0x7fffc6ffd700 (LWP 1407198)]
[New Thread 0x7fffc67fc700 (LWP 1407199)]
[New Thread 0x7fffc5ffb700 (LWP 1407200)]
[New Thread 0x7fffc57fa700 (LWP 1407201)]
[New Thread 0x7fffc4ff9700 (LWP 1407202)]
[New Thread 0x7fffc47f8700 (LWP 1407203)]
[New Thread 0x7fffc3ff7700 (LWP 1407204)]
[New Thread 0x7fffc37f6700 (LWP 1407205)]
[New Thread 0x7fffc2ff5700 (LWP 1407206)]
[New Thread 0x7fffc27f4700 (LWP 1407207)]
[New Thread 0x7fffc1ff3700 (LWP 1407208)]
[New Thread 0x7fffc17f2700 (LWP 1407209)]
[New Thread 0x7fffc0ff1700 (LWP 1407210)]
[New Thread 0x7fffc07f0700 (LWP 1407211)]
[New Thread 0x7fffbffef700 (LWP 1407212)]
[New Thread 0x7fffbf7ee700 (LWP 1407213)]
:3:../runtime/device/rocm/rocdevice.cpp:1825: number of allocated hardware queues: 0, maximum: 4
:3:../runtime/device/rocm/rocdevice.cpp:1860: created hardware queue 0x103b000 with size 1024
:1:../runtime/device/rocm/rocdevice.cpp:1578: Failed creating memory
:1:../runtime/platform/memory.cpp:307: Video memory allocation failed!
:1:../runtime/platform/memory.cpp:271: Can't allocate memory size - 0x00001000 bytes!
:1:../runtime/device/rocm/rocvirtual.cpp:671: Could not create BlitManager!
:3:../runtime/device/rocm/rocdevice.cpp:1880: deleting hardware queue 0x103b000 with refCount 0
:1:../runtime/device/rocm/rocdevice.cpp:1812: Couldn't create the device transfer manager!

Thread 1 "clinfo" received signal SIGSEGV, Segmentation fault.
0x00007fffee21c44c in roc::VirtualGPU::enableSyncBlit (this=0x0) at ../runtime/device/rocm/rocvirtual.cpp:2309
2309    void VirtualGPU::enableSyncBlit() const { blitMgr_->enableSynchronization(); }
(gdb) bt
#0  0x00007fffee21c44c in roc::VirtualGPU::enableSyncBlit (this=0x0) at ../runtime/device/rocm/rocvirtual.cpp:2309
#1  0x00007fffee2013b6 in roc::Device::xferQueue (this=0x555555618480) at ../runtime/device/rocm/rocdevice.cpp:1815
#2  0x00007fffee1fd680 in roc::Device::create (this=0x555555618480, sramEccEnabled=false) at ../runtime/device/rocm/rocdevice.cpp:768
#3  0x00007fffee1fc503 in roc::Device::init () at ../runtime/device/rocm/rocdevice.cpp:546
#4  0x00007fffee1853a0 in amd::Device::init () at ../runtime/device/device.cpp:162
#5  0x00007fffee1c08eb in amd::Runtime::init () at ../runtime/platform/runtime.cpp:58
#6  0x00007fffee25a221 in ShouldLoadPlatform () at ../api/opencl/amdocl/cl_icd.cpp:205
#7  0x00007fffee25a331 in <lambda()>::operator()(void) const (__closure=0x7fffffffd220) at ../api/opencl/amdocl/cl_icd.cpp:255
#8  0x00007fffee25a7a0 in std::__invoke_impl<void, clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(std::__invoke_other, <lambda()> &&) (__f=...) at /usr/include/c++/9.2.0/bits/invoke.h:60
#9  0x00007fffee25a75b in std::__invoke<clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(<lambda()> &&) (__fn=...) at /usr/include/c++/9.2.0/bits/invoke.h:95
#10 0x00007fffee25a61f in std::<lambda()>::operator()(void) const (this=0x7fffffffd1d0) at /usr/include/c++/9.2.0/mutex:671
#11 0x00007fffee25a649 in std::<lambda()>::operator()(void) const (this=0x0) at /usr/include/c++/9.2.0/mutex:676
#12 0x00007fffee25a65a in std::<lambda()>::_FUN(void) () at /usr/include/c++/9.2.0/mutex:676
#13 0x00007ffff7f4987f in __pthread_once_slow () from /usr/lib/libpthread.so.0
#14 0x00007fffee25a0e8 in __gthread_once (__once=0x7ffff79ce6f0 <clIcdGetPlatformIDsKHR::initOnce>, __func=0x7ffff7e17ff0 <std::__once_proxy()>) at /usr/include/c++/9.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h:700
#15 0x00007fffee25a6ef in std::call_once<clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(std::once_flag &, <lambda()> &&) (__once=..., __f=...) at /usr/include/c++/9.2.0/mutex:683
#16 0x00007fffee25a39a in clIcdGetPlatformIDsKHR (num_entries=0, platforms=0x0, num_platforms=0x7fffffffd254) at ../api/opencl/amdocl/cl_icd.cpp:255
#17 0x00007ffff7fc436c in khrIcdVendorAdd (libraryName=0x555555598120 "libamdocl64.so") at ../api/opencl/khronos/icd/loader/icd.c:87
#18 0x00007ffff7fc7d98 in khrIcdOsVendorsEnumerate () at ../api/opencl/khronos/icd/loader/linux/icd_linux.c:125
#19 0x00007ffff7f4987f in __pthread_once_slow () from /usr/lib/libpthread.so.0
#20 0x00007ffff7fc7e26 in khrIcdOsVendorsEnumerateOnce () at ../api/opencl/khronos/icd/loader/linux/icd_linux.c:149
#21 0x00007ffff7fc4262 in khrIcdInitialize () at ../api/opencl/khronos/icd/loader/icd.c:31
#22 0x00007ffff7fc47d6 in clGetPlatformIDs (num_entries=0, platforms=0x0, num_platforms=0x7fffffffd3f8) at ../api/opencl/khronos/icd/loader/icd_dispatch.c:34
#23 0x000055555555cbe0 in cl::Platform::get (platforms=0x7fffffffd4f0) at ../api/opencl/khronos/headers/opencl2.2/CL/cl2.hpp:2480
#24 0x00005555555577f5 in main (argc=1, argv=0x7fffffffd8f8) at ../tools/clinfo/clinfo.cpp:146
(gdb) up
#1  0x00007fffee2013b6 in roc::Device::xferQueue (this=0x555555618480) at ../runtime/device/rocm/rocdevice.cpp:1815
1815      xferQueue_->enableSyncBlit();
(gdb) list
1810        thisDevice->xferQueue_ = reinterpret_cast<VirtualGPU*>(thisDevice->createVirtualDevice());
1811        if (!xferQueue_) {
1812          LogError("Couldn't create the device transfer manager!");
1813        }
1814      }
1815      xferQueue_->enableSyncBlit();
1816      return xferQueue_;
1817    }
1818    bool Device::SetClockMode(const cl_set_device_clock_mode_input_amd setClockModeInput, cl_set_device_clock_mode_output_amd* pSetClockModeOutput) {
1819      bool result = true;
(gdb) p xferQueue_
$1 = (roc::VirtualGPU *) 0x0
(gdb)
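
As far as I can tell from the listing above, `Device::xferQueue` logs the error when `createVirtualDevice()` returns null but then falls through to `xferQueue_->enableSyncBlit()` anyway, which is where the segfault happens. A minimal sketch of the guard I would expect there, using stand-in types rather than the real `roc::` classes (the `creationFails` flag and the `demo()` helper are hypothetical, only there to simulate the failure):

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical stand-in for roc::VirtualGPU; not the real class.
struct VirtualGPU {
  bool syncBlit = false;
  void enableSyncBlit() { syncBlit = true; }
};

// Hypothetical stand-in for roc::Device; not the real class.
struct Device {
  bool creationFails;  // simulates createVirtualDevice() failing
  std::unique_ptr<VirtualGPU> xferQueue_;

  explicit Device(bool fails) : creationFails(fails) {}

  std::unique_ptr<VirtualGPU> createVirtualDevice() {
    if (creationFails) return nullptr;
    return std::make_unique<VirtualGPU>();
  }

  // Returns nullptr on failure instead of dereferencing a null queue.
  VirtualGPU* xferQueue() {
    if (!xferQueue_) {
      xferQueue_ = createVirtualDevice();
      if (!xferQueue_) {
        std::fprintf(stderr, "Couldn't create the device transfer manager!\n");
        return nullptr;  // early return that the listed code does not have
      }
    }
    xferQueue_->enableSyncBlit();
    return xferQueue_.get();
  }
};

// Exercises both the failing and the succeeding path.
inline void demo() {
  Device bad(true);
  assert(bad.xferQueue() == nullptr);

  Device good(false);
  VirtualGPU* queue = good.xferQueue();
  assert(queue != nullptr && queue->syncBlit);
}
```

This would only turn the crash into a clean initialization failure; it does not explain why the memory allocation fails in the first place.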

gdb Device::createMemory fails
:3:../runtime/device/rocm/rocdevice.cpp:1825: number of allocated hardware queues: 0, maximum: 4
:3:../runtime/device/rocm/rocdevice.cpp:1860: created hardware queue 0x103b000 with size 1024

Thread 1 "clinfo" hit Breakpoint 1, roc::Device::createMemory (this=0x555555618480, owner=...) at ../runtime/device/rocm/rocdevice.cpp:1561
1561    device::Memory* Device::createMemory(amd::Memory& owner) const {
(gdb) n
1562      roc::Memory* memory = nullptr;
(gdb)
1563      if (owner.asBuffer()) {
(gdb)
1564        memory = new roc::Buffer(*this, owner);
(gdb)
1571      if (memory == nullptr) {
(gdb)
1575      bool result = memory->create();
(gdb) p memory
$1 = (roc::Memory *) 0x555555bfd770
(gdb) s
roc::Buffer::create (this=0x555555bfd770) at ../runtime/device/rocm/rocmemory.cpp:677
677     bool Buffer::create() {
(gdb) n
678       if (owner() == nullptr) {
(gdb) list
673         }
674       }
675     }
676
677     bool Buffer::create() {
678       if (owner() == nullptr) {
679         deviceMemory_ = dev().hostAlloc(size(), 1, false);
680         if (deviceMemory_ != nullptr) {
681           flags_ |= HostMemoryDirectAccess;
682           return true;
(gdb) n
688       cl_mem_flags memFlags = owner()->getMemFlags();
(gdb)
690       if (owner()->getSvmPtr() != nullptr) {
(gdb)
729       if (owner()->isInterop()) return createInteropBuffer(GL_ARRAY_BUFFER, 0);
(gdb)
731       if (nullptr != owner()->parent()) {
(gdb) n
770       if (!(memFlags & (CL_MEM_USE_HOST_PTR | CL_MEM_ALLOC_HOST_PTR))) {
(gdb)
823       assert(owner()->getHostMem() != nullptr);
(gdb) n
825       flags_ |= HostMemoryDirectAccess;
(gdb) p flags_
$2 = 0
(gdb) n
827       if (dev().agent_profile() == HSA_PROFILE_FULL) {
(gdb) n
837       if (owner()->getSvmPtr() != owner()->getHostMem()) {
(gdb)
838         if (memFlags & (CL_MEM_USE_HOST_PTR | CL_MEM_ALLOC_HOST_PTR)) {
(gdb) n
839           hsa_amd_memory_pool_t pool = (memFlags & CL_MEM_SVM_ATOMICS)? dev().SystemSegment() : dev().SystemCoarseSegment();
(gdb) n
840           hsa_status_t status = hsa_amd_memory_lock_to_pool(owner()->getHostMem(), owner()->getSize(), nullptr,
(gdb) p pool
$3 = {handle = 0}
(gdb) n
842           if (status != HSA_STATUS_SUCCESS) {
(gdb) p status
$4 = 40
(gdb) n
843             deviceMemory_ = nullptr;
(gdb)
852       return deviceMemory_ != nullptr;
(gdb)
853     }
(gdb)
roc::Device::createMemory (this=0x555555618480, owner=...) at ../runtime/device/rocm/rocdevice.cpp:1577
1577      if (!result) {
(gdb) p restult
No symbol "restult" in current context.
(gdb) Quit
(gdb) p result
$5 = false
(gdb)
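
In the session above, `pool` has `handle = 0` going into `hsa_amd_memory_lock_to_pool`, and the call returns status 40. If I read `hsa_ext_amd.h` correctly, 40 is `HSA_STATUS_ERROR_INVALID_MEMORY_POOL`, but I have not checked that against the 2.10.0 headers. Below is a sketch of a guard that would reject the zero handle before the lock call. It uses stand-in types and does not include the HSA headers, so `MemoryPool`, `LockResult`, `checkedLock` and `demo` are all hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for hsa_amd_memory_pool_t, which wraps a 64-bit handle.
struct MemoryPool {
  std::uint64_t handle;
};

// Hypothetical result codes; not the real hsa_status_t values.
enum class LockResult { Ok, InvalidPool, InvalidArgument };

// Hypothetical wrapper that validates its inputs before locking host memory.
LockResult checkedLock(const void* hostPtr, std::size_t size, MemoryPool pool) {
  if (hostPtr == nullptr || size == 0) return LockResult::InvalidArgument;
  if (pool.handle == 0) return LockResult::InvalidPool;  // the case seen in gdb
  // Real code would call hsa_amd_memory_lock_to_pool here.
  return LockResult::Ok;
}

// Exercises the guard with the 0x1000-byte size from the log.
inline void demo() {
  char host[0x1000];
  assert(checkedLock(host, sizeof host, MemoryPool{0}) == LockResult::InvalidPool);
  assert(checkedLock(host, sizeof host, MemoryPool{1}) == LockResult::Ok);
}
```

If that reading is right, the question becomes why `dev().SystemCoarseSegment()` returns an empty pool on this machine.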

repo status
% repo info
Manifest branch: master
Manifest merge branch: refs/heads/master
Manifest groups: all,-notdefault
----------------------------
Project: ROCm-OpenCL-Runtime
Mount path: /home/pierre/apps/rocm2/opencl
Current revision: 87a4db447593f7563f0dbbfcab3b4d4150e03cf6
Local Branches: 0
----------------------------
Project: OpenCL-ICD-Loader
Mount path: /home/pierre/apps/rocm2/opencl/api/opencl/khronos/icd
Current revision: 978b4b3a29a3aebc86ce9315d5c5963e88722d03
Local Branches: 0
----------------------------
Project: ROCm-OpenCL-Driver
Mount path: /home/pierre/apps/rocm2/opencl/compiler/driver
Current revision: ac9457a6be56b8368e95ce55f65e193a77166181
Local Branches: 0
----------------------------
Project: llvm
Mount path: /home/pierre/apps/rocm2/opencl/compiler/llvm
Current revision: 7b3a23a98c2e869965f64657ee11a7cb13feffa5
Local Branches: 0
----------------------------
Project: clang
Mount path: /home/pierre/apps/rocm2/opencl/compiler/llvm/tools/clang
Current revision: a09d37e345861d68f9768939e485d265f4fcb0ce
Local Branches: 0
----------------------------
Project: lld
Mount path: /home/pierre/apps/rocm2/opencl/compiler/llvm/tools/lld
Current revision: e5162a691f6596aa1f165305ebeeffce93597968
Local Branches: 0
----------------------------
Project: ROCm-Device-Libs
Mount path: /home/pierre/apps/rocm2/opencl/library/amdgcn
Current revision: c3967062378a1a33b66d8ff10455f4d72d567939
Local Branches: 0
----------------------------

piec, Dec 02 '19 16:12