ROCR-Runtime
rocminfo: static void amd::MemoryRegion::FreeKfdMemory(void*, size_t): Assertion `status == HSAKMT_STATUS_SUCCESS' failed.
System information
❯ inxi -GSC -xx
System: Host: ernie Kernel: 5.7.7 x86_64 bits: 64 compiler: gcc v: 10.1.0 Desktop: N/A wm: kwin_x11 dm: SDDM
Distro: Gentoo Base System release 2.7
CPU: Topology: Quad Core model: AMD Ryzen 5 2400G with Radeon Vega Graphics bits: 64 type: MT MCP arch: Zen
L2 cache: 2048 KiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 57518
Speed: 1352 MHz min/max: 1600/3600 MHz Core speeds (MHz): 1: 1352 2: 1352 3: 2973 4: 1351 5: 1352 6: 1352 7: 2974
8: 1352
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Baffin [Radeon RX 550 640SP / RX 560/560X] vendor: ASUSTeK
driver: amdgpu v: kernel bus ID: 01:00.0 chip ID: 1002:67ff
Device-2: AMD Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] vendor: ASUSTeK driver: amdgpu v: kernel
bus ID: 0a:00.0 chip ID: 1002:15dd
Display: server: X.Org 1.20.8 driver: amdgpu compositor: kwin_x11 resolution: 2560x1080~60Hz
OpenGL: renderer: AMD RAVEN (DRM 3.37.0 5.7.7 LLVM 10.0.0) v: 4.6 Mesa 20.1.3 direct render: Yes
rocminfo is at version 3.5.0.
Problem
When running gdb rocminfo and typing run, I see:
ROCk module is loaded
Able to open /dev/kfd read-write
[New Thread 0x7ffff779f700 (LWP 106384)]
LoadLib(libhsa-ext-image64.so.1) failed: libhsa-ext-image64.so.1: cannot open shared object file: No such file or directory
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 5 2400G with Radeon Vega Graphics
Uuid: CPU-XX
Marketing Name: AMD Ryzen 5 2400G with Radeon Vega Graphics
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32(0x20) KB
Chip ID: 5597(0x15dd)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3600
BDFID: 2560
Internal Node ID: 0
Compute Unit: 8
SIMDs per CU: 4
Shader Engines: 1
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 16776832(0xfffe80) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
N/A
*******
Agent 2
*******
Name: gfx902
Uuid: GPU-XX
Marketing Name: AMD Ryzen 5 2400G with Radeon Vega Graphics
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 0
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 5597(0x15dd)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1250
BDFID: 2560
Internal Node ID: 0
Compute Unit: 11
SIMDs per CU: 4
Shader Engines: 1
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: FALSE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 160(0xa0)
Max Work-item Per CU: 10240(0x2800)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx902+xnack
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*******
Agent 3
*******
Name: gfx803
Uuid: GPU-XX
Marketing Name: Baffin [Radeon RX 550 640SP / RX 560/560X]
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 26623(0x67ff)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1210
BDFID: 256
Internal Node ID: 1
Compute Unit: 16
SIMDs per CU: 4
Shader Engines: 2
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: FALSE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 2097152(0x200000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx803
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
rocminfo: /tmp/portage/dev-libs/rocr-runtime-3.5.0/work/ROCR-Runtime-rocm-3.5.0/src/core/runtime/amd_memory_region.cpp:72: static void amd::MemoryRegion::FreeKfdMemory(void*, size_t): Assertion `status == HSAKMT_STATUS_SUCCESS' failed.
Note the failed assertion.
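To relate the GPU agents in the rocminfo output to the devices listed by inxi, the BDFID values can be decoded by hand. The sketch below assumes the usual PCI packing (bus in bits 15:8, device in bits 7:3, function in bits 2:0); the helper name `decode_bdfid` is made up for illustration and is not part of rocminfo.

```python
def decode_bdfid(bdfid: int) -> str:
    """Decode a packed PCI bus/device/function value into 'bb:dd.f' form.

    Assumes the conventional layout: bus = bits 15:8, device = bits 7:3,
    function = bits 2:0.
    """
    bus = (bdfid >> 8) & 0xFF
    device = (bdfid >> 3) & 0x1F
    function = bdfid & 0x7
    return f"{bus:02x}:{device:02x}.{function:x}"


# BDFID values copied from the rocminfo output for Agent 2 and Agent 3.
for agent, bdfid in (("gfx902", 2560), ("gfx803", 256)):
    print(agent, decode_bdfid(bdfid))
```

Under that assumption, gfx902 decodes to 0a:00.0 (the Raven Ridge iGPU) and gfx803 to 01:00.0 (the RX 560), which matches the bus IDs reported by inxi.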
The backtrace at that point is:
#0 0x00007ffff79abf91 in raise () from /usr/lib64/libc.so.6
No symbol table info available.
#1 0x00007ffff7995537 in abort () from /usr/lib64/libc.so.6
No symbol table info available.
#2 0x00007ffff799540f in ?? () from /usr/lib64/libc.so.6
No symbol table info available.
#3 0x00007ffff79a43e2 in __assert_fail () from /usr/lib64/libc.so.6
No symbol table info available.
#4 0x00007ffff7e257d9 in amd::MemoryRegion::FreeKfdMemory(void*, unsigned long) () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#5 0x00007ffff7e2660d in amd::MemoryRegion::Free(void*, unsigned long) const () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#6 0x00007ffff7e68f19 in core::Runtime::FreeMemory(void*) () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#7 0x00007ffff7e68568 in core::Runtime::RegisterAgent(core::Agent*)::{lambda(void*)#2}::operator()(void*) const () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#8 0x00007ffff7e70546 in void std::__invoke_impl<void, core::Runtime::RegisterAgent(core::Agent*)::{lambda(void*)#2}&, void*>(std::__invoke_other, core::Runtime::RegisterAgent(core::Agent*)::{lambda(void*)#2}&, void*&&) () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#9 0x00007ffff7e7025c in std::enable_if<std::__and_<std::is_void<void>, std::__is_invocable<core::Runtime::RegisterAgent(core::Agent*)::{lambda(void*)#2}&, void*> >::value, std::is_void>::type std::__invoke_r<void, core::Runtime::RegisterAgent(core::Agent*)::{lambda(void*)#2}&, void*>(std::__is_invocable&&, (core::Runtime::RegisterAgent(core::Agent*)::{lambda(void*)#2}&)...) () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#10 0x00007ffff7e6fd65 in std::_Function_handler<void (void*), core::Runtime::RegisterAgent(core::Agent*)::{lambda(void*)#2}>::_M_invoke(std::_Any_data const&, void*&&) () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#11 0x00007ffff7df9087 in std::function<void (void*)>::operator()(void*) const () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#12 0x00007ffff7e0e454 in amd::GpuAgent::ReleaseShader(void*, unsigned long) const () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#13 0x00007ffff7e0d7cb in amd::GpuAgent::~GpuAgent() () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#14 0x00007ffff7e0d960 in amd::GpuAgent::~GpuAgent() () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#15 0x00007ffff7e76764 in void DeleteObject::operator()<core::Agent>(core::Agent const*) const () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#16 0x00007ffff7e736a4 in DeleteObject std::for_each<__gnu_cxx::__normal_iterator<core::Agent**, std::vector<core::Agent*, std::allocator<core::Agent*> > >, DeleteObject>(__gnu_cxx::__normal_iterator<core::Agent**, std::vector<core::Agent*, std::allocator<core::Agent*> > >, __gnu_cxx::__normal_iterator<core::Agent**, std::vector<core::Agent*, std::allocator<core::Agent*> > >, DeleteObject) () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#17 0x00007ffff7e6dc83 in core::Runtime::Unload() () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#18 0x00007ffff7e683a3 in core::Runtime::Release() () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#19 0x00007ffff7e40452 in HSA::hsa_shut_down() () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#20 0x00007ffff7e8af92 in hsa_shut_down () from /usr/lib64/libhsa-runtime64.so.1
No symbol table info available.
#21 0x000055555555c931 in main (argc=1, argv=0x7fffffffd9f8) at /tmp/portage/dev-util/rocminfo-3.5.0/work/rocminfo-rocm-3.5.0/rocminfo.cc:1167
err = HSA_STATUS_SUCCESS
sys_info = {major = 1, minor = 1, timestamp_frequency = 1000000000, max_wait = 18446744073709551615, endianness = HSA_ENDIANNESS_LITTLE, machine_model = HSA_MACHINE_MODEL_LARGE}
agent_ind = 3
Sadly, debug symbols are missing for libhsa-runtime64.so.1, because ROCR-Runtime's build system seems to override CXXFLAGS and LDFLAGS when building shared libraries; cf. https://bugs.gentoo.org/729898
This is reproducible every time I run rocminfo.
Regression
I never got ROCm to work on this system. Still working on it. :)
Logs
dmesg prints during execution of rocminfo:
[Sat Jul 11 22:49:59 2020] Alloc host visible vram on small bar is not allowed
[Sat Jul 11 22:49:59 2020] Evicting PASID 0x8021 queues
[Sat Jul 11 22:49:59 2020] Evicting PASID 0x8021 queues
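My reading of the "small bar" message is that the kernel refused a host-visible VRAM allocation because the CPU-visible PCI BAR is smaller than the card's VRAM (2 GiB according to the Agent 3 pool above); that interpretation is an assumption, not something confirmed here. One way to check it is to look at the region sizes in sysfs. The sketch below assumes the RX 560 sits at PCI address 0000:01:00.0 (bus ID from inxi, domain 0000 assumed) and that each line of the sysfs `resource` file holds start, end, and flags in hex; the function names are hypothetical.

```python
from pathlib import Path


def parse_bar_sizes(resource_text: str) -> list[int]:
    """Return the size in bytes of each region in a sysfs 'resource' file.

    Each line is expected to hold three hex fields: start, end, flags.
    Unused regions (start and end both zero) are reported as size 0.
    """
    sizes = []
    for line in resource_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, _flags = (int(field, 16) for field in fields)
        sizes.append(0 if start == 0 and end == 0 else end - start + 1)
    return sizes


def report(pci_address: str = "0000:01:00.0") -> None:
    """Print the non-empty regions of one PCI device (address is assumed)."""
    path = Path("/sys/bus/pci/devices") / pci_address / "resource"
    if not path.exists():
        print(f"{path} not found")
        return
    for index, size in enumerate(parse_bar_sizes(path.read_text())):
        if size:
            print(f"region {index}: {size} bytes ({size / 2**20:.2f} MiB)")


if __name__ == "__main__":
    report()
```

If the largest memory region turns out to be well below 2 GiB (for example 256 MiB), that would be consistent with the "small bar" interpretation.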
Other information
I also see exceptions and segfaults in Clover and ROCm's OpenCL implementation when executing clinfo:
- https://gitlab.freedesktop.org/mesa/mesa/-/issues/3255
- https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/issues/32
I assume rocminfo is the lower-level tool, so getting it to run cleanly first might help with debugging the OpenCL problems.