rocBLAS

Invalid instruction oops for gfx1151 on Fedora rawhide

Open trixirt opened this issue 7 months ago • 0 comments

Run this Fedora container on a Fedora rawhide host:
https://github.com/trixirt/rocm-distro-containers/blob/main/fedora/rawhide/rocblas/check/Dockerfile

with the args

docker run --device /dev/kfd --device /dev/dri -it --rm --cpus=1

This produces a backtrace

[----------] 498 tests from _/hemm_batched
:0:rocdevice.cpp :2993: 23233334547 us: Callback: Queue 0x7fc1a8200000 aborting with error : HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid. code: 0x100f

Thread 22 "rocblas-test" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fc2bc9fe6c0 (LWP 39)]
__pthread_kill_implementation (threadid=, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
Missing rpms, try: dnf --enablerepo='debug' install blas-debuginfo-3.12.0-8.fc42.x86_64
(gdb) bt
#0 __pthread_kill_implementation (threadid=, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007fc2c7f24cf3 in __pthread_kill_internal (threadid=, signo=6) at pthread_kill.c:89
#2 0x00007fc2c7ecaabe in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007fc2c7eb26d0 in __GI_abort () at abort.c:73
#4 0x00007fc2c87fc999 in amd::roc::callbackQueue (status=, queue=0x7fc2bed9a000, data=0x56082b41eeb0) at /usr/src/debug/rocclr-6.4.1-1.fc43.x86_64/rocclr/device/rocm/rocdevice.cpp:2995
#5 0x00007fc2c7ad4cbf in rocr::AMD::callback_t<void ()(hsa_status_t, hsa_queue_s, void*)>::operator() (this=0x7fc19d002390, args#0=HSA_STATUS_ERROR_INVALID_ISA, args#1=, args#2=) at /usr/src/debug/rocm-runtime-6.4.1-1.fc43.x86_64/runtime/hsa-runtime/core/inc/exceptions.h:86
#6 rocr::AMD::AqlQueue::ExceptionHandler (error_code=, arg=0x7fc19d002240) at /usr/src/debug/rocm-runtime-6.4.1-1.fc43.x86_64/runtime/hsa-runtime/core/runtime/amd_aql_queue.cpp:1446
#7 0x00007fc2c7b1ca9b in operator() (__closure=, index=4, value=, wait_any=) at /usr/src/debug/rocm-runtime-6.4.1-1.fc43.x86_64/runtime/hsa-runtime/core/runtime/runtime.cpp:1551
#8 rocr::core::Runtime::AsyncEventsLoop (_eventsInfo=0x56082b3db738) at /usr/src/debug/rocm-runtime-6.4.1-1.fc43.x86_64/runtime/hsa-runtime/core/runtime/runtime.cpp:1666
#9 0x00007fc2c7aa6981 in rocr::os::ThreadTrampoline (arg=) at /usr/src/debug/rocm-runtime-6.4.1-1.fc43.x86_64/runtime/hsa-runtime/core/util/lnx/os_linux.cpp:86
#10 0x00007fc2c7f22cc4 in start_thread (arg=) at pthread_create.c:448
#11 0x00007fc2c7fa5494 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
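
For triage across many test runs, the abort line can be picked apart mechanically. A minimal Python sketch follows; the helper name parse_abort_line and the regex are assumptions written against the single line quoted in this report, not against any documented ROCclr log format.

```python
import re

# Hypothetical pattern: derived only from the abort line quoted in this
# report, so it may need adjusting for other ROCclr versions.
ABORT_RE = re.compile(
    r"(?P<usec>\d+) us: Callback: Queue (?P<queue>0x[0-9a-f]+) "
    r"aborting with error : (?P<status>HSA_STATUS_\w+): .*? "
    r"code: (?P<code>0x[0-9a-f]+)"
)


def parse_abort_line(line):
    """Return (timestamp_us, queue, status_name, code) or None if no match."""
    m = ABORT_RE.search(line)
    if m is None:
        return None
    return int(m["usec"]), m["queue"], m["status"], int(m["code"], 16)


line = (
    ":0:rocdevice.cpp :2993: 23233334547 us: Callback: Queue 0x7fc1a8200000 "
    "aborting with error : HSA_STATUS_ERROR_INVALID_ISA: "
    "The instruction set architecture is invalid. code: 0x100f"
)
print(parse_abort_line(line))
```

The returned timestamp is in microseconds, which makes it easy to line up against kernel log entries.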

The host dmesg

[12315.231058] eth0: renamed from vethabf99aa
[12315.231807] docker0: port 1(veth9c14e2b) entered blocking state
[12315.231813] docker0: port 1(veth9c14e2b) entered forwarding state
THIS -->> [23233.138774] [drm:gfx_v11_0_bad_op_irq [amdgpu]] ERROR Illegal opcode in command stream
[24712.133781] perf: interrupt took too long (2508 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[34797.640063] perf: interrupt took too long (3190 > 3135), lowering kernel.perf_event_max_sample_rate to 62000
[60035.048325] perf: interrupt took too long (3996 > 3987), lowering kernel.perf_event_max_sample_rate to 50000
[62748.238298] docker0: port 1(veth9c14e2b) entered disabled state
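
The illegal-opcode entry and the runtime abort appear to be the same event: the dmesg timestamp is 23233.138774 s and the rocdevice.cpp line reports 23233334547 us. A small Python sketch of that comparison follows. It assumes both timestamps are on a comparable clock, which the numbers suggest but which is not confirmed here, and the helper names are illustrative.

```python
import re

# Matches the "[seconds.microseconds] message" prefix used by dmesg.
DMESG_RE = re.compile(r"\[\s*(\d+)\.(\d{6})\]\s*(.*)")


def parse_dmesg(line):
    """Return (timestamp_us, message) for a dmesg line, or None."""
    m = DMESG_RE.search(line)
    if m is None:
        return None
    return int(m[1]) * 1_000_000 + int(m[2]), m[3]


def amdgpu_errors(lines):
    """Yield (timestamp_us, message) for amdgpu lines flagged ERROR."""
    for line in lines:
        parsed = parse_dmesg(line)
        if parsed and "amdgpu" in parsed[1] and "ERROR" in parsed[1]:
            yield parsed


dmesg = [
    "[12315.231813] docker0: port 1(veth9c14e2b) entered forwarding state",
    "[23233.138774] [drm:gfx_v11_0_bad_op_irq [amdgpu]] ERROR Illegal opcode in command stream",
    "[24712.133781] perf: interrupt took too long (2508 > 2500), "
    "lowering kernel.perf_event_max_sample_rate to 79000",
]

# Timestamp from the rocdevice.cpp abort line in the test output.
abort_us = 23233334547

for ts_us, msg in amdgpu_errors(dmesg):
    print(f"{abort_us - ts_us} us before the queue abort: {msg}")
```

With the values from this report the gap is 195773 us, so the kernel logged the illegal opcode roughly 0.2 s before the queue callback aborted the test.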

The host kernel version

uname -r

6.16.0-0.rc0.250605gec7714e494790.13.fc43.x86_64

rocminfo ISA Info:


Agent 2


Name: gfx1151
Uuid: GPU-XX
Marketing Name: AMD Radeon Graphics
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 2048(0x800) KB
L3: 32768(0x8000) KB
Chip ID: 5510(0x1586)
ASIC Revision: 0(0x0)
Cacheline Size: 128(0x80)
Max Clock Freq. (MHz): 2799
BDFID: 50176
Internal Node ID: 1
Compute Unit: 32
SIMDs per CU: 2
Shader Engines: 2
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties: APU
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension: x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension: x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 29
SDMA engine uCode:: 14
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 32542876(0x1f0909c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 32542876(0x1f0909c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1151
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension: x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension: x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx11-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension: x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension: x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
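
The agent advertises two ISAs, gfx1151 and gfx11-generic. To compare these against whatever targets a library build ships, the names can be pulled out of rocminfo output with a short script. This is a minimal Python sketch; the function name isa_names is hypothetical and the parsing assumes the "ISA <n>" / "Name:" layout shown above.

```python
import re


def isa_names(rocminfo_text):
    """Collect the 'Name:' value that follows each 'ISA <n>' header."""
    names = []
    expect_name = False
    for raw in rocminfo_text.splitlines():
        line = raw.strip()
        if re.fullmatch(r"ISA \d+", line):
            # The next 'Name:' line belongs to this ISA entry.
            expect_name = True
        elif expect_name and line.startswith("Name:"):
            names.append(line.split(":", 1)[1].strip())
            expect_name = False
    return names


# Abbreviated copy of the rocminfo output from this report.
sample = """\
Agent 2
Name: gfx1151
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1151
Machine Models: HSA_MACHINE_MODEL_LARGE
ISA 2
Name: amdgcn-amd-amdhsa--gfx11-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
"""

print(isa_names(sample))
```

The agent-level "Name: gfx1151" line is skipped on purpose, since only names that follow an "ISA <n>" header are collected.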

trixirt, Jun 12 '25 12:06