kernel oops for gfx1201 on Fedora Rawhide
Run this Fedora container on a Fedora Rawhide host:
https://github.com/trixirt/rocm-distro-containers/blob/main/fedora/rawhide/rocblas/check/Dockerfile
with the arguments:
docker run --device /dev/kfd --device /dev/dri -it --rm --cpus=1
This produces a backtrace:
...
[Detaching after vfork from child process 477]
Memory access fault by GPU node-1 (Agent handle: 0x55ad5e2a1be0) on address 0x7f7c0123e000. Reason: Page not present or supervisor privilege.
GPU core dump created: gpucore.16
Thread 15 "rocblas-test" received signal SIGABRT, Aborted.
[Switching to Thread 0x7f7c00fff6c0 (LWP 32)]
__pthread_kill_implementation (threadid=
And an oops on the host, from dmesg:
[ 702.395728] gmc_v12_0_process_interrupt: 94 callbacks suppressed
[ 702.395732] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32771)
[ 702.395737] amdgpu 0000:03:00.0: amdgpu: in process rocblas-test pid 3814 thread rocblas-test pid 3814)
[ 702.395738] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00007f7c0123e000 from client 10
[ 702.395740] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x0084115B
[ 702.395741] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 702.395742] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 702.395743] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5
[ 702.395743] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[ 702.395744] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 702.395745] amdgpu 0000:03:00.0: amdgpu: RW: 0x1
[ 702.395752] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32771)
[ 702.395754] amdgpu 0000:03:00.0: amdgpu: in process rocblas-test pid 3814 thread rocblas-test pid 3814)
[ 702.395755] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00007f7c0123e000 from client 10
[ 702.395762] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32771)
[ 702.395763] amdgpu 0000:03:00.0: amdgpu: in process rocblas-test pid 3814 thread rocblas-test pid 3814)
[ 702.395764] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00007f7c0123e000 from client 10
[ 702.395772] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32771)
[ 702.395773] amdgpu 0000:03:00.0: amdgpu: in process rocblas-test pid 3814 thread rocblas-test pid 3814)
[ 702.395774] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00007f7c0123e000 from client 10
[ 702.395781] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32771)
[ 702.395782] amdgpu 0000:03:00.0: amdgpu: in process rocblas-test pid 3814 thread rocblas-test pid 3814)
[ 702.395783] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00007f7c0123e000 from client 10
[ 702.395790] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32771)
[ 702.395791] amdgpu 0000:03:00.0: amdgpu: in process rocblas-test pid 3814 thread rocblas-test pid 3814)
[ 702.395792] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00007f7c0123e000 from client 10
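To cross-check the decoded fields that amdgpu prints, the raw GCVM_L2_PROTECTION_FAULT_STATUS value from the log above can be unpacked by hand. This is only a minimal Python sketch: the bit positions are an assumption, chosen because they reproduce the field values printed in the log, and should be checked against the amdgpu register headers for this ASIC. `FIELDS` and `decode_fault_status` are hypothetical names, not part of the driver or any tool.

```python
# Hypothetical helper: unpack GCVM_L2_PROTECTION_FAULT_STATUS into fields.
# ASSUMPTION: the (shift, width) pairs below are inferred from the values
# printed in the dmesg output, not copied from the register headers.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # printed by the driver as "Faulty UTCL2 client ID"
    "RW": (18, 1),
}


def decode_fault_status(status: int) -> dict[str, int]:
    """Extract each field by shifting it down and masking to its width."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # Raw status value taken from the dmesg output in this report.
    for name, value in decode_fault_status(0x0084115B).items():
        print(f"{name}: {value:#x}")
```

Under this assumed layout, 0x0084115B decodes to MORE_FAULTS 0x1, WALKER_ERROR 0x5, PERMISSION_FAULTS 0x5, MAPPING_ERROR 0x1, CID 0x8 and RW 0x1, which agrees with the lines the driver printed.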
The running kernel:
$ uname -a
Linux fedora 6.16.0-0.rc0.250605gec7714e494790.13.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Jun 6 09:52:12 UTC 2025 x86_64 GNU/Linux
rocminfo for the card:
ISA Info:
Agent 2
Name: gfx1201
Uuid: GPU-7b2a57bc7a036a5f
Marketing Name: AMD Radeon Graphics
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 8192(0x2000) KB
L3: 65536(0x10000) KB
Chip ID: 29807(0x746f)
ASIC Revision: 1(0x1)
Cacheline Size: 256(0x100)
Max Clock Freq. (MHz): 2420
BDFID: 768
Internal Node ID: 1
Compute Unit: 64
SIMDs per CU: 2
Shader Engines: 4
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties:
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 1012
SDMA engine uCode:: 838
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16695296(0xfec000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1201
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx12-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
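The BDFID reported by rocminfo can be tied back to the PCI address shown in the dmesg output. This is a minimal sketch, assuming BDFID uses the conventional PCI packing (bus in bits 15:8, device in bits 7:3, function in bits 2:0); `decode_bdfid` is a hypothetical helper name.

```python
def decode_bdfid(bdfid: int) -> str:
    """Format a packed bus/device/function value as bus:device.function.

    ASSUMPTION: conventional PCI packing, bus << 8 | device << 3 | function.
    """
    bus = (bdfid >> 8) & 0xFF
    device = (bdfid >> 3) & 0x1F
    function = bdfid & 0x7
    return f"{bus:02x}:{device:02x}.{function:x}"


if __name__ == "__main__":
    # BDFID value taken from the rocminfo output in this report.
    print(decode_bdfid(768))  # prints 03:00.0
```

Under that assumption, BDFID 768 (0x300) corresponds to 03:00.0, the same device as the 0000:03:00.0 entries in dmesg, so the faulting GPU and the gfx1201 agent listed here are the same card.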