KataGo
Weird OpenCL error ....
Hello, sorry to bother you. Not sure if this is even KataGo's problem or something else. I have been using KataGo under Linux for years with OpenCL on an AMD GPU (RX 570). Recently I switched to an RX 6700 XT, and after some trouble managed to install the AMD drivers for it for OpenCL support. However, some strange things happen that didn't happen before.
Here's the output Kata gives when I try to tune it:
`~/katago$ ./katago tuner -model kata1-b40.bin.gz
2023-06-05 23:40:45+0200: Loading model...
2023-06-05 23:40:46+0200: Querying system devices...
2023-06-05 23:40:46+0200: Found OpenCL Platform 0: Clover (Mesa) (OpenCL 1.1 Mesa 22.2.5)
2023-06-05 23:40:46+0200: Found 1 device(s) on platform 0 with type CPU or GPU or Accelerator
2023-06-05 23:40:46+0200: Found OpenCL Platform 1: AMD Accelerated Parallel Processing (Advanced Micro Devices, Inc.) (OpenCL 2.1 AMD-APP (3513.0))
2023-06-05 23:40:46+0200: Found 0 device(s) on platform 1 with type CPU or GPU or Accelerator, skipping
2023-06-05 23:40:46+0200: Found OpenCL Device 0: AMD Radeon RX 6700 XT (navi22, LLVM 15.0.6, DRM 3.48, 5.19.0-43-generic) (AMD) (score 11000101)
2023-06-05 23:40:46+0200: Tuner starting...
2023-06-05 23:40:46+0200: Creating context for OpenCL Platform: Clover (Mesa) (OpenCL 1.1 Mesa 22.2.5)
2023-06-05 23:40:46+0200: Using OpenCL Device 0: AMD Radeon RX 6700 XT (navi22, LLVM 15.0.6, DRM 3.48, 5.19.0-43-generic) (AMD) OpenCL 1.1 Mesa 22.2.5 (Extensions: cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning)
Tuning device 0: AMD Radeon RX 6700 XT (navi22, LLVM 15.0.6, DRM 3.48, 5.19.0-43-generic)
Starting from existing parameters in: /home/werner/.katago/opencltuning/tune8_gpuAMDRadeonRX6700XTnavi22LLVM1506DRM348519043generic_x19_y19_c256_mv10.txt
Beginning GPU tuning for AMD Radeon RX 6700 XT (navi22, LLVM 15.0.6, DRM 3.48, 5.19.0-43-generic) modelVersion 10 channels 256
Setting winograd3x3TileSize = 4
Tuning xGemmDirect for 1x1 convolutions and matrix mult
Testing 56 different configs
WARNING: Reference implementation failed: CL_BUILD_PROGRAM_FAILURE
Tuning 20/56 ...
Tuning 40/56 ...
ERROR: Could not find any configuration that worked
Tuning xGemm for convolutions
Testing 70 different configs
WARNING: Reference implementation failed: CL_BUILD_PROGRAM_FAILURE
Tuning 20/70 ...
Tuning 40/70 ...
Tuning 60/70 ...
ERROR: Could not find any configuration that worked
Tuning hGemmWmma for convolutions
Testing 146 different configs
FP16 tensor core tuning failed, assuming no FP16 tensor core support
Tuning xGemm for convolutions - trying with FP16 storage
Testing 70 different configs
FP16 storage tuning failed, assuming no FP16 storage support
Using FP32 storage!
Using FP32 compute!
Tuning winograd transform for convolutions
Testing 47 different configs
WARNING: Reference implementation failed: CL_BUILD_PROGRAM_FAILURE
Tuning 20/47 ...
Tuning 40/47 ...
ERROR: Could not find any configuration that worked
Tuning winograd untransform for convolutions
Testing 111 different configs
WARNING: Reference implementation failed: CL_BUILD_PROGRAM_FAILURE
Tuning 20/111 ...
Tuning 40/111 ...
Tuning 60/111 ...
Tuning 80/111 ...
Tuning 100/111 ...
ERROR: Could not find any configuration that worked
Tuning global pooling strides
Testing 106 different configs
WARNING: Reference implementation failed: CL_BUILD_PROGRAM_FAILURE
Tuning 20/106 ...
Tuning 40/106 ...
Tuning 60/106 ...
Tuning 80/106 ...
Tuning 100/106 ...
ERROR: Could not find any configuration that worked
Done tuning
Done, results saved to /home/werner/.katago/opencltuning/tune8_gpuAMDRadeonRX6700XTnavi22LLVM1506DRM348519043generic_x19_y19_c256_mv10.txt `
Never seen that error before. But I suspect it has something to do with a line that clinfo is giving me (right under "CL_PROGRAM_BUILD_LOG"):
`$ clinfo
Number of platforms 2
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 22.2.5
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3513.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Host timer resolution 1ns
Platform Name Clover
Number of devices 1
Device Name AMD Radeon RX 6700 XT (navi22, LLVM 15.0.6, DRM 3.48, 5.19.0-43-generic)
Device Vendor AMD
Device Vendor ID 0x1002
Device Version OpenCL 1.1 Mesa 22.2.5
Device Numeric Version 0x401000 (1.1.0)
Driver Version 22.2.5
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Max compute units 40
Max clock frequency 2725MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
=== CL_PROGRAM_BUILD_LOG ===
fatal error: cannot open file '/usr/lib/clc/gfx1031-amdgcn-mesa-mesa3d.bc': No such file or directory
Preferred work group size multiple (kernel) <getWGsizes:1504: create kernel : error -46>
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 0 (n/a)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 12884901888 (12GiB)
Error Correction support No
Max memory allocation 3221225472 (3GiB)
Unified memory for Host and Device No
Minimum alignment for any data type 128 bytes
Alignment of base address 32768 bits (4096 bytes)
Global Memory cache type None
Image support No
Local memory type Local
Local memory size 65536 (64KiB)
Max number of constant args 16
Max constant buffer size 67108864 (64MiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 0ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
ILs with version (n/a)
Built-in kernels with version (n/a)
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning
Device Extensions with Version cl_khr_byte_addressable_store 0x400000 (1.0.0)
cl_khr_global_int32_base_atomics 0x400000 (1.0.0)
cl_khr_global_int32_extended_atomics 0x400000 (1.0.0)
cl_khr_local_int32_base_atomics 0x400000 (1.0.0)
cl_khr_local_int32_extended_atomics 0x400000 (1.0.0)
cl_khr_int64_base_atomics 0x400000 (1.0.0)
cl_khr_int64_extended_atomics 0x400000 (1.0.0)
cl_khr_fp64 0x400000 (1.0.0)
cl_khr_extended_versioning 0x400000 (1.0.0)
Platform Name AMD Accelerated Parallel Processing
Number of devices 0
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] Success [MESA]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Clover
Device Name AMD Radeon RX 6700 XT (navi22, LLVM 15.0.6, DRM 3.48, 5.19.0-43-generic)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Clover
Device Name AMD Radeon RX 6700 XT (navi22, LLVM 15.0.6, DRM 3.48, 5.19.0-43-generic)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Clover
Device Name AMD Radeon RX 6700 XT (navi22, LLVM 15.0.6, DRM 3.48, 5.19.0-43-generic)`
Maybe you have a clue.
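For anyone hitting the same thing, here is a rough sketch (not part of KataGo or clinfo) that pulls the path out of a "cannot open file" line like the one under CL_PROGRAM_BUILD_LOG and checks whether it exists on disk. The function names `missing_files` and `check_paths` are made up for illustration, and the assumption that the missing `.bc` file is what makes every kernel build fail is only a guess based on the log.

```python
import os
import re

# Matches lines such as:
#   fatal error: cannot open file '/some/path.bc': No such file or directory
MISSING_FILE_RE = re.compile(r"cannot open file '([^']+)'")


def missing_files(build_log):
    """Return every path the build log reports as impossible to open."""
    return MISSING_FILE_RE.findall(build_log)


def check_paths(paths):
    """Map each path to True if it exists on this machine, else False."""
    return {path: os.path.exists(path) for path in paths}


if __name__ == "__main__":
    # Build log line copied from the clinfo output in this thread.
    log = ("fatal error: cannot open file "
           "'/usr/lib/clc/gfx1031-amdgcn-mesa-mesa3d.bc': "
           "No such file or directory")
    for path, exists in check_paths(missing_files(log)).items():
        print(path, "exists" if exists else "is missing")
```

If the file is reported missing, the Mesa OpenCL stack on that machine apparently has no bitcode library for this GPU target, which would be consistent with the CL_BUILD_PROGRAM_FAILURE warnings in the tuner output.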
Yeah, you might be running into a problem with drivers. I see the word "Mesa" in your info. The Mesa OpenCL drivers have been found by users in the past to be buggy for general-purpose OpenCL usage. Does this thread help? https://bbs.archlinux.org/viewtopic.php?pid=1895516#p1895516
Thanks. Seems like it. The only reason I tried the mesa-opencl-icd lib is because an admin (on the Mint forum) suggested it. Without it, my clinfo output doesn't even show the GPU as an OpenCL device... but anyway, that's not your problem. Just thought I'd ask here, maybe the KataGo output would be a clue. Frankly I'm out of ideas. It used to work pretty well. Not sure what caused the problems, maybe the switch to a newer GPU (but not really, the old GPU has the same problems), maybe switching to Mint 21 (based on Ubuntu 22.04)... I just can't get it to work anymore, and even skilled people like the admins and devs on the Mint forum can't help... hm.
Only thing you might know: do you yourself use, or know of people using, AMD GPUs like the RX 6700 XT, or something from that generation, successfully with Linux/Ubuntu/Mint and KataGo? I mean, this has to work somehow somewhere...
Use AMD ROCm instead of Mesa, which is currently known to be broken.
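To check which OpenCL platform KataGo actually ends up with after changing drivers, the "Found OpenCL Platform" and "Found N device(s)" lines from the tuner output above can be summarized with a small script. This is only a sketch: the regular expressions assume the log wording shown in this thread, and `summarize_platforms` is a made-up helper name, not anything shipped with KataGo.

```python
import re

# Patterns assume the log wording seen in this thread's tuner output.
PLATFORM_RE = re.compile(r"Found OpenCL Platform (\d+): (.+?) \(")
DEVICES_RE = re.compile(r"Found (\d+) device\(s\) on platform (\d+)")


def summarize_platforms(log_text):
    """Return {platform_index: (platform_name, device_count)}."""
    names = {int(idx): name for idx, name in PLATFORM_RE.findall(log_text)}
    counts = {int(idx): int(n) for n, idx in DEVICES_RE.findall(log_text)}
    return {idx: (name, counts.get(idx, 0)) for idx, name in names.items()}


if __name__ == "__main__":
    # Lines copied from the tuner output posted above.
    log = """\
2023-06-05 23:40:46+0200: Found OpenCL Platform 0: Clover (Mesa) (OpenCL 1.1 Mesa 22.2.5)
2023-06-05 23:40:46+0200: Found 1 device(s) on platform 0 with type CPU or GPU or Accelerator
2023-06-05 23:40:46+0200: Found OpenCL Platform 1: AMD Accelerated Parallel Processing (Advanced Micro Devices, Inc.) (OpenCL 2.1 AMD-APP (3513.0))
2023-06-05 23:40:46+0200: Found 0 device(s) on platform 1 with type CPU or GPU or Accelerator, skipping
"""
    for idx, (name, count) in sorted(summarize_platforms(log).items()):
        note = " (Mesa/Clover)" if "Clover" in name else ""
        print(f"platform {idx}: {name}{note}, {count} device(s)")
```

In the log from this thread, the AMD platform reports 0 devices, so KataGo falls back to the Clover (Mesa) platform. Once the AMD driver stack is working, the GPU should be listed under a non-Clover platform.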