mixbench error with 7a068df

OS: Windows 11
Compiler: Visual Studio 2022 MSVC
OpenCL SDK: KhronosGroup OpenCL SDK (https://github.com/KhronosGroup/OpenCL-Guide/blob/main/chapters/getting_started_windows.md)

mixbench-ocl

mixbench-ocl ()
Use "-h" argument to see available options
------------------------ Device specifications ------------------------
Platform: NVIDIA CUDA
Device: NVIDIA GeForce RTX 4080/NVIDIA Corporation
Driver version: 526.98
Address bits: 64
GPU clock rate: 2505 MHz
Total global mem: 16375 MB
Max allowed buffer: 4093 MB
OpenCL version: OpenCL 3.0 CUDA
Total CUs: 76

Buffer size: 256MB
Workgroup size: 256
Elements per workitem: 8
Workitem fusion degree: 4
Workitem stride: NDRange
Buffer allocation: Device allocated
Timer: CL event based
Warning: Half precision computations are not supported
Loading kernel source file...
Precompilation of kernels...
OpenCL error in file 'G:\git\mixbench\mixbench-opencl\mix_kernels_ocl.cpp' in line 89 : Code -30.

Nov 26 '22 06:11 edisonchan

Thank you for reporting this. The message points to an error during OpenCL kernel compilation (CL_INVALID_VALUE: -30), but it is not clear what triggers it.

Do other OpenCL programs run correctly, e.g. clpeak (https://github.com/krrishnarraj/clpeak)?
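For anyone decoding the numeric code in the log, here is a minimal sketch of a lookup from OpenCL status codes to their symbolic names. The helper `cl_error_name` is hypothetical and not part of mixbench; the numeric values are the ones defined in the Khronos `CL/cl.h` header, and only a handful of codes relevant to program creation and building are covered.

```cpp
#include <string>

// Hypothetical helper (not part of mixbench): maps a few OpenCL status
// codes that can surface while creating or building a program to the
// symbolic names defined in the Khronos CL/cl.h header.
inline std::string cl_error_name(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -3:  return "CL_COMPILER_NOT_AVAILABLE";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -30: return "CL_INVALID_VALUE";
        case -42: return "CL_INVALID_BINARY";
        case -43: return "CL_INVALID_BUILD_OPTIONS";
        case -44: return "CL_INVALID_PROGRAM";
        case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
        case -46: return "CL_INVALID_KERNEL_NAME";
        // Anything not listed above is reported with its raw number.
        default:  return "unknown OpenCL error (" + std::to_string(code) + ")";
    }
}
```

Calling `cl_error_name(-30)` yields `CL_INVALID_VALUE`. The distinction may matter here: a kernel source that fails to compile is normally reported as `CL_BUILD_PROGRAM_FAILURE` (-11), whereas -30 indicates that an argument passed to an OpenCL API call was rejected.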

Nov 26 '22 22:11 ekondis

clpeak builds and runs fine here:

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce RTX 4080
    Driver version  : 526.98 (Win64)
    Compute units   : 76
    Clock frequency : 2505 MHz

    Global memory bandwidth (GBPS)
      float   : 612.28
      float2  : 631.80
      float4  : 639.96
      float8  : 648.81
      float16 : 656.37

    Single-precision compute (GFLOPS)
      float   : 52304.35
      float2  : 51823.82
      float4  : 52095.66
      float8  : 51354.73
      float16 : 51322.97

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 853.48
      double2  : 852.69
      double4  : 850.52
      double8  : 846.52
      double16 : 838.56

    Integer compute (GIOPS)
      int   : 26660.84
      int2  : 26533.69
      int4  : 26473.44
      int8  : 26544.63
      int16 : 26350.34

    Integer compute Fast 24bit (GIOPS)
      int   : 26459.70
      int2  : 26463.14
      int4  : 26457.42
      int8  : 26354.03
      int16 : 25947.06

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 15.07
      enqueueReadBuffer               : 13.99
      enqueueWriteBuffer non-blocking : 15.06
      enqueueReadBuffer non-blocking  : 14.00
      enqueueMapBuffer(for read)      : 21.76
        memcpy from mapped ptr        : 22.84
      enqueueUnmap(after write)       : 26.33
        memcpy to mapped ptr          : 22.43

    Kernel launch latency : 8.61 us

mixbench 0.04 also runs without problems.

Nov 27 '22 03:11 edisonchan