opencl-intercept-layer

Add functionality to automatically detect NaNs in buffers/images before/after kernel enqueue

Open Novermars opened this issue 2 years ago • 0 comments

Implements #305

Description of Changes

Added a boolean control, DetectNaNs. When it is set to true, every buffer or image kernel argument of floating-point type is checked for NaNs both before and after each kernel enqueue.
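For readers unfamiliar with this kind of check, here is a minimal sketch of the idea in Python. It is not the layer's actual implementation: the helper name `contains_nan_f32` is hypothetical, and it assumes the buffer contents have already been read back to the host as raw little-endian float32 bytes.

```python
import math
import struct


def contains_nan_f32(raw: bytes) -> bool:
    """Return True if any little-endian float32 value in `raw` is a NaN.

    Hypothetical helper for illustration only; trailing bytes that do not
    form a complete 4-byte float are ignored.
    """
    count = len(raw) // 4
    values = struct.unpack(f"<{count}f", raw[: count * 4])
    return any(math.isnan(v) for v in values)


# Stand-ins for buffer contents read back before/after an enqueue.
clean = struct.pack("<4f", 0.0, 0.1, 1.0, -2.5)
dirty = struct.pack("<4f", 0.0, float("nan"), 1.0, -2.5)

print(contains_nan_f32(clean))
print(contains_nan_f32(dirty))
```

Running the same scan once before and once after an enqueue is what distinguishes a kernel that receives NaNs from one that produces them.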

Testing Done

Tested with a buffer example on Windows on an Intel iGPU. Further testing is in progress.

import numpy as np
import pyopencl as cl

a_np = np.zeros(8).astype(np.float32)
a_np[:] = 0.1

b_np = np.zeros(8).astype(np.int32)
b_np[:] = 10

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)

flags = []
prg1 = cl.Program(ctx, """
__kernel void sum(__global float *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = 0.0f; 
}
""").build(flags)

prg2 = cl.Program(ctx, """
__kernel void insertNaN(__global float *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = NAN;
}
""").build(flags)

prg3 = cl.Program(ctx, """
__kernel void makeInt(__global int *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = 10;
}
""").build(flags)

res_g = cl.Buffer(ctx, mf.READ_WRITE, a_np.nbytes)
knl1 = prg1.sum  # Use this Kernel object for repeated calls
knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)

knl2 = prg2.insertNaN
knl2(queue, a_np.shape, None, res_g)

knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)
knl3 = prg3.makeInt
knl3(queue, b_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)

knl2(queue, a_np.shape, None, res_g)

knl1(queue, a_np.shape, None, res_g)

Gives:

After kernel: insertNaN, EnqueueCtr: 3, arg_index: 0, data type: float*, has a NaN.
Before kernel: sum, EnqueueCtr: 4, arg_index: 0, data type: float*, has a NaN.
After kernel: insertNaN, EnqueueCtr: 8, arg_index: 0, data type: float*, has a NaN.
Before kernel: sum, EnqueueCtr: 9, arg_index: 0, data type: float*, has a NaN.
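Neither the makeInt enqueue (EnqueueCtr 6) nor the sum call that follows it (EnqueueCtr 7) appears in the report. One plausible explanation, assuming the check skips int* arguments and reinterprets the buffer bytes as float32 for the float* argument of sum, is that the integer 10 has the bit pattern of a tiny denormal float rather than a NaN. The sketch below checks this with the standard library only.

```python
import math
import struct

# Reinterpret the 32-bit integer 10 (as written by makeInt) as a float32.
as_float = struct.unpack("<f", struct.pack("<i", 10))[0]
print(math.isnan(as_float))

# For comparison: a float32 NaN has all exponent bits set and a nonzero mantissa.
nan_bits = struct.unpack("<I", struct.pack("<f", float("nan")))[0]
exponent_all_ones = ((nan_bits >> 23) & 0xFF) == 0xFF
mantissa_nonzero = (nan_bits & 0x7FFFFF) != 0
print(exponent_all_ones and mantissa_nonzero)
```

The integer 10 has an all-zero exponent field, so it can never be classified as a NaN under this reinterpretation.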

Novermars, Apr 11 '23 20:04