opencl-intercept-layer
Add functionality to automatically detect NaNs in buffers/images before/after kernel enqueue
Implements #305
Description of Changes
Added a control DetectNaNs (bool). When set to true, all buffers and images of floating-point type are checked for NaNs before and after each kernel enqueue.
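To illustrate the kind of check this control performs, here is a minimal Python sketch that scans the raw bytes of a buffer for float32 NaNs. It is not the code in this PR (the intercept layer itself is C++); the helper name `find_nans` and the little-endian float32 layout are assumptions made for the example.

```python
import math
import struct

def find_nans(raw: bytes) -> list:
    """Return the indices of float32 elements in `raw` that are NaN.

    Hypothetical helper for illustration only; assumes little-endian float32 data.
    """
    count = len(raw) // 4  # ignore trailing bytes that do not form a full float32
    values = struct.unpack(f"<{count}f", raw[:count * 4])
    return [i for i, v in enumerate(values) if math.isnan(v)]

# A buffer without NaNs and a buffer with NaNs at elements 1 and 3.
clean = struct.pack("<4f", 0.0, 0.1, 1.0, -2.5)
dirty = struct.pack("<4f", 0.0, float("nan"), 1.0, float("nan"))

print(find_nans(clean))  # []
print(find_nans(dirty))  # [1, 3]
```

Running the same scan on a buffer's contents before and after a kernel enqueue is enough to tell whether a NaN was already present or was introduced by that kernel.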
Testing Done
Tested with the buffer example below on Windows on an Intel iGPU. Further testing is in progress.
import numpy as np
import pyopencl as cl

# Host arrays: 8 floats and 8 ints, used below for their shape and byte size.
a_np = np.full(8, 0.1, dtype=np.float32)
b_np = np.full(8, 10, dtype=np.int32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags

# Input buffers; these are created but not passed to any kernel in this test.
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)

flags = []
prg1 = cl.Program(ctx, """
__kernel void sum(__global float *res_g)
{
    int gid = get_global_id(0);
    res_g[gid] = 0.0f;
}
""").build(flags)

prg2 = cl.Program(ctx, """
__kernel void insertNaN(__global float *res_g)
{
    int gid = get_global_id(0);
    res_g[gid] = NAN;
}
""").build(flags)

prg3 = cl.Program(ctx, """
__kernel void makeInt(__global int *res_g)
{
    int gid = get_global_id(0);
    res_g[gid] = 10;
}
""").build(flags)
# One read-write buffer shared by all kernels (8 * 4 bytes).
res_g = cl.Buffer(ctx, mf.READ_WRITE, a_np.nbytes)

knl1 = prg1.sum # Use this Kernel object for repeated calls
knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)

# Fill the buffer with NaNs, then overwrite it with zeros.
knl2 = prg2.insertNaN
knl2(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)

# Write the same buffer through an int* argument.
knl3 = prg3.makeInt
knl3(queue, b_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)

# Insert NaNs a second time, then clear them again.
knl2(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)
Gives:
After kernel: insertNaN, EnqueueCtr: 3, arg_index: 0, data type: float*, has a NaN.
Before kernel: sum, EnqueueCtr: 4, arg_index: 0, data type: float*, has a NaN.
After kernel: insertNaN, EnqueueCtr: 8, arg_index: 0, data type: float*, has a NaN.
Before kernel: sum, EnqueueCtr: 9, arg_index: 0, data type: float*, has a NaN.
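Nothing is reported for the sum enqueue that follows makeInt, even though the float* argument then holds integer data. One likely reason, inferred from the output rather than stated in this PR, is that the bit pattern of the int32 value 10 is a tiny denormal when read as float32, not a NaN. A small Python sketch of that reinterpretation, assuming little-endian 32-bit values:

```python
import math
import struct

# Reinterpret the bits of the int32 value 10 as a float32 (no numeric conversion).
(as_float,) = struct.unpack("<f", struct.pack("<i", 10))
print(as_float)              # a very small positive denormal
print(math.isnan(as_float))  # False

# For contrast: all exponent bits set plus a non-zero mantissa is a NaN.
(quiet_nan,) = struct.unpack("<f", struct.pack("<I", 0x7FC00000))
print(math.isnan(quiet_nan))  # True
```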