Nuno Fachada
Wrap the clGetKernelSubGroupInfo API call (OpenCL 2.1), which queries sub-group information for a kernel, such as the number of sub-groups in each work-group for a given local work size.
Wrap the clCreateProgramWithIL API call (OpenCL 2.1), which creates a program object from a memory block containing IL (intermediate language, e.g. SPIR-V).
Wrap the clSetDefaultDeviceCommandQueue API call (OpenCL 2.1), which replaces the default command queue on the device.
Wrap the clCloneKernel API call (OpenCL 2.1), which makes a shallow copy of the kernel object.
OpenCL 2.0 and higher do not require that global work size be a multiple of the local work size. Reflect this in the [ccl_kernel_suggest_worksizes](https://github.com/fakenmc/cf4ocl/blob/master/src/lib/ccl_kernel_wrapper.c#L903) function.
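The difference in work size handling can be sketched as follows. This is a minimal illustration, not the actual cf4ocl implementation: `suggest_gws` and its parameters are hypothetical names, it handles a single dimension only, and it assumes the device and program build options actually permit non-uniform work-groups when the OpenCL major version is 2 or higher.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of cf4ocl): suggest a global work size for
 * one dimension, given the real amount of work, the chosen local work size
 * and the OpenCL major version of the target platform. */
static size_t suggest_gws(size_t real_ws, size_t lws, unsigned ocl_major)
{
    assert(lws > 0);

    if (ocl_major >= 2) {
        /* Assumption: non-uniform work-groups are usable, so the global
         * work size does not need to be padded. */
        return real_ws;
    }

    /* Before OpenCL 2.0 the global work size must be a multiple of the
     * local work size, so round up to the next multiple. */
    return ((real_ws + lws - 1) / lws) * lws;
}
```

With a real work size of 1000 and a local work size of 64, the pre-2.0 branch pads the global work size to 1024, so the kernel needs a bounds check; the 2.0 branch returns 1000 unchanged.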
Allow developers to register a callback function that is invoked whenever a [g_error](https://github.com/fakenmc/g_err_macros) macro is triggered, e.g. to produce stack traces.
Improve test coverage reported by [Codecov](https://codecov.io/gh/fakenmc/cf4ocl). This will be an ongoing open issue. Comments will reflect coverage improvements.
This would allow developers to use and reference/dereference raw OpenCL objects outside of cf4ocl with increased safety.
By composite events I mean user-defined virtual events that start with one OpenCL event (e.g. enqueue_map.start) and stop with another (e.g. enqueue_unmap.stop).
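A rough sketch of the idea, independent of cf4ocl and of the OpenCL headers: the `prof_event` and `composite_event` types and the two functions below are hypothetical names used only for illustration. The timestamps are assumed to be in nanoseconds, as would be obtained from clGetEventProfilingInfo with CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one profiled OpenCL event (nanoseconds). */
struct prof_event {
    const char *name;
    uint64_t start;
    uint64_t end;
};

/* Hypothetical composite event: a virtual event spanning from the start
 * of one real event to the end of another. */
struct composite_event {
    const char *name;
    uint64_t start;
    uint64_t end;
};

/* Build a composite event from the start of `first` to the end of `last`. */
static struct composite_event composite_make(const char *name,
    const struct prof_event *first, const struct prof_event *last)
{
    struct composite_event ce;

    assert(first != NULL && last != NULL);
    /* The closing event must not finish before the opening one starts. */
    assert(last->end >= first->start);

    ce.name = name;
    ce.start = first->start;
    ce.end = last->end;
    return ce;
}

/* Duration of the composite event, in the same unit as the timestamps. */
static uint64_t composite_duration(const struct composite_event *ce)
{
    return ce->end - ce->start;
}
```

For example, combining a map event spanning 100 to 150 with an unmap event spanning 400 to 420 gives a composite event covering 100 to 420, i.e. the whole period during which the buffer was mapped.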