noma
The kernel works well for me locally, so thanks for writing it. However, I have trouble getting it to run remotely via [remote_ikernel](https://pypi.org/project/remote_ikernel/). It is unclear to me what the kernel...
Add support for allocating buffers using huge pages. Note to self: this is a work in progress, already started in an uncommitted branch.
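For reference, a minimal sketch of what huge-page buffer allocation could look like on Linux. This is not the code from the uncommitted branch. The names `huge_buffer`, `huge_buffer_alloc`, `huge_buffer_free` and `HUGE_PAGE_SIZE` are hypothetical, and a 2 MiB huge page size is assumed. It first tries an explicit huge-page mapping via `mmap` with `MAP_HUGETLB`, and falls back to a 2 MiB-aligned allocation with a transparent-huge-page hint when no huge pages are reserved on the system.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Assumption: 2 MiB huge pages (the common x86-64 default). */
#define HUGE_PAGE_SIZE ((size_t)2 * 1024 * 1024)

/* Hypothetical buffer handle, for illustration only. */
typedef struct {
    void  *ptr;     /* start of the buffer, NULL if unallocated */
    size_t size;    /* usable size, rounded up to HUGE_PAGE_SIZE */
    int    is_mmap; /* 1 if obtained via mmap, 0 if via posix_memalign */
} huge_buffer;

/* Returns 0 on success, -1 on failure. */
static int huge_buffer_alloc(huge_buffer *buf, size_t size)
{
    buf->ptr = NULL;
    buf->size = 0;
    buf->is_mmap = 0;

    if (size == 0 || size > SIZE_MAX - HUGE_PAGE_SIZE)
        return -1;

    /* Round the request up to a whole number of huge pages. */
    size_t rounded = (size + HUGE_PAGE_SIZE - 1) / HUGE_PAGE_SIZE * HUGE_PAGE_SIZE;
    assert(rounded % HUGE_PAGE_SIZE == 0);

    void *p = MAP_FAILED;
#ifdef MAP_HUGETLB
    /* Explicit huge pages: fails if none are reserved by the administrator. */
    p = mmap(NULL, rounded, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
    if (p != MAP_FAILED) {
        buf->ptr = p;
        buf->size = rounded;
        buf->is_mmap = 1;
        return 0;
    }

    /* Fallback: aligned heap memory plus a transparent-huge-page hint. */
    if (posix_memalign(&p, HUGE_PAGE_SIZE, rounded) != 0)
        return -1;
#ifdef MADV_HUGEPAGE
    (void)madvise(p, rounded, MADV_HUGEPAGE); /* advisory only, may be ignored */
#endif
    buf->ptr = p;
    buf->size = rounded;
    return 0;
}

static void huge_buffer_free(huge_buffer *buf)
{
    if (buf->ptr == NULL)
        return;
    if (buf->is_mmap)
        munmap(buf->ptr, buf->size);
    else
        free(buf->ptr);
    buf->ptr = NULL;
    buf->size = 0;
    buf->is_mmap = 0;
}
```

A caller would use `huge_buffer_alloc(&buf, n)`, work with `buf.ptr`, and release it with `huge_buffer_free(&buf)`. Recording which path produced the memory matters because `munmap` and `free` are not interchangeable.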
I looked at the generated assembler code for auto vectorisation across work-items, and for manual vectorisation using 4 and 8 width vector types. file kernel_auto.cl ``` __kernel void benchmark_op( __global...
The following line makes the library incompatible with OpenCL 1.1, which means it does not work with Nvidia's OpenCL implementation shipped with the CUDA SDK: https://github.com/Computing-Language-Utility/CLU/blob/master/clu_runtime/clu_runtime.cpp#L1333