unaligned numpy arrays

Open nachiket opened this issue 7 years ago • 3 comments

For FPGA execution of OpenCL kernels, the board expects 64-byte-aligned host arrays. However, there seems to be no way to make numpy arrays obey a custom alignment (the built-in ALIGNMENT appears to be 16 or so). In most cases this is harmless, but sometimes it causes the FPGA OpenCL kernel to stall or freeze. Is there some way to get PyOpenCL to align buffers on the host prior to transfer?
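
A quick way to see what alignment NumPy actually hands out is to look at the data pointer modulo the boundary of interest. This is only a minimal sketch: `alignment_report` is a hypothetical helper name, and the numbers it prints vary with the NumPy version, platform, and allocation size.

```python
import numpy as np


def alignment_report(arr, boundaries=(16, 32, 64)):
    """Map each boundary to (data address % boundary); 0 means aligned."""
    address = arr.__array_interface__['data'][0]
    return {b: address % b for b in boundaries}


a = np.zeros(1000, dtype=np.float32)
print(alignment_report(a))      # values depend on the allocator
print(alignment_report(a[1:]))  # this slice starts 4 bytes after `a`
```

A remainder of 0 for the 64 entry means the array already satisfies the board's requirement; anything else means the host pointer is not 64-byte aligned.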

nachiket avatar Feb 13 '17 21:02 nachiket

The execution is stuck at futex(0x7ff4e623bb08, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff

It seems similar to http://stackoverflow.com/questions/10306669/opencl-kernel-hangs-forever-unless-i-remove-parameters/42214687#42214687 and the thread at https://lists.tiker.net/pipermail/pyopencl/2012-April/001158.html. But no resolution seems to have been posted.

nachiket avatar Feb 13 '17 22:02 nachiket

To my mind, this is a mailing list/tech support issue more than a bug in PyOpenCL.

I'd suggest you use code like this to align your numpy arrays to start with. PyOpenCL can't really do much about the alignment of data that already exists (short of copying, which is almost definitely not what anybody wants).

There's also not much of an argument that PyOpenCL should do anything about this, because (e.g.) clEnqueueWriteBuffer is specified to accept any pointer. An implementation that fails to do so is non-conforming.
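
For data that already exists, the copy-only-when-needed approach could be sketched as below. This is an illustration under assumptions, not something PyOpenCL provides: `is_aligned` and `ensure_aligned` are hypothetical helper names, and the 64-byte default is taken from the original report.

```python
import numpy as np


def is_aligned(arr, boundary=64):
    """Return True if the first byte of `arr` is on a `boundary`-byte address."""
    return arr.ctypes.data % boundary == 0


def ensure_aligned(arr, boundary=64):
    """Return `arr` itself if already aligned and C-contiguous, else an aligned copy.

    Hypothetical helper: over-allocates a byte buffer and slices it so the
    returned view starts on a multiple of `boundary`.
    """
    if arr.flags.c_contiguous and is_aligned(arr, boundary):
        return arr
    nbytes = arr.nbytes
    raw = np.empty(nbytes + boundary, dtype=np.uint8)
    offset = (-raw.ctypes.data) % boundary
    out = raw[offset:offset + nbytes].view(arr.dtype).reshape(arr.shape)
    out[...] = arr  # the one copy, made only for unaligned input
    return out
```

Calling `ensure_aligned` on the host array just before handing it to a transfer keeps the copy explicit and under the caller's control, rather than hidden inside the library.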

inducer avatar Feb 13 '17 23:02 inducer

Found a simpler solution at http://numpy-discussion.10968.n7.nabble.com/Byte-aligned-arrays-td3887.html which fixes alignment, but is still a bit messy:

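import numpy as np

def aligned_zeros(shape, boundary=64, dtype=float, order='C'):
    # Over-allocate raw bytes, then slice so the data starts on a multiple of `boundary`.
    d = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * d.itemsize  # int() because np.prod(()) returns a float
    tmp = np.zeros(nbytes + boundary, dtype=np.uint8)
    address = tmp.__array_interface__['data'][0]
    offset = (boundary - address % boundary) % boundary
    # The returned view keeps the over-allocated buffer alive via its .base attribute.
    return tmp[offset:offset + nbytes].view(dtype=d).reshape(shape, order=order)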

nachiket avatar Feb 14 '17 16:02 nachiket