pycuda
`__neg__` failing for empty arrays
Here's the MWE
```python
>>> import pycuda.autoinit
>>> import pycuda.gpuarray as gpuarray
>>> import numpy as np
>>> empty_array = np.array([])
>>> empty_array_gpu = gpuarray.to_gpu(empty_array)
>>> neg_empty_array = -empty_array  # array([], dtype=float64)
>>> neg_empty_array_gpu = -empty_array_gpu  # Fails
```
Here's the error trace
```
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 -gpuarray.to_gpu(a)

File ~/pycuda/pycuda/gpuarray.py:643, in GPUArray.__neg__(self)
    641 def __neg__(self):
    642     result = self._new_like_me()
--> 643     return self._axpbz(-1, 0, result)

File ~/pycuda/pycuda/gpuarray.py:468, in GPUArray._axpbz(self, selffac, other, out, stream)
    463     raise RuntimeError(
    464         "only contiguous arrays may " "be used as arguments to this operation"
    465     )
    467 func = elementwise.get_axpbz_kernel(self.dtype, out.dtype)
--> 468 func.prepared_async_call(
    469     self._grid,
    470     self._block,
    471     stream,
    472     selffac,
    473     self.gpudata,
    474     other,
    475     out.gpudata,
    476     self.mem_size,
    477 )
    479 return out

File ~/pycuda/pycuda/driver.py:626, in _add_functionality.<locals>.function_prepared_async_call(func, grid, block, stream, *args, **kwargs)
    620     raise TypeError(
    621         "unknown keyword arguments: " + ", ".join(kwargs.keys())
    622     )
    624 from pycuda._pvt_struct import pack
--> 626 arg_buf = pack(func.arg_format, *args)
    628 for texref in func.texrefs:
    629     func.param_set_texref(texref)

error: required argument is not an integer
```
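The failure can be illustrated without a GPU. The sketch below uses the standard-library `struct` module and assumes that `pycuda._pvt_struct.pack` behaves similarly and that an empty array's `gpudata` is `None`. The format string `ARG_FORMAT` and the helper `try_pack` are made up for illustration and are not PyCUDA's actual values.

```python
import struct

# Hypothetical argument format: double, pointer, double, pointer, unsigned long.
# The real format string used by PyCUDA's axpbz kernel may differ.
ARG_FORMAT = "dPdPL"


def try_pack(*args):
    """Return the packed bytes, or the struct.error message on failure."""
    try:
        return struct.pack(ARG_FORMAT, *args)
    except struct.error as exc:
        return str(exc)


# A non-empty array would supply integer device addresses for the pointers.
ok = try_pack(-1.0, 0x1000, 0.0, 0x2000, 4)

# An empty array is assumed to carry gpudata = None, which cannot be packed
# into a pointer slot.
failed = try_pack(-1.0, None, 0.0, None, 0)
print(failed)
```

If the assumption holds, the pointer slot is what rejects `None`, so the problem would not be specific to `__neg__` but would affect any kernel launch that receives an empty array's `gpudata`.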
I see the following two approaches:
1. Rewrite the kernel launches in `pycuda.gpuarray` to guard against passing `None` as an argument, or
2. Modify `Function.prepared_call` to accept `None` as a valid argument.
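As a rough illustration of the first approach, the sketch below guards a launch behind a size check. `FakeArray`, `negate`, and `launch_kernel` are hypothetical stand-ins so the example runs without a GPU; they are not PyCUDA code, and the assumption that empty arrays hold `gpudata = None` is taken from the discussion here.

```python
class FakeArray:
    """Stand-in for GPUArray (hypothetical, for illustration only)."""

    def __init__(self, size):
        self.size = size
        # Assumption: empty arrays have no device allocation.
        self.gpudata = None if size == 0 else object()


def negate(arr, launch_kernel):
    """Return a new array, launching the kernel only if there is work to do."""
    result = FakeArray(arr.size)
    if arr.size == 0:
        # Nothing to compute, so skip the launch and never pass None along.
        return result
    launch_kernel(arr, result)
    return result


launches = []


def record_launch(src, dst):
    # Record the call instead of launching a real kernel.
    launches.append((src.size, dst.size))


empty_result = negate(FakeArray(0), record_launch)
full_result = negate(FakeArray(3), record_launch)
```

A guard like this would have to be repeated at every launch site, or factored into a shared helper.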
Taking (1) would lead to an uglier implementation of `GPUArray`, and taking (2) might lead to user mishaps that could be reflected as segfaults.
Maybe we could introduce another kw-only argument to `prepared_call`, called `accept_none`, which is `False` by default and, if `True`, either doesn't launch the kernel or maps such arguments to 0.
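A minimal sketch of the "map to 0" variant of that idea follows. It uses the standard-library `struct` module in place of `pycuda._pvt_struct`, and `pack_kernel_args` is a hypothetical helper rather than an existing PyCUDA function. Only the `accept_none` name comes from the proposal above.

```python
import struct


def pack_kernel_args(arg_format, *args, accept_none=False):
    """Pack kernel arguments, optionally mapping None to a null pointer (0).

    Hypothetical helper for illustration; not part of PyCUDA.
    """
    if accept_none:
        # Opt-in only: replace None with 0 so pointer slots pack as NULL.
        args = tuple(0 if arg is None else arg for arg in args)
    return struct.pack(arg_format, *args)


# With the flag set, None packs exactly like an explicit 0.
packed = pack_kernel_args("PL", None, 0, accept_none=True)

# Without the flag, the existing behaviour (an error) is preserved.
try:
    pack_kernel_args("PL", None, 0)
    raised = False
except struct.error:
    raised = True
```

Note that this version processes each argument in Python, which adds per-launch overhead. Doing the same mapping inside the packing routine itself would avoid that cost.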
@inducer: Any comments?
If you can do 2 without introducing per-argument processing in Python (such as by modifying the custom struct packing), that could be viable.