pyopencl
pyopencl copied to clipboard
Add __array__ method to pyopencl.Array
Hello! We are beginning to experiment with OpenCL over at scikit-image (see e.g. https://github.com/scikit-image/scikit-image/pull/4215). For us, it would be ideal np.asarray(pyopencl_array)
worked transparently. The __array__
method just calls get
, but enables easier compatibility within the NumPy ecosystem.
Here's the approach demonstrated with monkeypatching:
In [1]: import numpy as np
In [2]: from pyopencl.array import Array
In [3]: from gputools import OCLArray
In [4]: arr = OCLArray.from_array(np.random.random((32, 32)))
In [5]: type(arr)
Out[5]: pyopencl.array.Array
In [6]: np.asarray(arr)
Out[6]:
array([[array(0.2397247), array(0.77347383), array(0.96216983), ...,
array(0.5608753), array(0.30588483), array(0.52376195)],
[array(0.70506864), array(0.22409207), array(0.84294354), ...,
array(0.21627319), array(0.72259475), array(0.36756701)],
[array(0.37779019), array(0.23074081), array(0.31592551), ...,
array(0.45015637), array(0.08316402), array(0.15636401)],
...,
[array(0.30017419), array(0.39220295), array(0.0588779), ...,
array(0.49011532), array(0.6003594), array(0.78047825)],
[array(0.14872931), array(0.76511074), array(0.69126287), ...,
array(0.0361155), array(0.07902312), array(0.12513675)],
[array(0.80042382), array(0.16620888), array(0.28844463), ...,
array(0.065685), array(0.9465674), array(0.56786556)]],
dtype=object)
In [7]: Array.__array__ = Array.get
In [8]: np.asarray(arr)
Out[8]:
array([[0.2397247 , 0.77347383, 0.96216983, ..., 0.5608753 , 0.30588483,
0.52376195],
[0.70506864, 0.22409207, 0.84294354, ..., 0.21627319, 0.72259475,
0.36756701],
[0.37779019, 0.23074081, 0.31592551, ..., 0.45015637, 0.08316402,
0.15636401],
...,
[0.30017419, 0.39220295, 0.0588779 , ..., 0.49011532, 0.6003594 ,
0.78047825],
[0.14872931, 0.76511074, 0.69126287, ..., 0.0361155 , 0.07902312,
0.12513675],
[0.80042382, 0.16620888, 0.28844463, ..., 0.065685 , 0.9465674 ,
0.56786556]])
For more information, see: https://docs.scipy.org/doc/numpy/user/basics.dispatch.html#module-numpy.doc.dispatch https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html
Thank you!
@inducer Yikes, looks like a bunch of stuff broken in CI that I hope is independent from this change and that probably takes priority over this. However, could I get some comment from you about whether this would be a welcome addition? Thanks!
CI says the failure is in the test you added:
_ test_array_method[<context factory for <pyopencl.Device 'pthread-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz' on 'Portable Computing Language'>] _
Traceback (most recent call last):
File "/home/vsts/work/1/s/test/test_array.py", line 670, in test_array_method
arr = make_random_array(queue, np.float32, (32, 32))
File "/home/vsts/work/1/s/test/test_array.py", line 75, in make_random_array
return rand(queue, shape=(size,), dtype=dtype)
File "/home/vsts/work/1/s/.miniconda3/envs/testing/lib/python2.7/site-packages/pyopencl/clrandom.py", line 765, in rand
result = Array(queue, shape, dtype)
File "/home/vsts/work/1/s/.miniconda3/envs/testing/lib/python2.7/site-packages/pyopencl/array.py", line 500, in __init__
context, cl.mem_flags.READ_WRITE, alloc_nbytes)
TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. pyopencl._cl.Buffer(context: pyopencl._cl.Context, flags: int, size: int = 0, hostbuf: object = None)
Invoked with: <pyopencl.Context at 0x55cc47968910 on <pyopencl.Device 'pthread-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz' on 'Portable Computing Language' at 0x55cc329c4fa0>>, 1, (32, 32, 32, 32, 32, 32, 32, 32)
(You can click through to "view details on Azure Pipelines" from the CI results)
I'm not sure I'm a fan of this, as Array.get()
can be quite expensive if you're not careful, especially if you end up sloshing data back and forth between CPU and GPU inadvertently. So I'd like to retain an obvious syntactic hint that what you're doing is not free.
I should ask, could you explain the specific need filled by having asarray
work? Is it just convenience or something deeper?
Ooops! Sorry, I somehow didn't find that line in the logs!
Re the utility of it: it's common and very useful to allow np.asarray
on array-likes to work as expected, as it allows unifying code in pipelines that, of necessity, mix OpenCL and NumPy arrays. In my case, I'm trying to allow dispatching certain operations in scikit-image to the GPU, if OpenCL is correctly configured. Due to the prevalence of scikit-image, this must be optional. This means that I will end up peppering a lot of code with .get()
s that could otherwise be totally transparent. In this code, for example, a simple shim around gputools.affine
could allow me get rid of that entire if clause:
https://github.com/jni/scikit-image/commit/0c6fb69974278dca696a0cb5409a41952c830d2c#diff-28412ceac9e3406ba1842d7ba3938e44R177-R183
Yes, I understand that there are performance considerations, but a certain amount of responsibility should fall on library creators. In this case, because not everything I need is implemented in gputools, some operations have to happen in NumPy, and it'd be great if I could just feed in my pyopencl arrays transparently.
This whole endeavour is reasonably contentious, so the smaller I can keep the impact to the code, the better...
Thanks!
- Your code is still failing the Flake8 linter, see CI. FWIW, it is expected that your code passes CI before it gets merged.
- What will this do for
numpy_array + cl_array
? - Will
np.asarray
make another copy? Or just return what__array__
returns?
@inducer Apologies for the churn!
Your code is still failing the Flake8 linter, see CI.
Fixed.
FWIW, it is expected that your code passes CI before it gets merged.
Of course. Thanks for the reminder. =)
The Travis failures appear to be NumPy compilation failures, which I think are unrelated? (That's what confused me with the first commit.)
What will this do for
numpy_array + cl_array
?
This currently fails with a ValueError in PyOpenCL:
In [6]: np.random.random((32, 32)) + arr
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-80631e85a40f> in <module>
----> 1 np.random.random((32, 32)) + arr
~/miniconda3/envs/cf/lib/python3.6/site-packages/pyopencl/array.py in __add__(self, other)
920 else:
921 # add a scalar
--> 922 if other == 0:
923 return self.copy()
924 else:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
This failure is the same with and without this PR. More work needs to be done to support this, which I'd be interested in pursuing but probably belongs in a new PR. I think the correct way to get automatic coercion here is to raise NotImplemented
, but I would have to verify this.
Will np.asarray make another copy?
Only if a specific dtype is requested that is different from the one provided by __array__
. From the np.asarray
docstring:
No copy is performed if the input is already an ndarray with matching dtype and order. If a is a subclass of ndarray, a base class ndarray is returned.
:+1: to this change... It would make dispatching to pyopencl arrays much easier.
@jni if you think that the CI issue is with numpy compilation, did you try rebasing on master and see if it disappears?
@inducer, @emmanuelle's suggestion worked! Would you mind giving this another round of review? =) Thank you!
@inducer do you have any lingering doubts about this? If it doesn't fit with your vision for pyopencl, please let me know, as I can potentially get this done from our end with subclassing or monkey-patching. I just prefer to contribute to the libraries I find useful — but I don't mean to presume what should go in!
Thanks again!