ENH, MAINT: how to handle other array-like for ufuncs?

Open tylerjereddy opened this issue 3 years ago • 6 comments

NumPy ufuncs can generally accept array-like objects such as pure Python lists and even scalars. To provide a similar API in pykokkos, we'd need a way to coerce the input, perhaps via np.asarray() or similar, though some thought may be needed for cases where we want to send the data directly to pykokkos without needing to pass through NumPy.

I noticed this when trying to replace np.cumsum() with pk.cumsum() in the SciPy source file scipy/spatial/_spherical_voronoi.py, with a traceback like this:

scipy/spatial/tests/test_spherical_voronoi.py:325: in test_equal_area_reconstitution
    areas = sv.calculate_areas()
        dim        = 3
        n          = 12
        points     = array([[ 0.        , -0.52573111, -0.85065081],
       [ 0.        , -0.52573111,  0.85065081],
       [ 0.        ,  ...5065081,  0.        ],
       [ 0.52573111, -0.85065081,  0.        ],
       [ 0.52573111,  0.85065081,  0.        ]])
        poly       = 'icosahedron'
        self       = <scipy.spatial.tests.test_spherical_voronoi.TestSphericalVoronoi object at 0x7f686edc6320>
        sv         = <scipy.spatial._spherical_voronoi.SphericalVoronoi object at 0x7f686cc6c610>
scipy/spatial/_spherical_voronoi.py:340: in calculate_areas
    return self._calculate_areas_3d()
        self       = <scipy.spatial._spherical_voronoi.SphericalVoronoi object at 0x7f686cc6c610>
scipy/spatial/_spherical_voronoi.py:266: in _calculate_areas_3d
    csizes = pk.cumsum(sizes)
        self       = <scipy.spatial._spherical_voronoi.SphericalVoronoi object at 0x7f686cc6c610>
        sizes      = [5, 5, 5, 5, 5, 5, ...]
../../../../../pykokkos/pykokkos/lib/ufuncs.py:421: in cumsum
    range_policy = pk.RangePolicy(pk.ExecutionSpace.Default, 0, view.shape[0])
E   AttributeError: 'list' object has no attribute 'shape'
        arr_type   = 'kokkos'
        view       = [5, 5, 5, 5, 5, 5, ...]
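The AttributeError above comes from reading .shape on a plain list. A minimal sketch of the np.asarray() coercion suggested at the top of this issue, assuming NumPy is installed; first_dim is a hypothetical stand-in for the view.shape[0] access inside pk.cumsum(), not pykokkos code:

```python
import numpy as np


def first_dim(view):
    # Hypothetical stand-in for the failing line: it reads view.shape[0].
    return view.shape[0]


sizes = [5, 5, 5, 5, 5, 5]

try:
    first_dim(sizes)
except AttributeError as exc:
    # A plain list has no .shape, which is the failure in the traceback.
    print("list input fails:", exc)

# np.asarray() turns array-like input into an ndarray ...
coerced = np.asarray(sizes)
print(first_dim(coerced))

# ... and hands an existing ndarray back unchanged (no new object).
print(np.asarray(coerced) is coerced)
```

This routes every list through NumPy, which is the detour the issue text wants to avoid for data that could go straight to pykokkos.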

tylerjereddy avatar Jul 20 '22 21:07 tylerjereddy

You just need to wrap the list in an object which satisfies the buffer protocol: https://docs.python.org/3/c-api/buffer.html.

If you pass the Kokkos view via pykokkos-base through this it will work, even though it isn't a numpy array, because the Kokkos view wrappers satisfy the buffer protocol, which is defined by Python, not numpy.
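To illustrate what "satisfies the buffer protocol" means, here is a small sketch using only the standard library. The stdlib array.array type is an assumption standing in for a Kokkos view; it is used here only because it exports a buffer without needing pykokkos-base installed:

```python
from array import array

# array.array exports a buffer; it stands in for a Kokkos view here.
values = array("d", [5.0, 5.0, 5.0])
view = memoryview(values)

# The buffer carries the metadata a consumer needs to walk the data.
print(view.shape, view.strides, view.ndim, view.format)

# The memoryview shares memory with the exporter: no data was copied.
view[0] = 1.0
print(values[0])

# A plain list does not export a buffer, so it cannot be consumed this way.
try:
    memoryview([5, 5, 5])
    list_is_buffer = True
except TypeError:
    list_is_buffer = False
print("list exports a buffer:", list_is_buffer)
```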

jrmadsen avatar Jul 20 '22 23:07 jrmadsen

So if you give a pykokkos ufunc a list, is it going to return a NumPy array or a pykokkos view? If the latter, then we can't serve as a drop-in replacement for NumPy without some way (e.g., an argument) to specify the return type, and most code out there will assume a NumPy array since that's what NumPy ufuncs return when given array-like input.

We could minimize disruption by making pykokkos views less different from NumPy arrays so that many calls/methods are supported, but we're not there yet.

I don't know if the Array API standard might be helpful here: https://data-apis.org/array-api/latest/purpose_and_scope.html

That might be a bit rigid for early dev work.

tylerjereddy avatar Jul 21 '22 23:07 tylerjereddy

If you look at the README of pykokkos-base, you will see:

import kokkos
import numpy as np

view = kokkos.array([2, 2], dtype=kokkos.double, space=kokkos.CudaUVMSpace,
                    layout=kokkos.LayoutRight, trait=kokkos.RandomAccess,
                    dynamic=False)

arr = np.array(view, copy=False)
print(type(arr).__name__) # this will print ndarray

arr is a numpy array. Kokkos Views are a drop-in replacement for NumPy arrays because both conform to the buffer protocol, so you can convert between them with zero data copying if need be. Most of the time you don't even need to convert them, because functions "expecting numpy arrays" are really just expecting an object satisfying the buffer protocol, meaning it has metadata such as the shape, the dimensions, the strides, etc., so the function can figure out how to iterate over the data.

So if you give a pykokkos ufunc a list, is it going to return a NumPy array or a pykokkos view?

If you want it to return the same type as was passed in: convert the list to a view, apply the ufunc to the view, convert the modified view back to a list. Really rough sketch:

def apply_ufunc(data, ...):

    convert_return = lambda x: x
    if isinstance(data, list):
        convert_return = lambda x: list(x)  # I forget the exact way you convert a view/array to a list
        data = kokkos.array(data, ...)

    ... apply ufunc on view ...

    return convert_return(data)

jrmadsen avatar Jul 22 '22 00:07 jrmadsen

In fact, I think I've seen in the numpy source that they simply wrote a decorator around all their ufuncs to handle the conversion, e.g.

@apply_ufunc
def sum(...):
    return _sum(...)

Where _sum is the real implementation, expecting an object that supports the buffer protocol, and the decorator handles the conversion before and after calling the real implementation.
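A runnable sketch of that decorator idea, under stated assumptions: accepts_array_like and the pure-Python cumsum below are hypothetical names, array.array stands in for constructing a Kokkos view, and every list is coerced to doubles for simplicity. It is not how NumPy or pykokkos implement this.

```python
import functools
from array import array


def accepts_array_like(func):
    """Hypothetical decorator: coerce list input to a buffer object and
    convert the result back to a list, as in the rough sketch above."""

    @functools.wraps(func)
    def wrapper(data, *args, **kwargs):
        was_list = isinstance(data, list)
        if was_list:
            # Stand-in for building a Kokkos view; always uses doubles.
            data = array("d", data)
        result = func(data, *args, **kwargs)
        # Hand back the same kind of container that was passed in.
        return list(result) if was_list else result

    return wrapper


@accepts_array_like
def cumsum(buf):
    # "Real" implementation: only ever sees an object exporting a buffer.
    out = array("d")
    total = 0.0
    for value in memoryview(buf):
        total += value
        out.append(total)
    return out


print(cumsum([5, 5, 5]))
print(type(cumsum(array("d", [1, 2]))).__name__)
```

Returning a list for list input follows the sketch above; NumPy itself returns an ndarray in that case, so a drop-in replacement would need a different conversion on the way out.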

jrmadsen avatar Jul 22 '22 00:07 jrmadsen

Kokkos views don't support all NumPy array methods yet, so there is a difference in drop-in behavior. In fact, I had to implement the size attribute just a few weeks ago. We could pass a list back, though NumPy doesn't, so that would also behave differently when replacing NumPy.

tylerjereddy avatar Jul 22 '22 00:07 tylerjereddy

So maybe we just try to make sure that Kokkos views match the NumPy array methods within reason then. gh-38 was obviously pretty easy to add. If we're missing anything else from the Array API specification we may want to add it: https://data-apis.org/array-api/2021.12/API_specification/index.html (I see that size is indeed listed there).

It looks like we might also be able to measure conformance using this eventually, assuming we care to go that far: https://data-apis.org/array-api/2021.12/verification_test_suite.html

Your example usage of copy=False also reminds me that copy=False currently carries no guarantee that there is no copy; it just tries not to copy. There is discussion about changing that, though, in https://github.com/data-apis/array-api/issues/465. Maybe we can keep an eye on that discussion.

tylerjereddy avatar Jul 22 '22 02:07 tylerjereddy