dask-glm
NEP-18: CuPy backend requires unimplemented algorithms
For the past few weeks, I've been working on issues related to NEP-18 support for Dask (and ultimately, Dask-GLM as well) to allow CuPy to be used as a backend. I've made some progress, which I'll soon share with more details.
Among the issues I have now is the usage of the following Dask-GLM algorithms with CuPy:
- newton: requires numpy.linalg.lstsq
- admm and lbfgs: require scipy.optimize.fmin_l_bfgs_b
Neither of these functions is implemented in CuPy. So I'm wondering if anybody knows whether either of them has an open and reliable CUDA implementation that could be integrated into CuPy, or whether we could implement them somewhat easily on top of already supported CuPy functionality?
cc @cicdw
My guess is that in both cases the resulting array (the input of lstsq or the gradient returned by compute_loss_grad) is probably very small, and so one approach would be to just always convert it into a NumPy array. My guess is that from a performance perspective this would have very little impact and would be easy to accomplish.
The function dask.array.utils.normalize_to_array was intended for this purpose (though it currently supports only the cupy case). We might consider renaming it to normalize_to_numpy.
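For reference, a minimal sketch of what that normalization does (this is an illustration, not the exact dask implementation; the string-based type check is an assumption about how cupy arrays are detected):

```python
import numpy as np

def normalize_to_array(x):
    # Sketch: coerce a device array to NumPy via cupy's .get(),
    # and pass host arrays through unchanged.
    if "cupy" in str(type(x)):
        return x.get()  # cupy.ndarray.get() copies device -> host
    return x

# Plain NumPy arrays pass through untouched:
a = np.arange(5)
b = normalize_to_array(a)
```
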
I'm not familiar with the CuPy / CUDA ecosystem, but in terms of implementing the two methods you mention: I'd suspect that some form of lstsq would be doable, but fmin_l_bfgs_b would probably be incredibly difficult.
So my proposal above isn't that we implement lstsq and fmin_l_bfgs_b in CUDA; it's that we just always coerce the necessary arrays to NumPy. My understanding is that the inputs to lstsq, and the outputs of compute_loss_grad, are on the order of the number of parameters, which is typically under 100 or so, and so are probably better suited to the CPU anyway. Is my understanding correct here @cicdw ?
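To illustrate the scale involved, a least-squares solve over a few dozen parameters is trivial on the CPU (the shapes and system below are made up for illustration; they are not dask-glm's actual matrices):

```python
import numpy as np

# A hypothetical Newton-step system: n_params is small (< 100), so
# solving it on the CPU with numpy.linalg.lstsq is cheap even when
# the large X and y data arrays live on the GPU.
rng = np.random.RandomState(0)
n_params = 20
A = rng.randn(n_params, n_params)   # e.g. a Hessian-like matrix
b = rng.randn(n_params)             # e.g. a gradient vector
step, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
```
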
Ah I see that now; yea that's all correct.
The idea to use normalize_to_array() sounds good to me. I'm now wondering about the case where we need to convert these outputs back to arrays of their original type. We can use *_like() but only for predefined fill values. Do we have any ideas on doing the reverse of the normalize_to_array() operation? I thought maybe introducing a copy_like() function in NumPy could be useful for that, but perhaps there's already a better alternative that I don't know about.
I think that it's probably OK if we always return numpy arrays. These outputs are likely to be small.
It's not OK here: https://github.com/dask/dask-glm/blob/master/dask_glm/algorithms.py#L347
We pass beta0, X and y to scipy.optimize.fmin_l_bfgs_b. beta0 has to be copied back to the CPU because it gets modified here: https://github.com/scipy/scipy/blob/master/scipy/optimize/lbfgsb.py#L326
Later on, x (formerly beta0) is passed here: https://github.com/scipy/scipy/blob/master/scipy/optimize/lbfgsb.py#L335, which actually wraps https://github.com/dask/dask-glm/blob/master/dask_glm/algorithms.py#L340. So here, X and y remain on the GPU, but beta is on the CPU. And this loop happens several times. If we copy X and y back to the CPU, we defeat the purpose of using CuPy in the first place.
Ah, my hope was that when X, y, and beta interacted that the GPU arrays would pull the CPU array to the GPU, rather than the other way around. One sec, I'm going to play with this briefly.
So I was hoping that things like the following would work
In [1]: import numpy as np
In [2]: import cupy
In [3]: x = np.arange(100)
In [4]: y = cupy.arange(100)
In [5]: z = x + y
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-8c8f78e0676d> in <module>
----> 1 z = x + y
~/cupy/cupy/core/core.pyx in cupy.core.core.ndarray.__array_ufunc__()
1734 for x in inputs
1735 ])
-> 1736 return cp_ufunc(*inputs, **kwargs)
1737 # Don't use for now, interface uncertain
1738 # elif method =='at' and name == 'add':
~/cupy/cupy/core/_kernel.pyx in cupy.core._kernel.ufunc.__call__()
790 raise TypeError('Wrong number of arguments for %s' % self.name)
791
--> 792 args = _preprocess_args(args)
793 if out is None:
794 in_args = args[:self.nin]
~/cupy/cupy/core/_kernel.pyx in cupy.core._kernel._preprocess_args()
84 arg = _scalar.convert_scalar(arg, use_c_scalar)
85 if arg is None:
---> 86 raise TypeError('Unsupported type %s' % typ)
87 ret.append(arg)
88 return ret
TypeError: Unsupported type <class 'numpy.ndarray'>
So, if we had empty_like(..., shape=) maybe that would allow us to do copy_like (without needing to actually add that to the NumPy API). Something like

if type(X) != type(beta):
    beta2 = np.empty_like(X, shape=beta.shape)
    beta2[:] = beta
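Packaged as a helper, that pattern might look like the sketch below. copy_like is a hypothetical name (not part of NumPy), and the shape= keyword to empty_like requires NumPy >= 1.17; the example exercises only the NumPy-to-NumPy path, since the NumPy-to-cupy assignment currently fails as shown in the session below:

```python
import numpy as np

def copy_like(values, like):
    # Hypothetical helper: allocate an array with the same array type
    # (backend) as `like` but the shape/dtype of `values`, then copy.
    out = np.empty_like(like, shape=values.shape, dtype=values.dtype)
    out[:] = values
    return out

X = np.ones((10, 3))
beta = np.array([1.0, 2.0, 3.0])
beta2 = copy_like(beta, X)
```
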
I wanted to verify that assignment worked between numpy and cupy, but found that it didn't:
In [6]: y[:] = x
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-2b3b50afef58> in <module>
----> 1 y[:] = x
~/cupy/cupy/core/core.pyx in cupy.core.core.ndarray.__setitem__()
1682
1683 """
-> 1684 _scatter_op(self, slices, value, 'update')
1685
1686 def scatter_add(self, slices, value):
~/cupy/cupy/core/core.pyx in cupy.core.core._scatter_op()
3608 if op == 'update':
3609 if not isinstance(value, ndarray):
-> 3610 y.fill(value)
3611 return
3612 x = value
~/cupy/cupy/core/core.pyx in cupy.core.core.ndarray.fill()
520 if isinstance(value, numpy.ndarray):
521 if value.shape != ():
--> 522 raise ValueError(
523 'non-scalar numpy.ndarray cannot be used for fill')
524 value = value.item()
ValueError: non-scalar numpy.ndarray cannot be used for fill
This seems like the kind of thing that cupy might accept as a PR. I was surprised to learn that this didn't work.
I actually had the exact same idea with the y[:] = x assignment before, but I assumed the non-automatic conversion was intentional. I think if we can allow that to work, it would be a good solution, without even needing to touch NumPy, which seems better.
I will see if I can write a PR for that, hopefully it won't be too complicated.
It may be intentional, I'm not sure. You might raise a cupy issue to ask about it. I just checked their issue tracker and didn't see anything about it yet.
I found two open issues related to what we just talked about here:
https://github.com/cupy/cupy/issues/593
https://github.com/cupy/cupy/issues/589