dask-glm
[WIP] Allow pure numpy array (not dask array) as inputs
Currently `dask_glm.estimators` only accepts `dask.array` as input, because of the lines linked below and other places where `._meta` is accessed without checking the data type.
https://github.com/dask/dask-glm/blob/7b2f85fe043eb29212755e67e33e3df553ed0e58/dask_glm/estimators.py#L67
https://github.com/dask/dask-glm/blob/7b2f85fe043eb29212755e67e33e3df553ed0e58/dask_glm/utils.py#L120-L124
Example code and error
Code:
from dask_glm.estimators import LogisticRegression
import numpy
x = numpy.random.rand(10,4)
y = numpy.random.rand(10)
lr = LogisticRegression()
lr.fit(x,y)
Error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-14-e644bf405118> in <module>
----> 1 lr.fit(x,y)
~/rapids/daskml_cupy/dask-glm/dask_glm/estimators.py in fit(self, X, y)
65 X_ = self._maybe_add_intercept(X)
66 fit_kwargs = dict(self._fit_kwargs)
---> 67 if is_dask_array_sparse(X):
68 fit_kwargs['normalize'] = False
69
~/rapids/daskml_cupy/dask-glm/dask_glm/utils.py in is_dask_array_sparse(X)
122 Check using _meta if a dask array contains sparse arrays
123 """
--> 124 return isinstance(X._meta, sparse.SparseArray)
125
126
AttributeError: 'numpy.ndarray' object has no attribute '_meta'
This PR allows plain numpy arrays (not dask arrays backed by numpy) to be passed directly as input.
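For illustration, here is a minimal sketch of one way the `_meta` check could tolerate non-dask input. It is an assumption about the general approach, not necessarily the change made in this PR: the function reuses the name `is_dask_array_sparse` from the traceback, but the `getattr` fallback and the optional `sparse` import are illustrative choices.

```python
import numpy as np

try:
    import sparse  # optional; only needed to recognise sparse chunks
    SPARSE_TYPES = (sparse.SparseArray,)
except ImportError:
    SPARSE_TYPES = ()  # without `sparse`, nothing can be a sparse array


def is_dask_array_sparse(X):
    """Return True only if X exposes a `_meta` that is a sparse array.

    Sketch only: objects without `_meta` (such as numpy.ndarray) yield
    None here instead of raising AttributeError.
    """
    meta = getattr(X, "_meta", None)
    return isinstance(meta, SPARSE_TYPES)


x = np.random.rand(10, 4)
print(is_dask_array_sparse(x))  # no AttributeError for a plain numpy array
```

With a guard like this, `fit` could reach the `normalize` branch for either input type, although any other place that reads `._meta` would need the same treatment.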
@mrocklin @pentschev I have added just one test for now. If it looks OK, could you please suggest which other tests I should extend with numpy input? Thank you!
~I think I'm going to finish this first and then move on to #89~ Not really. I'll move on to #89