dask-glm
Closes #63 xref https://github.com/dask/dask-ml/pull/94
Optimization can take less time if the gradient is approximated using a subset of the examples. The approach is detailed in "[Hybrid deterministic-stochastic methods for data fitting][1]", but more examples...
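To make the idea concrete, here is a minimal sketch of gradient descent where each step estimates the gradient from a random subset of rows, and the subset grows over time until it covers the full data set. This is not the PR's implementation: the function names, the least-squares loss, the growth factor, and the step size are all assumptions chosen for illustration, and plain Python lists stand in for Dask arrays.

```python
import random


def batch_gradient(w, X, y, idx):
    """Mean least-squares gradient computed only over the rows in idx."""
    grad = [0.0] * len(w)
    for i in idx:
        residual = sum(wj * xj for wj, xj in zip(w, X[i])) - y[i]
        for j, xj in enumerate(X[i]):
            grad[j] += residual * xj
    return [g / len(idx) for g in grad]


def fit_growing_batch(X, y, n_iter=200, step=0.5, batch0=4, growth=1.1, seed=0):
    """Gradient descent with a subsampled gradient (hypothetical helper).

    The batch starts at batch0 rows and is multiplied by `growth` each
    iteration, capped at the full data set, so early steps are cheap and
    late steps use the exact gradient.
    """
    rng = random.Random(seed)
    n = len(X)
    w = [0.0] * len(X[0])
    batch = float(batch0)
    for _ in range(n_iter):
        size = min(n, int(batch))
        idx = rng.sample(range(n), size)  # rows used for this step
        grad = batch_gradient(w, X, y, idx)
        w = [wj - step * gj for wj, gj in zip(w, grad)]
        batch *= growth
    return w


# Noiseless toy data: y = 2 + 3 * x, with a constant column for the intercept.
data_rng = random.Random(1)
X = [[1.0, data_rng.uniform(-1, 1)] for _ in range(200)]
y = [2.0 + 3.0 * row[1] for row in X]
w = fit_growing_batch(X, y)
```

On this toy problem `w` should end up close to `[2.0, 3.0]`. The early iterations touch only a handful of rows, which is where the time saving would come from on large data.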
This PR adds an abstract base class for distribution families. To track subclasses the same way Regularizers do, I added a RegistryClass that both ABCs can inherit from.
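One way such a shared registry could work is sketched below. This is an assumption about the pattern, not the code in the PR: the `Registry` mixin, the `name` class attribute, the `get` method, and the example subclasses are all hypothetical. Each abstract base that inherits the mixin directly gets its own name-to-class map, and concrete subclasses register themselves on definition.

```python
from abc import ABC, abstractmethod


class Registry:
    """Hypothetical mixin that records subclasses by their `name` attribute."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if Registry in cls.__bases__:
            # A new root (e.g. Family or Regularizer) gets its own map.
            cls._registry = {}
        else:
            # Register only classes that define `name` themselves.
            name = cls.__dict__.get("name")
            if name is not None:
                cls._registry[name] = cls

    @classmethod
    def get(cls, name):
        """Look up a registered subclass by name."""
        return cls._registry[name]


class Family(Registry, ABC):
    @abstractmethod
    def describe(self):
        ...


class Regularizer(Registry, ABC):
    @abstractmethod
    def describe(self):
        ...


class Logistic(Family):
    name = "logistic"

    def describe(self):
        return "logistic family"


class L1(Regularizer):
    name = "l1"

    def describe(self):
        return "l1 penalty"
```

With this layout, `Family.get("logistic")` returns `Logistic` and `Regularizer.get("l1")` returns `L1`, while the two registries stay separate.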
Ran into this for https://gist.github.com/TomAugspurger/30ec08cc29810b57b4cb4458828e46c9 Fixes https://github.com/dask/dask-glm/issues/13 (I think) One side issue: is it safe to assume that `X` will always be chunked *only* along the rows? Perhaps we should...
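As a sketch of how the row-only chunking assumption could be checked rather than assumed: a Dask array exposes its chunking through `.chunks`, a tuple holding one tuple of block sizes per axis. The helper below is hypothetical (it is not part of Dask-GLM) and works on that tuple directly, so it can be tried without a Dask installation.

```python
def is_row_chunked_only(chunks):
    """Return True if a 2-D chunk layout splits the array only along rows.

    `chunks` follows the layout of a Dask array's `.chunks` attribute,
    e.g. ((50, 50), (10,)) for two row blocks and one column block.
    """
    if len(chunks) != 2:
        raise ValueError("expected the chunks of a 2-D array")
    return len(chunks[1]) == 1


row_only = is_row_chunked_only(((50, 50), (10,)))      # one column block
split_columns = is_row_chunked_only(((50, 50), (5, 5)))  # two column blocks
```

With a real Dask array this would be called as `is_row_chunked_only(X.chunks)`, and a solver could rechunk or raise when it returns `False`.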
Previously admm would rechunk the columns to be in a single chunk, and then pass delayed numpy arrays to the local_update function. If the chunks along columns were of different...
The `environment.yml` doesn't match CI. For example, in `environment.yml` `python=3.5.2` is required but in CI, `python=3.7` is used.
After a bit of profiling, this is what I found out for Dask-GLM with Dask array: ``` 14339 0.139 0.000 0.814 0.000 /home/pentschev/.local/lib/python3.5/site-packages/dask/local.py:430(fire_task) 44898 19.945 0.000 19.945 0.000 {method 'acquire'...
Scikit-learn estimators have an `n_iter_` attribute. Dask-GLM estimators don't. For example, the scikit-learn logistic regression implementation exposes `n_iter_` ([docs][sklearn]). However, the Dask-GLM implementation doesn't ([docs][dask]). This issue presented itself...
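For illustration, the scikit-learn convention is that `fit` stores the number of iterations actually run in a trailing-underscore attribute. The toy estimator below is a hypothetical sketch of that convention, not Dask-GLM code: the class name, the one-parameter least-squares model, and the stopping rule are all assumptions.

```python
class ToyLinearEstimator:
    """Hypothetical one-parameter estimator that records n_iter_ on fit."""

    def __init__(self, max_iter=100, tol=1e-8, step=0.1):
        self.max_iter = max_iter
        self.tol = tol
        self.step = step

    def fit(self, x, y):
        w = 0.0
        n_iter = 0
        for n_iter in range(1, self.max_iter + 1):
            # Mean gradient of 0.5 * (w * x - y) ** 2 over the data.
            grad = sum((w * xi - yi) * xi for xi, yi in zip(x, y)) / len(x)
            w -= self.step * grad
            if abs(grad) < self.tol:
                break  # converged before reaching max_iter
        self.coef_ = w
        self.n_iter_ = n_iter  # iterations actually run
        return self


est = ToyLinearEstimator().fit([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

After fitting, `est.n_iter_` tells the caller whether the solver stopped early or ran into `max_iter`, which is the information the issue asks Dask-GLM to expose.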
To allow for easy testing of CuPy and Dask during this phase, a copy of the existing relevant Dask tests was created under https://github.com/dask/dask-glm/pull/75. This incurs a lot of...
For the past few weeks, I've been working on issues related to [NEP-18](https://www.numpy.org/neps/nep-0018-array-function-protocol.html) support for Dask (and ultimately, Dask-GLM as well) to allow CuPy to be used as a backend....