dask-ml

Scalable Machine Learning with Dask

Results 137 dask-ml issues

- Closes: https://github.com/dask/dask-ml/issues/734.
- Tests/`black`/`pyflakes` passing: Yes.
- Tests added to the following files:
  - `tests/test_pca.py`
  - `tests/test_incremental_pca.py`
- tl;dr of design choices made:
  - `IncrementalPCA` doesn't get affected by...

We need to add an `n_features_in_` attribute to our estimators. https://github.com/scikit-learn/scikit-learn/pull/16112 / https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep010/proposal.html https://dev.azure.com/dask-dev/dask/_build/results?buildId=1080&view=logs&j=d699f7f7-fbb1-5fe7-7f49-16a980a27dcf&t=47702d4b-eb38-52c1-7644-7bfb10716412
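As a rough illustration of what SLEP010 asks for, the sketch below records the number of input columns in `fit` and checks it again in `transform`. The class `ColumnMeanCenterer` is hypothetical and uses plain Python lists; it is not part of dask-ml or scikit-learn and only shows the attribute's intended behaviour.

```python
class ColumnMeanCenterer:
    """Hypothetical estimator, used only to illustrate `n_features_in_`."""

    def fit(self, X, y=None):
        rows = [list(row) for row in X]
        if not rows:
            raise ValueError("X must contain at least one row")
        # Record the number of features (columns) seen during fit.
        self.n_features_in_ = len(rows[0])
        if any(len(row) != self.n_features_in_ for row in rows):
            raise ValueError("all rows must have the same number of features")
        n_rows = len(rows)
        # Per-column mean, learned from the training data.
        self.mean_ = [sum(col) / n_rows for col in zip(*rows)]
        return self

    def transform(self, X):
        rows = [list(row) for row in X]
        for row in rows:
            # Reject input whose width differs from what fit saw.
            if len(row) != self.n_features_in_:
                raise ValueError(
                    f"X has {len(row)} features, but this estimator "
                    f"was fitted with {self.n_features_in_}"
                )
        return [[v - m for v, m in zip(row, self.mean_)] for row in rows]


est = ColumnMeanCenterer().fit([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(est.n_features_in_)
```

In a real estimator the same bookkeeping would operate on array shapes rather than list lengths.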

Updates associated with this PR:
- The original intention was to only pin code checking-related versions to their [dask](https://github.com/dask/dask/blob/main/.pre-commit-config.yaml) and [distributed](https://github.com/dask/distributed/blob/main/.pre-commit-config.yaml) equivalents (as per https://github.com/dask/dask/pull/7256 and https://github.com/dask/distributed/pull/4533).
- However, `isort`...

While working on a recent contribution, I followed the directions at https://ml.dask.org/contributing.html#documentation to re-generate this project's documentation. On `main`, that generates many warnings and errors. I think resolving these warnings...

good first issue

Fixes #787. Also related to #779.

A proposed solution to #746. I ran tests like this to confirm it works across expected values:

```
import numpy as np

# Print the rounded complement 1 - i for each candidate value in [0.01, 1.0).
for i in np.arange(0.01, 1.0, 0.0001):
    print(round(1 - i, 6))
```
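The loop above prints rounded complements. A small check of why the rounding may matter, assuming the underlying concern is floating-point error in `1 - x` (the text of #746 is not shown here, so this is a guess at the motivation). The helper name `complement` is hypothetical.

```python
def complement(x, ndigits=6):
    """Return 1 - x, rounded to suppress binary floating-point residue."""
    return round(1 - x, ndigits)


# Plain subtraction leaves a tiny error; the rounded version does not.
print((1 - 0.9) == 0.1)        # False
print(complement(0.9) == 0.1)  # True
```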

This PR proposes the introduction of a PR template, following the same structure as those in [dask](https://github.com/dask/dask/blob/main/.github/PULL_REQUEST_TEMPLATE.md) and [distributed](https://github.com/dask/distributed/blob/main/.github/PULL_REQUEST_TEMPLATE.md).

**What happened**: When running hyperparameter search with sklearn's RandomizedGridSearch and DaskXGBoostClassifier, I get the following error:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/anaconda3/envs/daskml3/lib/python3.7/site-packages/sklearn/utils/validation.py in _num_samples(x)
    209     try:
--> 210...
```

I took a stab at implementing a solution for issue #535. Adding a WIP label because currently the stratified split is not completely lazy for dask arrays (`compute_chunk_sizes` being called...

The main changes in this PR are:
1. Changed the `async def _fit` function in `dask_ml.model_selection._incremental.py` to allow Hyperband to work with non-dask arrays
2. Fixed default `test_size`...