Results 181 issues of Tom Augspurger

The JupyterHub Helm chart includes a `hub.baseUrl` for serving the hub under a path other than `/`, e.g. `/jupyter/`. I believe this breaks the automatic API setting at https://github.com/dask/dask-gateway/blob/7d1659db8b2122d1a861ea820a459fd045fc3f02/resources/helm/dask-gateway/templates/gateway/configmap.yaml#L97. I'm...
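To make the concern concrete, here is a minimal sketch of how a non-root base URL shifts the hub API address. The helper `hub_api_url` and the host `hub:8081` are hypothetical and used only for illustration; they are not taken from the dask-gateway chart, whose template may build the URL differently. The sketch assumes only that JupyterHub serves its REST API under `<base_url>hub/api`.

```python
def hub_api_url(hub_host: str, base_url: str = "/") -> str:
    """Build the JupyterHub REST API URL for a hub served under ``base_url``.

    Hypothetical helper for illustration; not part of dask-gateway or JupyterHub.
    """
    # Normalize so the prefix always starts and ends with exactly one slash.
    path = base_url.strip("/")
    prefix = f"/{path}/" if path else "/"
    return f"http://{hub_host}{prefix}hub/api"


# Hub served at the root path.
print(hub_api_url("hub:8081"))               # http://hub:8081/hub/api
# With a base URL of /jupyter/, the API moves under that prefix.
print(hub_api_url("hub:8081", "/jupyter/"))  # http://hub:8081/jupyter/hub/api
```

A client configured with the first URL while the hub is actually served under `/jupyter/` would send its requests to the wrong path.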

bug
documentation

Over in https://github.com/pangeo-data/pangeo-binder/issues/143, we've noticed that some dask-gateway pods hang around. The root cause is likely something like https://github.com/ipython/ipykernel/issues/462. But even if that's fixed, there might be cases like a...

Right now, IIUC, to create a cluster using the button the config takes a python class, args, and kwargs to create the cluster. This isn't flexible enough for dask-gateway, which...

enhancement
help wanted

https://github.com/dask/dask-ml/blob/d5801584d092d8f13f1b38aaf4da5dc3caa6a213/dask_ml/datasets.py#L332 isn't great, especially in settings like Hyperband #221, that are using the distributed scheduler. We could probably replace

```python
rng = dask_ml.utils.check_random_state(random_state)
```

with

```python
rng = sklearn.utils.check_random_state(random_state)
```

...

good first issue

https://github.com/dask/dask-ml/runs/6626696391?check_suite_focus=true

```pytb
=================================== FAILURES ===================================
_____________________________ test_check_estimator _____________________________

    def test_check_estimator():
        with warnings.catch_warnings(record=True):
            warnings.simplefilter("ignore", RuntimeWarning)
>           check_estimator(DKKMeans())

tests/test_kmeans.py:27:
_ _ _ _ _ _ _ _ _ _ _ _ _...
```

https://github.com/dask/dask-ml/pull/863/files#diff-970a9a902963a060977ac1912b5a86a5177cebfd468517967f63335bb6d024eaR435-R437 added an empty `check_consistent_lengths`. We should implement a real one. Unlike scikit-learn, we don't actually want to check the lengths when constructing the graphs. But we do want to...

https://github.com/dask/dask-ml/pull/863 added `dask_ml.base.DaskMLBaseMixin`, which implements `_validate_data`. We should use that in more models for consistent data validation and feature name handling. Currently, we're only using it in `preprocessing/data.py`.

Should we add a `compute` keyword to all the `fit` / `partial_fit` on estimators we implement? This would aid with

- debugging the graphs we build
- Better scheduling (any...
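One way such a keyword could behave is sketched below. Everything here is an assumption for illustration: `MeanEstimator` and `LazyResult` are hypothetical names, and `LazyResult` is a plain-Python stand-in for whatever lazy object dask would actually return. With `compute=True` the fit runs eagerly; with `compute=False` the caller receives the unevaluated work instead.

```python
from typing import Callable, List, Union


class LazyResult:
    """Stand-in for a lazy task graph (hypothetical; dask would supply the real one)."""

    def __init__(self, func: Callable[[], "MeanEstimator"]):
        self._func = func

    def compute(self) -> "MeanEstimator":
        # Run the deferred work and return the fitted estimator.
        return self._func()


class MeanEstimator:
    """Toy estimator whose ``fit`` learns the mean of ``X``."""

    def fit(
        self, X: List[float], compute: bool = True
    ) -> Union["MeanEstimator", LazyResult]:
        def _fit() -> "MeanEstimator":
            self.mean_ = sum(X) / len(X)
            return self

        lazy = LazyResult(_fit)
        # compute=True runs the work immediately; compute=False hands back
        # the unevaluated work so the caller can inspect or schedule it.
        return lazy.compute() if compute else lazy


eager = MeanEstimator().fit([1.0, 2.0, 3.0])
lazy = MeanEstimator().fit([1.0, 2.0, 3.0], compute=False)
print(eager.mean_)           # 2.0
print(lazy.compute().mean_)  # 2.0
```

Defaulting to `compute=True` in this sketch means existing callers of `fit` would see no change unless they opt in.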

Roadmap

We need to add an `n_features_in_` attribute to our estimators. https://github.com/scikit-learn/scikit-learn/pull/16112 / https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep010/proposal.html https://dev.azure.com/dask-dev/dask/_build/results?buildId=1080&view=logs&j=d699f7f7-fbb1-5fe7-7f49-16a980a27dcf&t=47702d4b-eb38-52c1-7644-7bfb10716412
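As a rough illustration of the convention, the sketch below records the number of columns seen during `fit` and checks it again in `transform`. `ColumnMeans` is a hypothetical toy transformer over plain nested lists, not a dask-ml or scikit-learn class, and the error message is made up for the example.

```python
from typing import List, Sequence


class ColumnMeans:
    """Toy transformer (hypothetical) that records ``n_features_in_`` during ``fit``."""

    def fit(self, X: Sequence[Sequence[float]]) -> "ColumnMeans":
        # Number of columns seen at fit time.
        self.n_features_in_ = len(X[0])
        self.means_ = [sum(col) / len(col) for col in zip(*X)]
        return self

    def transform(self, X: Sequence[Sequence[float]]) -> List[List[float]]:
        n_features = len(X[0])
        # Reject input whose width differs from what ``fit`` saw.
        if n_features != self.n_features_in_:
            raise ValueError(
                f"X has {n_features} features, but this estimator was "
                f"fitted with {self.n_features_in_} features."
            )
        return [[v - m for v, m in zip(row, self.means_)] for row in X]


est = ColumnMeans().fit([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(est.n_features_in_)  # 3
```

Calling `est.transform([[1.0, 2.0]])` after this fit raises `ValueError`, since the input has two columns rather than three.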

This cleans up code duplication (estimators, sparse handling) and merges in https://github.com/dask/dask-glm/pull/79. I'm hoping that by focusing things here I'll be able to stay on top of things better. This...