Tom Augspurger issues

Results: 181 issues by Tom Augspurger

Run helm upgrade with --dry-run on pull requests

That would have caught the error causing the failure on staging right now :)
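A minimal sketch of what such a pull-request check could look like, assuming a CI job that has `helm` on its PATH and can reach the cluster holding the release. The release name, chart path, and values file in the usage note are hypothetical placeholders, not names from this repository.

```python
import shlex
import subprocess


def helm_dry_run_command(release, chart, values_files=()):
    """Build a `helm upgrade --dry-run` command as an argument list."""
    cmd = ["helm", "upgrade", release, chart, "--dry-run"]
    for path in values_files:
        # Each values file is passed through helm's --values flag.
        cmd += ["--values", path]
    return cmd


def run_dry_run(release, chart, values_files=()):
    """Run the dry run; return True if helm exits with status 0."""
    cmd = helm_dry_run_command(release, chart, values_files)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd).returncode == 0
```

A CI step could call `run_dry_run("staging", "./chart", ["staging-values.yaml"])` (hypothetical names) and fail the build when it returns `False`, so a broken chart or values file is flagged before the change is merged.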

Write down policy on adding triage members.

GitHub has a new triage role. We should decide what it takes to become a triager and document that policy here.

Updates to rasp-data-loading.ipynb

I'm playing with the example from https://github.com/pangeo-data/ml-workflow-examples/pull/2. See https://nbviewer.jupyter.org/gist/TomAugspurger/f23c5342bef938a120b83a11d1cae077 for the updates. On this subset, it seems like the dask + xarray overhead over h5py is about 2x. I think...

Deprecate estimators

Closes #63
xref https://github.com/dask/dask-ml/pull/94

ENH: Allow add_intercept for unknown dims

Ran into this for https://gist.github.com/TomAugspurger/30ec08cc29810b57b4cb4458828e46c9
Fixes https://github.com/dask/dask-glm/issues/13 (I think)
One side issue: is it safe to assume that `X` will always be chunked *only* along the rows? Perhaps we should...

Intercepts should not be regularized

xref https://github.com/dask/dask-ml/issues/84#issuecomment-34377215
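To illustrate the idea in the title, here is a small sketch of an L2 penalty that skips the intercept term. It is not the dask-glm implementation; the function name, the `intercept_index` parameter, and the `0.5 * lam * sum(w**2)` form of the penalty are assumptions made for illustration.

```python
def l2_penalty(coef, lam, intercept_index=None):
    """Return 0.5 * lam * sum(w**2) over all coefficients except the intercept.

    `intercept_index` is a hypothetical parameter marking which entry of
    `coef` is the intercept; pass None to penalize every entry.
    """
    return 0.5 * lam * sum(
        w * w for i, w in enumerate(coef) if i != intercept_index
    )


coef = [1.0, 2.0, 10.0]  # last entry plays the role of the intercept
print(l2_penalty(coef, lam=1.0))                      # penalizes all three entries
print(l2_penalty(coef, lam=1.0, intercept_index=2))   # leaves the intercept out
```

With the intercept excluded, a large offset in the target no longer inflates the penalty, so the regularization strength only shrinks the slope coefficients.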

Package reorganization

It'd be good to clarify the boundaries of dask-glm and dask-ml. My motivation is building up a set of utilities in dask-ml for working generically with dask or NumPy arrays,...

From pandas

Taking Matt's idea

> Are there benchmarks in Pandas that are appropriate to take?

Here's a bunch from some of https://github.com/pandas-dev/pandas/tree/master/asv_bench/benchmarks All of these at least run. I need to...

Benchmarks wishlist

What high-level areas do we want coverage in?

- [x] dask.order.order
- [ ] Anything from dask.core?
- [x] dask.optimization?
- [ ] task stealing?
- [ ] scheduler throughput...

Documentation

This is a sketch for some sections of documentation that should go in the README.

## What to test?

Ideally, benchmarks measure how long *our project* (dask, distributed) spends doing...