Deepak Cherian

http://www.cherian.net

@earth-mover Golden

Results 1084 comments of Deepak Cherian

einsum will increase the dimension of tensor when we split the legs which will be contracted, that may cause out of memory

@jrbourbeau this kind of thing is what might cause regridding blowups xref https://github.com/dask/dask/issues/2225 cc @maxrjones

Add histogram method

> Absolute speed of xhistogram appears to be 3-4x higher, and that's using numpy_groupies in flox. Possibly flox could be faster if using numba but not sure yet.

Nah, in...

Add histogram method

This could basically be something like
```
ds.notnull().groupby(x=BinGrouper(...), y=BinGrouper(...), enso_phase=UniqueGrouper(...)).sum()
# TODO: handle `density`
```
We'll need https://github.com/pydata/xarray/pull/9522 + some skipping of `_ensure_1d` in `GroupBy.__init__` to handle the case of...
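The idea in the comment above, that a histogram is a count of non-null values grouped by bin, can be sketched with NumPy alone. This is only an analogy under stated assumptions: it does not use xarray or its groupers, and the names `values`, `edges`, and `labels` are made up for illustration.

```python
import numpy as np

# Hypothetical 1-D data with some missing values.
rng = np.random.default_rng(0)
values = rng.normal(size=1_000)
values[::50] = np.nan
edges = np.linspace(-3.0, 3.0, 7)  # 6 bins
n_bins = len(edges) - 1

# Step 1: mark the non-null entries (the role played by notnull()).
notnull = ~np.isnan(values)

# Step 2: assign each value a bin label (the role played by a bin grouper).
labels = np.digitize(values, edges) - 1
in_range = notnull & (values >= edges[0]) & (values < edges[-1])

# Step 3: count the non-null entries in each group.
counts = np.bincount(labels[in_range], minlength=n_bins)

# Cross-check against np.histogram. Note that np.histogram closes the last
# bin on the right, which only matters if a value equals edges[-1] exactly.
expected, _ = np.histogram(values[notnull], bins=edges)
```

Grouping by several variables at once, as in the comment, would amount to combining one label array per variable into a joint label before counting.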

`rolling(...).construct(...)` blows up chunk size

This is using the `sliding_window_view` trick under the hood, which composes badly with anything that does a memory copy (like `weighted` in your example) https://github.com/dask/dask/blob/d45ea380eb55feac74e8146e8ff7c6261e93b9d7/dask/array/overlap.py#L808 We actually use this approach...
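A minimal NumPy sketch of why a sliding-window view composes badly with copies. The array size and window length are arbitrary choices for illustration, and this shows the plain NumPy function rather than the dask code linked above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(1_000, dtype=np.float64)
window = 100

# The windowed array is a strided view onto x: no data is duplicated yet.
windows = sliding_window_view(x, window)  # shape (901, 100)
is_view = np.shares_memory(windows, x)

# Any operation that produces a new array materializes every window
# separately, so the result is roughly `window` times larger than x.
copied = windows * 1.0
blowup = copied.nbytes / x.nbytes

# A reduction over the window axis returns only one value per window.
rolling_mean = windows.mean(axis=-1)  # shape (901,)
```

Here `blowup` comes out at about 90, since 901 windows of 100 elements each are stored where the input held only 1000 elements.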

`rolling(...).construct(...)` blows up chunk size

I support the approach, but it'd be good to see the impact on `ds.rolling().mean()` which also uses `construct` but is clever about it to avoid the memory blowup.

`rolling(...).construct(...)` blows up chunk size

Yes, https://github.com/pydata/xarray/issues/3937, but we've struggled to move on that. `construct` is a pretty useful escape hatch for custom workloads, so we should optimize for it behaving sanely.

Using the shuffle primitive in Xarray

> new API that would simply be a combination of shuffle and other, existing methods. the equivalent would be a little involved: ``` shuffled = ds.shuffle(grouper) mapped = xr.map_blocks(lambda x:...

Comprehensive benchmarking suite

Thanks @scottyhq

> One other thing that often gets neglected in test suites is operating on remote data.

This is lining up with the "pangeo integration tests" that came up...

Comprehensive benchmarking suite

Looks like Quansight thinks that GitHub Actions is a good place to benchmark scikit-learn: https://labs.quansight.org/blog/2021/08/github-actions-benchmarks/ so maybe we can set that up for our existing benchmarks. Here's the workflow:...

Comprehensive benchmarking suite

@TomAugspurger are you still in charge of the pydata benchmarking machine? If so, could you add xarray to the list please (https://pandas.pydata.org/speed/)? @Illviljan has made major improvements so it should...
