cubed issues

Document optimization changes

Update the documentation to reflect the changes due to the work in #339. This should include end user docs on how to configure/control optimization, as well as more developer-focused design...

tomwhite

documentation

optimization

Jax integration

10

Can the core array API ops of cubed be implemented in jax, such that everything easily compiles to accelerators? Could this solve the common pain point of running out of GPU...

alxmrs

Profile Cubed task memory usage with Memray

1

I've been using [Memray](https://bloomberg.github.io/memray/) to profile the memory usage of Cubed tasks running locally using a local Lithops executor (since it runs tasks in a separate process). This gives great...

tomwhite

runtime

memory

[pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/pycqa/isort: 5.12.0 → 5.13.2](https://github.com/pycqa/isort/compare/5.12.0...5.13.2)
- [github.com/psf/black: 23.10.1 → 24.1.1](https://github.com/psf/black/compare/23.10.1...24.1.1)
- [github.com/pycqa/flake8: 6.1.0 → 7.0.0](https://github.com/pycqa/flake8/compare/6.1.0...7.0.0)

pre-commit-ci[bot]

Query optimization

3

There are a couple of possibilities for doing query optimization that have come up recently. [Dask-expr](https://github.com/dask-contrib/dask-expr) will support arrays soon (https://github.com/dask-contrib/dask-expr/issues/446). It would be interesting to see if the expression...

tomwhite

help wanted

array api

optimization

Estimate monetary cost of executing plan

4

Cubed arguably has enough information to give a rough estimate of the monetary cost of executing the plan before starting execution. I'm imagining a new method `.estimate_cost(executor)` that is similar...

TomNicholas

enhancement
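To make the cost-estimation idea concrete, here is a minimal sketch of the arithmetic such an estimate might involve. It is not Cubed's API and not the proposed `.estimate_cost(executor)` method: the `TaskGroup` class, the `estimate_cost` function, and the price per GB-second are all hypothetical names and inputs chosen for illustration. It assumes a serverless billing model where cost scales with memory multiplied by run time.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TaskGroup:
    """Hypothetical summary of one operation in a plan (not a Cubed class)."""
    num_tasks: int           # number of tasks the operation runs
    seconds_per_task: float  # assumed average task duration
    memory_gb: float         # memory allocated to each task


def estimate_cost(groups: Iterable[TaskGroup], price_per_gb_second: float) -> float:
    """Rough cost: total GB-seconds across all tasks times a unit price."""
    gb_seconds = sum(
        g.num_tasks * g.seconds_per_task * g.memory_gb for g in groups
    )
    return gb_seconds * price_per_gb_second


# Illustrative numbers only; real durations and prices depend on the executor.
plan = [TaskGroup(100, 2.0, 2.0), TaskGroup(10, 5.0, 1.0)]
cost = estimate_cost(plan, price_per_gb_second=0.00002)
print(f"Estimated cost: {cost:.4f}")
```

A real estimate would also need to account for storage requests and data transfer, which this sketch leaves out.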

Low-latency storage backends

24

So far we've only used cloud storage (S3 and GCS) for storing intermediate Zarr data in Cubed. It would be interesting to try other storage backends that have lower latency....

tomwhite

runtime

zarr

storage

Move main array namespace

4

Currently all the array functions are in `cubed.array_api`. This was created to follow the naming pattern for the new array API in `numpy.array_api`. Since then, however, there has been a...

tomwhite

array api

Optimizing the shuffle

3

Cubed currently always implements the shuffle operation as an all-to-all rechunking using the [algorithm from rechunker](https://rechunker.readthedocs.io/en/latest/algorithm.html). This creates an intermediate persistent Zarr store, and requires all chunks to be written...

TomNicholas

help wanted

primitive

optimization

Lithops version mismatch

3

I tried to re-run the quadratic means example with recent improvements to Cubed but got stuck on a Lithops version mismatch error: `Exception: Lithops version mismatch. Host version 2.9.0...`

TomNicholas
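When debugging a mismatch like this, a quick first step is to confirm which Lithops version is installed on the host. The sketch below uses only the standard library. The helper names `installed_version` and `versions_match` are hypothetical, and the runtime version is assumed to be known from elsewhere (for example, from how the runtime image was built), since the excerpt above does not show it.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def versions_match(host_version: Optional[str], runtime_version: str) -> bool:
    """True only when the host has the package and the versions are identical."""
    return host_version is not None and host_version == runtime_version


host = installed_version("lithops")
print(f"Host Lithops version: {host}")
```

If the two versions differ, the usual remedy is to align them, either by installing the matching version on the host or by rebuilding the runtime.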
