
Bounded-memory serverless distributed N-dimensional array processing

Results 133 cubed issues

I noticed you're using `concatenate2` from dask. I've found this to be quite wasteful: it's cool in its recursive ability, but it repeatedly allocates new memory. Cubed...
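To make the allocation concern concrete, here is a minimal sketch of the alternative: allocate the output once and copy each block into place. It uses only NumPy and handles a 2-D grid of blocks. The helper name `concatenate_blocks_2d` is hypothetical, and this is neither dask's `concatenate2` nor anything in Cubed.

```python
import numpy as np


def concatenate_blocks_2d(blocks):
    """Assemble a 2-D grid of blocks (a list of rows of arrays) using a single allocation.

    Hypothetical helper for illustration. Assumes every block in a row has the
    same height, every block in a column has the same width, and all blocks
    share the dtype of the first block.
    """
    row_heights = [row[0].shape[0] for row in blocks]
    col_widths = [block.shape[1] for block in blocks[0]]

    # One allocation for the whole result, instead of one per recursion level.
    out = np.empty((sum(row_heights), sum(col_widths)), dtype=blocks[0][0].dtype)

    r = 0
    for row, h in zip(blocks, row_heights):
        c = 0
        for block, w in zip(row, col_widths):
            out[r:r + h, c:c + w] = block  # copy the block into its slot
            c += w
        r += h
    return out


blocks = [
    [np.ones((2, 3)), np.zeros((2, 1))],
    [np.full((1, 3), 2.0), np.full((1, 1), 3.0)],
]
result = concatenate_blocks_2d(blocks)
```

For this input the result matches `np.block(blocks)`. The difference is that the output buffer is created exactly once.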

On some systems there are scenarios where we know we can perform a rechunk entirely in-memory without writing to disk. For example, if I locally run this test which performs...

enhancement
memory
optimization
storage

It would be awesome if the backing array implementation supported automatic differentiation, so that we could access some `grad` method from Cubed. It looks like a bunch of stakeholder libraries have...

This is an umbrella issue for tracking the work for making Cubed work better on a single machine. ## Processes executor Improvements to the `processes` executor - [x] #507 -...

runtime
single-machine

I suspect that the performance of the ThreadPoolExecutor would substantially increase if we strategically placed Cython `with nogil` calls. - https://cython.readthedocs.io/en/latest/src/userguide/parallelism.html - https://stackoverflow.com/questions/49047255/cython-nogil-with-threadpoolexecutor-not-giving-speedups - https://stackoverflow.com/questions/56537989/usage-of-threadpoolexecutor-in-conjunction-with-cythons-nogil There are drawbacks to process...

@TomNicholas mentioned that people at SciPy asked how to apply an arbitrary function to arrays in Cubed. An example would help users get started with this. It should cover how...

examples

Tile based operations have been quite a success for creating optimal GPU kernels. The programming model, in my understanding, offers flexibility while taking advantage of cache hierarchies. http://www.eecs.harvard.edu/~htk/publication/2019-mapl-tillet-kung-cox.pdf The [triton...

While I'm not familiar with the Philox pseudo-random number generator (PRNG) in NumPy (it does look well suited to generation in a distributed setting), I think adopting a stateless PRNG...
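One way to picture stateless generation with NumPy's Philox is to derive each block's stream purely from a global seed and the block's index, so that no generator state is shared between tasks. The sketch below does this with `np.random.SeedSequence` and its `spawn_key` argument. The function name `block_random` and the choice of a flat integer block index are assumptions for illustration, not how Cubed implements random arrays.

```python
import numpy as np


def block_random(seed, block_id, shape):
    """Return random values for one block, determined only by (seed, block_id).

    Hypothetical helper: no generator state is carried between calls, so
    blocks can be generated in any order, on any worker, with identical results.
    """
    # spawn_key makes the per-block stream a pure function of seed and block_id.
    seed_seq = np.random.SeedSequence(seed, spawn_key=(block_id,))
    rng = np.random.Generator(np.random.Philox(seed_seq))
    return rng.random(shape)


a = block_random(42, 0, (2, 3))
b = block_random(42, 1, (2, 3))
a_again = block_random(42, 0, (2, 3))  # regenerating block 0 reproduces it
```

Because each block depends only on its own key, a failed task can be retried without coordinating with the tasks that ran before it.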

Could Spark be added as a supported executor? Maybe RDD.map or RDD.mapPartitions would be the correct way to map a function similar to [`map_unordered`](https://github.com/cubed-dev/cubed/blob/main/cubed/runtime/executors/lithops.py#L190) in the Lithops executor. https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.RDD.mapPartitions.html#pyspark.RDD.mapPartitions To...

runtime

When running [some benchmarks](https://github.com/cubed-dev/cubed/issues/492#issuecomment-2238908343) recently I noticed that turning off Zarr compression resulted in faster IO performance when writing random data to Zarr files on a local SSD. This [post](https://medium.com/@lubonjaariel/to-compress-or-not-to-compress-a-zarr-question-812160b3777d)...

zarr
storage
single-machine