Run on Cubed

Open · tomwhite opened this issue 1 year ago · 2 comments

This is an umbrella issue to track the work needed to run sgkit on Cubed.

This is possible because Cubed exposes the Python array API standard as well as common Dask functions and methods such as map_blocks and Array.compute. There is also ongoing work to integrate Cubed in xarray, as part of exploring alternative parallel execution frameworks in xarray.

tomwhite, Sep 22 '22 14:09

I've managed to get some basic aggregation tests in test_aggregation.py passing with the changes here: https://github.com/tomwhite/sgkit/commit/83ff40011b1c985cfca086d3fdf70edb371b3689. This is not meant to be merged, as it's just a demonstration at the moment. Most of the changes are needed because the array API is stricter about types, so the code needs some explicit casts.
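To illustrate the kind of explicit cast this refers to, here is a minimal sketch. It is not taken from the commit above: the function name `called_fraction` and the input layout are hypothetical, and plain NumPy is used as a stand-in for an array API namespace so that it runs anywhere. The idea is that the array API standard does not guarantee summing boolean arrays or true division of integer arrays, so each dtype change is spelled out rather than left to implicit promotion.

```python
import numpy as np


def called_fraction(call_mask):
    """Fraction of non-missing calls per variant.

    call_mask is a boolean array of shape (variants, samples) that is
    True where a call is missing. (Hypothetical helper, for illustration.)
    """
    # Explicit bool -> int cast before summing, rather than relying on
    # the library to promote booleans implicitly.
    called = (~call_mask).astype(np.int64)
    n_called = called.sum(axis=1)

    # Explicit int -> float cast before dividing, since true division of
    # integer arrays is not guaranteed by the array API standard.
    n_samples = call_mask.shape[1]
    return n_called.astype(np.float64) / float(n_samples)


mask = np.array(
    [
        [False, True, False, False],
        [True, True, False, False],
    ]
)
print(called_fraction(mask))
```

With NumPy alone the casts are redundant, but writing them out is what lets the same logic run on a stricter backend. Note that in the array API standard itself `astype` is a function on the namespace rather than an array method, so a portable version would need that adjustment too.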

The sgkit changes also rely on some changes in xarray: https://github.com/pydata/xarray/pull/7067.

tomwhite, Sep 22 '22 14:09

Also, this example shows that Cubed works with Numba (locally at least), which answers @hammer's question here: https://github.com/pystatgen/sgkit/issues/885#issuecomment-1209288596.

tomwhite, Sep 22 '22 14:09