spatialdata

Sparse Array in Labels?

Open Mr-Milk opened this issue 1 year ago • 2 comments

I wonder if it's possible to support sparse arrays to some degree in Labels. If the element is a mask, a sparse format could save quite a lot of disk space.

Sparse array status in xarray: pydata/xarray#3213
Documentation: https://docs.xarray.dev/en/latest/user-guide/duckarrays.html
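To make the space argument concrete, here is a minimal sketch in plain Python (standard library only) that stores a mostly empty label mask in coordinate (COO) form and counts how many numbers each representation has to keep. The mask size, the object position, and the helper names `dense_to_coo` and `coo_to_dense` are hypothetical and chosen only for illustration; this is not how spatialdata or xarray store labels. It also counts in-memory values only: chunk compression on disk may already shrink long runs of zeros, so real savings would need to be measured.

```python
from array import array


def dense_to_coo(mask):
    """Return (rows, cols, values) for the nonzero entries of a 2D label mask."""
    rows, cols, vals = array("I"), array("I"), array("I")
    for r, line in enumerate(mask):
        for c, value in enumerate(line):
            if value != 0:
                rows.append(r)
                cols.append(c)
                vals.append(value)
    return rows, cols, vals


def coo_to_dense(shape, rows, cols, vals):
    """Rebuild the dense 2D mask (list of lists) from its COO triplets."""
    height, width = shape
    mask = [[0] * width for _ in range(height)]
    for r, c, value in zip(rows, cols, vals):
        mask[r][c] = value
    return mask


# Hypothetical 1000 x 1000 mask containing a single 10 x 10 object with label 1.
HEIGHT, WIDTH = 1000, 1000
mask = [[0] * WIDTH for _ in range(HEIGHT)]
for r in range(100, 110):
    for c in range(200, 210):
        mask[r][c] = 1

rows, cols, vals = dense_to_coo(mask)

dense_values = HEIGHT * WIDTH   # one number per pixel
sparse_values = 3 * len(vals)   # row, column and label per nonzero pixel

print(f"dense: {dense_values} values, sparse (COO): {sparse_values} values")
```

The COO form only wins while the mask is mostly background: each labelled pixel costs three numbers instead of one, so the advantage disappears once roughly a third of the pixels are labelled.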

Mr-Milk, Aug 27 '24 12:08

Thank you for opening the discussion on sparse arrays for labels. Currently there is no plan to support this, mainly because sparse arrays are not part of the released OME-NGFF specification (https://ngff.openmicroscopy.org/latest/), nor of the open enhancement proposals (https://github.com/ome/ngff/issues?q=sort%3Aupdated-desc+is%3Aissue+is%3Aopen).

As an alternative that is already available today, I suggest converting the labels to collections of multipolygons (a "shapes" object, represented as a geopandas.GeoDataFrame). You can use the functions to_polygons() and rasterize() from spatialdata for the conversions.

For a long-term approach, I kindly ask you to also open the discussion in the ngff repo (same link as above).

I hope this helps!

LucaMarconato, Aug 27 '24 14:08

May I ask why spatialdata needs to follow the OME-NGFF specification?

Mr-Milk, Aug 29 '24 12:08

We believe that standardizing file formats will, in the long term, make it easier to enable interoperable workflows. Vendors would also have an incentive to produce data directly in a standard format.

Currently, we are still not 100% NGFF compliant because we needed some additional features, but we are working on either contributing some of our ideas to NGFF or clearly defining, in a specification document, how we complement features missing from NGFF using external technologies (e.g. GeoParquet).

Back to your question. I think that having sparse arrays in labels would benefit not only the spatial omics community but also the bioimaging community, so I would definitely consider starting the discussion within NGFF, where it can have more visibility.

LucaMarconato, Oct 01 '24 12:10