Discussion: how incremental addition/modification of elements and IO interact
This is to collect points to discuss in a major redesign of how IO interacts with the ability to extend an object by adding, removing, or modifying particular elements. In other words, we need to decide how the user can modify/extend a SpatialData object after its constructor is called, and how this relates to IO.
It evolves from previous discussions in this old issue https://github.com/scverse/spatialdata/issues/137 (and its related PR https://github.com/scverse/spatialdata/pull/138).
I'll just dump all the points and reformat later on.
- comments on `add_image()`, `add_labels()`: these functions write to disk immediately when called. We could decide to remove these functions and let the user write to disk explicitly. But it would be cool to know what is to be written to disk, and what not.
- the functions also check whether the name of the image being added is already present in `sdata.images`, or whether it is unique across all elements. This check is important, but currently it can be bypassed by simply doing `sdata.images['my_image'] = image`. We should decide on one of the following:
  - we make `.images` private and we remove `sdata.add_image('my_image', image)`. The user will no longer use `sdata.images['my_image']`, leaving only the option to do `sdata['my_image'] = image`, which internally uses `get_schema()` to know what `image` is.
  - we replace `.images` with an accessor class for which we override `__setitem__()` and `__getitem__()`, and we make `__setitem__()` call `add_image()` internally.
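To make the accessor option more concrete, here is a minimal sketch. It is not the spatialdata implementation: `Elements` and `ToySpatialData` are hypothetical names, and the only behaviour shown is that plain item assignment goes through the same name-uniqueness check across all element types.

```python
from collections import UserDict


class Elements(UserDict):
    """Hypothetical accessor standing in for `.images` / `.labels`."""

    def __init__(self, shared_names):
        # Set of names shared by all accessors of one parent object,
        # so uniqueness is enforced across element types.
        self._shared_names = shared_names
        super().__init__()

    def __setitem__(self, name, element):
        # Assignment can no longer bypass the check: it happens here.
        if name in self._shared_names:
            raise ValueError(f"An element named {name!r} already exists")
        self._shared_names.add(name)
        self.data[name] = element

    def __delitem__(self, name):
        # Removing an element frees its name again (in memory only).
        del self.data[name]
        self._shared_names.discard(name)


class ToySpatialData:
    """Hypothetical container; the real class has more element types."""

    def __init__(self):
        names = set()
        self.images = Elements(names)
        self.labels = Elements(names)


sdata = ToySpatialData()
sdata.images["my_image"] = "image data"
try:
    sdata.labels["my_image"] = "label data"
except ValueError as err:
    print(err)
```

In this sketch nothing is written to disk on assignment; whether `__setitem__()` should also trigger IO is exactly the question raised above.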
We also need functions to remove elements, not just in-memory but also on disk.
In the end we went for the accessor option, and we separated adding new elements in memory from writing them to disk.
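As a rough illustration of that separation (a toy sketch, not the spatialdata API), the class below keeps elements in memory until the user writes them explicitly, can report what has not been saved yet, and can delete a single element from disk. `ToyStore`, its method names, and the one-JSON-file-per-element layout are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


class ToyStore:
    """Hypothetical container: adding is in-memory, writing is explicit."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.elements = {}  # name -> JSON-serializable element

    def _path(self, name):
        return self.root / f"{name}.json"

    def on_disk(self):
        # Names of elements that currently have a file on disk.
        return {p.stem for p in self.root.glob("*.json")}

    def unsaved(self):
        # Elements that exist in memory but have not been written.
        return set(self.elements) - self.on_disk()

    def write_element(self, name, overwrite=False):
        # Writes one element only; other files are left untouched.
        path = self._path(name)
        if path.exists() and not overwrite:
            raise FileExistsError(f"{name!r} is already on disk")
        path.write_text(json.dumps(self.elements[name]))

    def delete_element_from_disk(self, name):
        # Removes the file but keeps the in-memory element.
        self._path(name).unlink()


with tempfile.TemporaryDirectory() as tmp:
    store = ToyStore(tmp)
    store.elements["my_image"] = [[0, 1], [1, 0]]  # in memory only
    print(store.unsaved())
    store.write_element("my_image")  # explicit write
    print(store.unsaved())
```

Because each element maps to its own file in this sketch, writing a new element never rewrites existing on-disk data.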
Semi-related, and might be of interest:
- https://github.com/scverse/anndata/issues/1403
Basically, making it easy to cache new elements to disk without modifying any existing on-disk data.
Thanks for sharing, very interesting!