spatialdata
spatialdata copied to clipboard
Adding `attrs` at the `SpatialData` object level
As described in issue #404, it is useful to have attributes at the SpatialData object level. This will also be useful in Sopa, to simplify the API.
Usage:
import spatialdata
from spatialdata.datasets import blobs
sdata = blobs()
sdata.attrs # empty dictionnary
sdata.attrs["test"] = 12
sdata.write("blobs.zarr") # this saves the attributes on disk
sdata = spatialdata.read_zarr("blobs.zarr") # the attributes are read normally
sdata.attrs["test"] # returns 12
What do you think @LucaMarconato?
Codecov Report
Attention: Patch coverage is 88.60759% with 9 lines in your changes missing coverage. Please review.
Project coverage is 91.78%. Comparing base (
2a0c23a) to head (4546bea). Report is 70 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #711 +/- ##
==========================================
- Coverage 91.80% 91.78% -0.03%
==========================================
Files 45 45
Lines 6959 7045 +86
==========================================
+ Hits 6389 6466 +77
- Misses 570 579 +9
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/spatialdata/_core/_deepcopy.py | 98.38% <100.00%> (+0.02%) |
:arrow_up: |
| src/spatialdata/_core/concatenate.py | 91.80% <100.00%> (+0.20%) |
:arrow_up: |
| src/spatialdata/_core/operations/_utils.py | 92.15% <100.00%> (ø) |
|
| src/spatialdata/_core/operations/rasterize.py | 90.45% <100.00%> (ø) |
|
| src/spatialdata/_core/operations/transform.py | 90.82% <100.00%> (ø) |
|
| src/spatialdata/_core/query/spatial_query.py | 95.50% <100.00%> (ø) |
|
| src/spatialdata/_io/io_zarr.py | 87.77% <75.00%> (-0.60%) |
:arrow_down: |
| src/spatialdata/_core/spatialdata.py | 90.87% <92.00%> (-0.01%) |
:arrow_down: |
| src/spatialdata/_io/format.py | 83.22% <71.42%> (-1.13%) |
:arrow_down: |
Thanks for the PR.
- [x] before merging we need to update the format version; we probably should add a format for the
SpatialDatacontainer; something asked also by several users (we already have a format for the single elements, which would be independent from the format of the container).
Hi @LucaMarconato, have you started working on your last comment? If not, do you want me to have a look (although I'm not exactly sure to understand what you want to do)?
I noticed we are missing a feature: after we save a SpatialData object, if we add a new attr, there is currently no way to save it on disk. Something like sdata.write_element for the attrs
Just a random note on this: I agree that having SpatialData - level attributes would be super helpful. Beyond that, I think it would be great to be able to annotate coordinate systems - these often correspond to samples, conditions, time-points etc. I can of course annotate all elements within a coordinate system through tables, but I think it would be more direct to somehow directly provide metadata for an entire coordinate system.
I also noticed that we want to pass the attrs inside certain functions, e.g. when performing a query (the attrs should be kept). I made a commit for this
The only concern is when we concatenate multiple SpatialData objects, how should we merge the duplicate keys (if any)?
NB: I updated the first post with a check-list
@quentinblampey thanks for the update. I haven't updated the format yet, we can work on it this week at the hackathon. Regarding saving to disk, I suggest to extend the write_metadata() method. Finally, regarding .attrs, we could proceed as AnnData does for .uns when calling ad.concat().
Extremely interested in this, thanks @quentinblampey !!
@quentinblampey I have finalized this PR, I'm going to merge it now. Thanks for your contribution!
Two comments.
- root-level versioning: On disk, I am now adding a version number for the root level metadata, and also I am adding the string of the spatialdata software version that was used to create the metadata (this helps reproducibility). When reading the
.zattrsfrom disk into theSpatialData.attrsslow I am not parsing the two version strings not to clutter the object. .attrspassed by reference: I have removed the codedict(sdata.attrs)and passing the attrs always by reference. The only exceptions are when I callspatialdata.deepcopy(), where I callcopy.deepcopy(), and when the user passes aMappingthat is not adict. In such case the casting to dict performs a shallow copy.