
Rename SpatialData elements

Open Fritze opened this issue 9 months ago • 2 comments

Is your feature request related to a problem? Please describe.

The spatialdata_io.merscope reader has two arguments, 'region_name' and 'slide_name'. If set to None, the directory from the file path is used instead. If set to a string, elements are always named in the order slide_region_element. This can make element names longer and more complicated than needed.

Describe the solution you'd like

A renaming function that works on individual elements within the SpatialData object would be great, e.g. spatialdata.rename(elements={"slide_region_transcripts": "just_transcripts"})

Describe alternatives you've considered

One can rebuild a SpatialData object from scratch by copying from the old one and fixing the names along the way (described here). Ideally, though, renaming would not require copying the data.
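To make the "rename without copying" idea concrete, here is a minimal sketch of the behaviour I have in mind. It operates on a plain dict standing in for a container of elements; `rename_keys` is a hypothetical helper, not an existing spatialdata function, and whether the real element containers can be manipulated this way is an assumption.

```python
from collections.abc import MutableMapping
from typing import Any


def rename_keys(container: MutableMapping[str, Any], mapping: dict[str, str]) -> None:
    """Rename keys of `container` in place (hypothetical helper).

    The values are moved to their new keys, never copied.
    """
    # Validate everything first so a failure leaves the container untouched.
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("two elements would end up with the same name")
    for old, new in mapping.items():
        if old not in container:
            raise KeyError(f"no element named {old!r}")
        if new in container and new not in mapping:
            raise ValueError(f"an element named {new!r} already exists")

    # Pop all old names before inserting the new ones, so swaps also work.
    moved = {new: container.pop(old) for old, new in mapping.items()}
    container.update(moved)


# Plain dict used as a stand-in for a collection of elements.
transcripts = object()
elements = {"slide_region_transcripts": transcripts}
rename_keys(elements, {"slide_region_transcripts": "just_transcripts"})
assert elements["just_transcripts"] is transcripts  # same object, no copy
```

A real implementation would presumably also need to update anything that refers to elements by name, such as table annotations, which this sketch ignores.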

Fritze, Mar 18 '25 13:03

+1, this would be a useful function.

sophiamaedler, Mar 23 '25 15:03

I would also appreciate being able to rename tables easily. When running sd.concatenate([sdata1, sdata2]), you may run into a naming issue like this:

SpatialData object
├── Images
│     ├── '20004_cytassist_image': DataArray[cyx] (3, 2997, 3200)
│     ├── '20004_full_image': DataTree[cyx] (3, 19007, 71119), (3, 9503, 35559), (3, 4751, 17779), (3, 2375, 8889), (3, 1187, 4444)
│     ├── '20004_hires_image': DataArray[cyx] (3, 1604, 6000)
│     ├── '20004_lowres_image': DataArray[cyx] (3, 161, 600)
│     ├── '40001_cytassist_image': DataArray[cyx] (3, 3000, 3200)
│     ├── '40001_full_image': DataTree[cyx] (3, 16755, 45808), (3, 8377, 22904), (3, 4188, 11452), (3, 2094, 5726), (3, 1047, 2863)
│     ├── '40001_hires_image': DataArray[cyx] (3, 2195, 6000)
│     └── '40001_lowres_image': DataArray[cyx] (3, 220, 600)
├── Shapes
│     ├── '20004_square_002um': GeoDataFrame shape: (3399493, 1) (2D shapes)
│     ├── '20004_square_008um': GeoDataFrame shape: (433722, 1) (2D shapes)
│     ├── '20004_square_016um': GeoDataFrame shape: (151013, 1) (2D shapes)
│     ├── '40001_square_002um': GeoDataFrame shape: (3813253, 1) (2D shapes)
│     ├── '40001_square_008um': GeoDataFrame shape: (467052, 1) (2D shapes)
│     └── '40001_square_016um': GeoDataFrame shape: (160661, 1) (2D shapes)
└── Tables
      ├── 'square_002um_0': AnnData (3399493, 32285)
      ├── 'square_002um_1': AnnData (3813253, 32285)
      ├── 'square_008um_0': AnnData (433722, 32285)
      ├── 'square_008um_1': AnnData (467052, 32285)
      ├── 'square_016um_0': AnnData (151013, 32285)
      └── 'square_016um_1': AnnData (160661, 32285)

In my projects, I use a naming scheme in which the first five digits identify a particular sample (mouse or human subject ID). These are then used downstream for subsetting the sdata object, e.g. selecting a particular table with f"{selected_sample}_square_{bin_size}um". This lookup breaks because an incrementing integer is appended to the name of each table. It would be excellent if we could rename or transform these elements.

I've been hacking my way around this, but it would be a nice feature to add.
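For illustration, this is roughly the mapping I would want to apply. It is only a sketch: `build_rename_map` is a hypothetical helper, and it assumes the trailing integer is the position of the source object in the list passed to concatenate. That holds in the output above (the `_0` tables match the 20004 shapes in row count and the `_1` tables match 40001), but I have not verified that it is guaranteed.

```python
import re

# A table name followed by "_<integer>", e.g. "square_002um_0".
_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<index>\d+)$")


def build_rename_map(table_names, sample_ids):
    """Map suffixed table names to sample-prefixed names (hypothetical helper).

    Assumes the numeric suffix indexes into `sample_ids`, i.e. the order
    in which the objects were concatenated.
    """
    rename_map = {}
    for name in table_names:
        match = _SUFFIX.match(name)
        if match is None:
            continue  # no numeric suffix, leave the name alone
        index = int(match.group("index"))
        if index >= len(sample_ids):
            continue  # suffix does not correspond to a known sample
        rename_map[name] = f"{sample_ids[index]}_{match.group('base')}"
    return rename_map


tables = ["square_002um_0", "square_002um_1", "square_008um_0", "square_008um_1"]
rename_map = build_rename_map(tables, ["20004", "40001"])
assert rename_map["square_002um_0"] == "20004_square_002um"
assert rename_map["square_008um_1"] == "40001_square_008um"
```

With names in that form, the f"{selected_sample}_square_{bin_size}um" lookup would work again, provided the tables can actually be renamed.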

pranavmishra90, Sep 02 '25 17:09