SpatialData object from remote .zarr store merges multiple tables during reading
Representation of the SpatialData object when read from a local Zarr store. This is a Visium HD dataset that I originally created with spatialdata_io.visium_hd, followed by some post-processing.
SpatialData object, with associated Zarr store: /<path>/11692b64-b34a-4dbe-adc9-784a87a7a856.zarr
├── Images
│ ├── 'spatialdata_hires_image': DataArray[cyx] (3, 4352, 6000)
│ └── 'spatialdata_lowres_image': DataArray[cyx] (3, 435, 600)
├── Shapes
│ └── 'spatialdata_square_008um': GeoDataFrame shape: (127839, 1) (2D shapes)
└── Tables
├── 'square_008um': AnnData (127839, 19059)
└── 'table': AnnData (127839, 19059)
with coordinate systems:
▸ 'downscaled_hires', with elements:
spatialdata_hires_image (Images), spatialdata_square_008um (Shapes)
▸ 'downscaled_lowres', with elements:
spatialdata_lowres_image (Images), spatialdata_square_008um (Shapes)
▸ 'global', with elements:
spatialdata_square_008um (Shapes)
Reproducible example
This is a public dataset, and the Zarr store should be downloadable.
import spatialdata as sd
rem_path = "https://devel.umgear.org/datasets/spatial/11692b64-b34a-4dbe-adc9-784a87a7a856.zarr"
sdata = sd.read_zarr(rem_path, selection=["images", "tables"])
print(sdata.tables)
Describe the bug
This dataset contains two separate tables. The first, "square_008um", was derived from the Visium HD dataset. The second, "table", is a copy kept for harmonization purposes, since we need to read data from different spatial platforms. When I read from a local Zarr store, sdata.tables shows two tables. When I read from a remote store, however, the "table" table appears to be flattened and merged into the "square_008um" table.
>>> sdata.tables
{'square_008um': AnnData object with n_obs × n_vars = 127839 × 19059
obs: 'in_tissue', 'array_row', 'array_col', 'location_id', 'region', 'clusters'
var: 'feature_types', 'genome', 'gene_symbol'
uns: 'platform', 'spatialdata_attrs'
obsm: 'spatial', 'table': AnnData object with n_obs × n_vars = 127839 × 19059
obs: 'in_tissue', 'array_row', 'array_col', 'location_id', 'region', 'clusters'
var: 'feature_types', 'genome', 'gene_symbol'
uns: 'platform', 'spatialdata_attrs'
obsm: 'spatial'}
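The dict repr above is hard to scan, so a per-table summary may make the comparison between the local and remote reads easier. The following is a minimal sketch, not part of spatialdata: `summarize_tables` and `missing_tables` are hypothetical helpers, and they assume only that `sdata.tables` behaves like a mapping from table name to an object with a `.shape` attribute.

```python
from collections.abc import Mapping
from typing import Any


def summarize_tables(tables: Mapping[str, Any]) -> list[tuple[str, tuple[int, ...]]]:
    """Return (name, shape) for every table, sorted by table name."""
    return [(name, tuple(tables[name].shape)) for name in sorted(tables)]


def missing_tables(local: Mapping[str, Any], remote: Mapping[str, Any]) -> set[str]:
    """Return table names present in the local read but absent from the remote read."""
    return set(local) - set(remote)


# Hypothetical usage (local_path is a placeholder for the local copy of the store):
# sdata_local = sd.read_zarr(local_path)
# sdata_remote = sd.read_zarr(rem_path, selection=["images", "tables"])
# for name, shape in summarize_tables(sdata_remote.tables):
#     print(name, shape)
# print(missing_tables(sdata_local.tables, sdata_remote.tables))
```

An empty set from `missing_tables` would indicate that both reads expose the same table names.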
This becomes a problem downstream when I run the following:
adata = to_legacy_anndata(self.sdata, include_images=False, coordinate_system="downscaled_hires", table_name="table")
With this, I get the following error:
Traceback (most recent call last):
File "/gEAR/lib/gear/spatialuploader.py", line 90, in _convert_sdata_to_adata
adata = to_legacy_anndata(self.sdata, include_images=include_images, coordinate_system=self.coordinate_system, table_name=table_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/spatialdata_io/converters/legacy_anndata.py", line 129, in to_legacy_anndata
assert table_name in sdata.tables, f"The table {table_name} is not present in the SpatialData object."
^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: The table table is not present in the SpatialData object.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/tornado/ioloop.py", line 750, in _run_callback
ret = callback()
^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/tornado/ioloop.py", line 774, in _discard_future_result
future.result()
File "/usr/local/lib/python3.12/site-packages/panel/io/server.py", line 140, in wrapped
state._handle_exception(e)
File "/usr/local/lib/python3.12/site-packages/panel/io/state.py", line 483, in _handle_exception
raise exception
File "/usr/local/lib/python3.12/site-packages/panel/io/server.py", line 138, in wrapped
return await func(*args, **kw)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/panel/param.py", line 865, in _eval_async
raise e
File "/usr/local/lib/python3.12/site-packages/panel/param.py", line 844, in _eval_async
async for new_obj in awaitable:
File "/usr/local/lib/python3.12/site-packages/panel/util/__init__.py", line 503, in to_async_gen
value = await asyncio.to_thread(safe_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/asyncio/threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/panel/util/__init__.py", line 498, in safe_next
return next(sync_gen)
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/param/reactive.py", line 572, in wrapped
for val in evaled:
^^^^^^
File "/gEAR/services/spatial/panel_app_remote.py", line 568, in init_data
adata_pkg = pn.state.as_cached(adata_cache_label, create_adata_pkg, ttl=86400)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/panel/io/state.py", line 545, in as_cached
ret, _ = self.cache[cache_key] = (fn(**kwargs), new_expiry)
^^^^^^^^^^^^
File "/gEAR/services/spatial/panel_app_remote.py", line 560, in create_adata_pkg
self.prep_adata()
File "/gEAR/services/spatial/panel_app_remote.py", line 612, in prep_adata
self.spatial_obj._convert_sdata_to_adata()
File "/gEAR/lib/gear/spatialuploader.py", line 435, in _convert_sdata_to_adata
return super()._convert_sdata_to_adata(include_images)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gEAR/lib/gear/spatialuploader.py", line 92, in _convert_sdata_to_adata
raise Exception("Error occurred while converting spatial data object to AnnData object: ", err)
Exception: ('Error occurred while converting spatial data object to AnnData object: ', AssertionError('The table table is not present in the SpatialData object.'))
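As a stopgap while the underlying read issue is investigated, the table name could be checked before conversion so that the failure reports which tables are actually available. This is only a sketch under assumptions: `resolve_table_name` is a hypothetical helper, not a spatialdata or spatialdata_io function, and it relies only on `sdata.tables` supporting `in` and iteration over names.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def resolve_table_name(
    tables: Mapping[str, Any],
    requested: str,
    fallbacks: Sequence[str] = (),
) -> str:
    """Return the first of `requested` and `fallbacks` that exists in `tables`."""
    candidates = [requested, *fallbacks]
    for name in candidates:
        if name in tables:
            return name
    # Nothing matched: list what is available to make the error actionable.
    available = ", ".join(sorted(tables)) or "<none>"
    raise KeyError(f"None of {candidates} found; available tables: {available}")


# Hypothetical usage before the conversion call shown above:
# table_name = resolve_table_name(sdata.tables, "table", fallbacks=["square_008um"])
# adata = to_legacy_anndata(sdata, include_images=False,
#                           coordinate_system="downscaled_hires", table_name=table_name)
```

Falling back to "square_008um" is safe here only because "table" is described as a copy of it; in general the two tables could differ.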
Expected behavior
Reading from a remote Zarr store should return the same separate tables as reading from a local store.
Desktop (optional):
- Tested on macOS Sequoia 15.3 as well as in a Dockerized Ubuntu:jammy image
Additional context
Relevant package versions are listed below. Let me know if you need me to dig deeper.
Python 3.12.7
spatialdata==0.3.0
spatialdata_io==0.1.6
pandas==2.2.1
anndata==0.10.6