Spatialdata object from remote .zarr store merges multiple tables during reading

Open adkinsrs opened this issue 11 months ago • 0 comments

Representation of the SpatialData object when read from a local store. This is a Visium HD dataset that I originally created with spatialdata_io.visium_hd, followed by some post-processing.

SpatialData object, with associated Zarr store: /<path>/11692b64-b34a-4dbe-adc9-784a87a7a856.zarr
├── Images
│     ├── 'spatialdata_hires_image': DataArray[cyx] (3, 4352, 6000)
│     └── 'spatialdata_lowres_image': DataArray[cyx] (3, 435, 600)
├── Shapes
│     └── 'spatialdata_square_008um': GeoDataFrame shape: (127839, 1) (2D shapes)
└── Tables
      ├── 'square_008um': AnnData (127839, 19059)
      └── 'table': AnnData (127839, 19059)
with coordinate systems:
    ▸ 'downscaled_hires', with elements:
        spatialdata_hires_image (Images), spatialdata_square_008um (Shapes)
    ▸ 'downscaled_lowres', with elements:
        spatialdata_lowres_image (Images), spatialdata_square_008um (Shapes)
    ▸ 'global', with elements:
        spatialdata_square_008um (Shapes)

Reproducible example

This is a public dataset, and the data store should be downloadable.

import spatialdata as sd
rem_path = "https://devel.umgear.org/datasets/spatial/11692b64-b34a-4dbe-adc9-784a87a7a856.zarr"
sdata = sd.read_zarr(rem_path, selection=["images", "tables"])
print(sdata.tables)

Describe the bug

In this dataset I have two separate tables. The first, "square_008um", was derived from the Visium HD dataset. The second, "table", is a copy kept for harmonization purposes, since we need to read data from different spatial platforms. When I read from a local Zarr store, sdata.tables shows two tables. However, when I read from a remote store, the "table" table appears to be flattened and merged into the "square_008um" table.

>>> sdata.tables
{'square_008um': AnnData object with n_obs × n_vars = 127839 × 19059
    obs: 'in_tissue', 'array_row', 'array_col', 'location_id', 'region', 'clusters'
    var: 'feature_types', 'genome', 'gene_symbol'
    uns: 'platform', 'spatialdata_attrs'
    obsm: 'spatial', 'table': AnnData object with n_obs × n_vars = 127839 × 19059
    obs: 'in_tissue', 'array_row', 'array_col', 'location_id', 'region', 'clusters'
    var: 'feature_types', 'genome', 'gene_symbol'
    uns: 'platform', 'spatialdata_attrs'
    obsm: 'spatial'}
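Because the printed repr nests one AnnData summary inside the dict output, it can help to list the table names and shapes explicitly and compare the local and remote reads side by side. The following is a minimal sketch under stated assumptions: `table_shapes` and `diff_table_names` are hypothetical helpers (not part of spatialdata), they only assume a dict-like mapping whose values expose `.shape` as AnnData objects do, and the stand-in objects below are placeholders for the real tables.

```python
from collections.abc import Mapping
from types import SimpleNamespace


def table_shapes(tables: Mapping) -> dict:
    """Map each table name to its (n_obs, n_vars) shape."""
    return {name: tuple(table.shape) for name, table in tables.items()}


def diff_table_names(local: Mapping, remote: Mapping) -> dict:
    """Report table names that appear in only one of the two reads."""
    local_names, remote_names = set(local), set(remote)
    return {
        "missing_in_remote": sorted(local_names - remote_names),
        "extra_in_remote": sorted(remote_names - local_names),
    }


# Stand-ins for AnnData objects; real use would pass sdata.tables instead.
fake = SimpleNamespace(shape=(127839, 19059))
local_tables = {"square_008um": fake, "table": fake}
# Hypothetical: what the mapping would look like if "table" were lost.
remote_tables = {"square_008um": fake}

print(table_shapes(local_tables))
print(diff_table_names(local_tables, remote_tables))
```

With real data, the same calls would take `sdata.tables` from the local read and from the remote read in place of the stand-in dicts.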

This becomes a problem downstream, where I run the following:

adata = to_legacy_anndata(self.sdata, include_images=False, coordinate_system="downscaled_hires", table_name="table")

With this, I get the following error:

Traceback (most recent call last):
  File "/gEAR/lib/gear/spatialuploader.py", line 90, in _convert_sdata_to_adata
    adata = to_legacy_anndata(self.sdata, include_images=include_images, coordinate_system=self.coordinate_system, table_name=table_name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/spatialdata_io/converters/legacy_anndata.py", line 129, in to_legacy_anndata
    assert table_name in sdata.tables, f"The table {table_name} is not present in the SpatialData object."
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: The table table is not present in the SpatialData object.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/tornado/ioloop.py", line 750, in _run_callback
    ret = callback()
          ^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/tornado/ioloop.py", line 774, in _discard_future_result
    future.result()
  File "/usr/local/lib/python3.12/site-packages/panel/io/server.py", line 140, in wrapped
    state._handle_exception(e)
  File "/usr/local/lib/python3.12/site-packages/panel/io/state.py", line 483, in _handle_exception
    raise exception
  File "/usr/local/lib/python3.12/site-packages/panel/io/server.py", line 138, in wrapped
    return await func(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/panel/param.py", line 865, in _eval_async
    raise e
  File "/usr/local/lib/python3.12/site-packages/panel/param.py", line 844, in _eval_async
    async for new_obj in awaitable:
  File "/usr/local/lib/python3.12/site-packages/panel/util/__init__.py", line 503, in to_async_gen
    value = await asyncio.to_thread(safe_next)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/panel/util/__init__.py", line 498, in safe_next
    return next(sync_gen)
           ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/param/reactive.py", line 572, in wrapped
    for val in evaled:
               ^^^^^^
  File "/gEAR/services/spatial/panel_app_remote.py", line 568, in init_data
    adata_pkg = pn.state.as_cached(adata_cache_label, create_adata_pkg, ttl=86400)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/panel/io/state.py", line 545, in as_cached
    ret, _ = self.cache[cache_key] = (fn(**kwargs), new_expiry)
                                      ^^^^^^^^^^^^
  File "/gEAR/services/spatial/panel_app_remote.py", line 560, in create_adata_pkg
    self.prep_adata()
  File "/gEAR/services/spatial/panel_app_remote.py", line 612, in prep_adata
    self.spatial_obj._convert_sdata_to_adata()
  File "/gEAR/lib/gear/spatialuploader.py", line 435, in _convert_sdata_to_adata
    return super()._convert_sdata_to_adata(include_images)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/gEAR/lib/gear/spatialuploader.py", line 92, in _convert_sdata_to_adata
    raise Exception("Error occurred while converting spatial data object to AnnData object: ", err)
Exception: ('Error occurred while converting spatial data object to AnnData object: ', AssertionError('The table table is not present in the SpatialData object.'))
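The assertion message above does not say which tables are actually present, which makes this kind of failure harder to diagnose. As a rough sketch, a lookup run before the conversion could report the available names instead. `require_table` is a hypothetical helper, not part of spatialdata or spatialdata_io, and it assumes only a dict-like tables mapping.

```python
from collections.abc import Mapping


def require_table(tables: Mapping, table_name: str):
    """Return tables[table_name], or raise a KeyError listing what is available."""
    if table_name not in tables:
        available = ", ".join(repr(name) for name in tables) or "<none>"
        raise KeyError(
            f"Table {table_name!r} not found; available tables: {available}"
        )
    return tables[table_name]
```

For example, calling `require_table(sdata.tables, "table")` before the conversion would surface the names that were actually read from the store.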

Expected behavior

The SpatialData object processes multiple tables correctly when read from a remote store, as it does when read from a local store.

Desktop (optional):

  • Tested on macOS Sequoia 15.3 as well as in a Dockerized ubuntu:jammy image

Additional context

Relevant package versions are listed below. If you need me to dig deeper, let me know.

Python 3.12.7

spatialdata==0.3.0
spatialdata_io==0.1.6
pandas==2.2.1
anndata==0.10.6

adkinsrs · Feb 14 '25 15:02