
concatenate tables or not

Open wangjiawen2013 opened this issue 1 year ago • 1 comments

Hi, this is a Visium HD object from the spatialdata documentation. It contains three tables:

SpatialData object with:
├── Images
│     ├── 'Visium_HD_Mouse_Small_Intestine_cytassist_image': SpatialImage[cyx] (3, 3000, 3200)
│     ├── 'Visium_HD_Mouse_Small_Intestine_full_image': MultiscaleSpatialImage[cyx] (3, 21943, 23618), (3, 10971, 11809), (3, 5485, 5904), (3, 2742, 2952), (3, 1371, 1476)
│     ├── 'Visium_HD_Mouse_Small_Intestine_hires_image': SpatialImage[cyx] (3, 5575, 6000)
│     └── 'Visium_HD_Mouse_Small_Intestine_lowres_image': SpatialImage[cyx] (3, 558, 600)
├── Shapes
│     ├── 'Visium_HD_Mouse_Small_Intestine_square_002um': GeoDataFrame shape: (5479660, 2) (2D shapes)
│     ├── 'Visium_HD_Mouse_Small_Intestine_square_008um': GeoDataFrame shape: (351817, 2) (2D shapes)
│     └── 'Visium_HD_Mouse_Small_Intestine_square_016um': GeoDataFrame shape: (91033, 2) (2D shapes)
└── Tables
      ├── 'square_002um': AnnData (5479660, 19059)
      ├── 'square_008um': AnnData (351817, 19059)
      └── 'square_016um': AnnData (91033, 19059)
with coordinate systems:
▸ 'downscaled_hires', with elements:
        Visium_HD_Mouse_Small_Intestine_hires_image (Images), Visium_HD_Mouse_Small_Intestine_square_002um (Shapes), Visium_HD_Mouse_Small_Intestine_square_008um (Shapes), Visium_HD_Mouse_Small_Intestine_square_016um (Shapes)
▸ 'downscaled_lowres', with elements:
        Visium_HD_Mouse_Small_Intestine_lowres_image (Images), Visium_HD_Mouse_Small_Intestine_square_002um (Shapes), Visium_HD_Mouse_Small_Intestine_square_008um (Shapes), Visium_HD_Mouse_Small_Intestine_square_016um (Shapes)
▸ 'global', with elements:
        Visium_HD_Mouse_Small_Intestine_cytassist_image (Images), Visium_HD_Mouse_Small_Intestine_full_image (Images), Visium_HD_Mouse_Small_Intestine_square_002um (Shapes), Visium_HD_Mouse_Small_Intestine_square_008um (Shapes), Visium_HD_Mouse_Small_Intestine_square_016um (Shapes)

And this is a Xenium object from the spatialdata documentation. It contains only one table:

SpatialData object with:
├── Images
│     ├── 'he_image': MultiscaleSpatialImage[cyx] (3, 45087, 11580), (3, 22543, 5790), (3, 11271, 2895), (3, 5635, 1447), (3, 2817, 723)
│     └── 'morphology_focus': MultiscaleSpatialImage[cyx] (5, 17098, 51187), (5, 8549, 25593), (5, 4274, 12796), (5, 2137, 6398), (5, 1068, 3199)
├── Labels
│     ├── 'cell_labels': MultiscaleSpatialImage[yx] (17098, 51187), (8549, 25593), (4274, 12796), (2137, 6398), (1068, 3199)
│     └── 'nucleus_labels': MultiscaleSpatialImage[yx] (17098, 51187), (8549, 25593), (4274, 12796), (2137, 6398), (1068, 3199)
├── Points
│     └── 'transcripts': DataFrame with shape: (12165021, 11) (3D points)
├── Shapes
│     ├── 'cell_boundaries': GeoDataFrame shape: (162254, 1) (2D shapes)
│     ├── 'cell_circles': GeoDataFrame shape: (162254, 2) (2D shapes)
│     └── 'nucleus_boundaries': GeoDataFrame shape: (156628, 1) (2D shapes)
└── Tables
      └── 'table': AnnData (162254, 377)
with coordinate systems:
▸ 'global', with elements:
        he_image (Images), morphology_focus (Images), cell_labels (Labels), nucleus_labels (Labels), transcripts (Points), cell_boundaries (Shapes), cell_circles (Shapes), nucleus_boundaries (Shapes)

My question is: when I have multiple spatial transcriptomics datasets, should I concatenate the tables or not? If the tables are concatenated, downstream analyses such as PCA and UMAP will be affected, because the neighbours of each cell will differ from those computed on the separate tables. It's common to concatenate multiple single-cell datasets, but I don't know whether it makes sense to concatenate spatial transcriptomics tables.

wangjiawen2013 avatar Jul 12 '24 01:07 wangjiawen2013

In these cases I would not concatenate the tables for the following reasons:

  • The Xenium table is at the single-cell level and the Visium HD tables are at the bin level, so concatenating them, even for a hypothetical dataset constructed from the same tissue slide, would lead to technical artifacts.
  • The Visium HD tables refer to different bin sizes, so the tables have different meanings.

My advice is that in general table concatenation should be considered when:

  1. there are multiple samples from the same technology (e.g. multiple Xenium samples);
  2. there are two technologies for the same tissue. Here it is crucial to first identify the same entities in both datasets (e.g. by overlapping the cell boundaries from one dataset onto the other); one could then combine the two matrices along axis 1 (merging the var). In this case muon could be of help.

LucaMarconato avatar Jul 12 '24 14:07 LucaMarconato