
Comparisons between groups of samples

Open grst opened this issue 1 year ago • 6 comments

Most functions in squidpy seem to be centered around analyzing a single sample. For me, in the context of clinical trials, it would be important to compare between sample groups. The usual variables of interest are

  • comparisons of samples over time (pre-treatment vs. on-treatment), i.e. "how does our drug influence the tissue?" and
  • comparisons of responders vs. non-responders, i.e. "What could explain treatment failure?"

A few things that came to my mind

  • Differential ligand/receptor interactions between groups
  • Differential neighborhood analysis (e.g. tumor more infiltrated after treatment?)
  • Differential spatial distance (e.g. tumor and certain immune cells closer together in responders?)

Would be great to have dedicated functions for this, and happy about further ideas!
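
To make the spatial-distance idea concrete, here is a minimal sketch of a per-sample summary that could later be compared between groups. It is not an existing squidpy function: the name `median_nearest_distance`, the toy coordinates, and the labels are all hypothetical, and the brute-force pairwise distances are only meant for small examples.

```python
import numpy as np


def median_nearest_distance(coords, labels, source, target):
    """Median distance from each `source` cell to its nearest `target` cell."""
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    src = coords[labels == source]
    tgt = coords[labels == target]
    if len(src) == 0 or len(tgt) == 0:
        return float("nan")  # one of the cell types is missing in this sample
    # Pairwise Euclidean distances, shape (n_source, n_target)
    dists = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=-1)
    return float(np.median(dists.min(axis=1)))


# Toy samples with made-up coordinates (assumption, not real data)
labels = ["tumor", "T cell", "tumor", "T cell"]
responder_coords = [[0, 0], [1, 0], [0, 1], [1, 1]]
non_responder_coords = [[0, 0], [5, 0], [0, 1], [5, 1]]

d_resp = median_nearest_distance(responder_coords, labels, "tumor", "T cell")
d_non = median_nearest_distance(non_responder_coords, labels, "tumor", "T cell")
print(d_resp, d_non)
```

One value per sample comes out of this, so responders and non-responders can be compared at the sample level rather than at the cell level. For real datasets a spatial index would likely be needed instead of the dense distance matrix.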

CC @Zethson, because this is basically "spatial pertpy"


Example data

  • Myocardial infarction, AMI vs. ICM vs. control; https://www.nature.com/articles/s41586-022-05060-x#additional-information
  • CellCharter CODEX mouse spleen data: adata = cc.datasets.codex_mouse_spleen('./data/codex_mouse_spleen.h5ad') -> 3 vs. 6

Collection of use-cases and implementation ideas

Differential neighborhood enrichment

Sharing here a very simple approach that estimates, for each "niche" (e.g. tumor spots), the average neighborhood (i.e. across all spots of that niche, the fraction of neighbors belonging to each niche). This makes it possible to distinguish, for instance, between immune-infiltrated and immune-excluded tumors.

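import numpy as np
import pandas as pd
import squidpy as sq

# adata_vis -> AnnData with Visium data, contains multiple samples
# "niche_leiden" -> categorical annotation of niches
# samples -> iterable of sample IDs; select_slide() -> user-defined helper returning the AnnData subset of one sample
res2 = []
for sample in samples:
    tmp_ad = select_slide(adata_vis, sample)
    sq.gr.spatial_neighbors(tmp_ad)  # writes obsp["spatial_connectivities"] and obsp["spatial_distances"]
    niches = tmp_ad.obs["niche_leiden"]

    res = []
    for spot in range(tmp_ad.n_obs):
        # boolean mask of the spatial neighbors of the current spot (nonzero distance)
        spot_neighbors = tmp_ad.obsp["spatial_distances"][spot, :].toarray().ravel().astype(bool)
        # obs is indexed by barcode, so look up the current spot by position
        current_niche = niches.iloc[spot]
        res.append(niches[spot_neighbors].value_counts().to_frame().T.assign(current_niche=current_niche))

    # average neighbor counts per niche within this sample
    res_df = pd.concat(res).groupby("current_niche").agg("mean").assign(sample=sample)
    res2.append(res_df)

# normalize each (niche, sample) row so that the neighbor fractions sum to 1
neighbors_per_sample = pd.concat(res2).reset_index(drop=False).set_index(["current_niche", "sample"]).pipe(lambda x: x.div(np.sum(x, axis=1), axis=0))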

This results in something like this (each column a sample): [image]

Statistics between groups of samples can be computed using a standard linear model.
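
As a rough illustration of what such a linear model could look like for a single niche/neighbor pair, here is a minimal sketch using ordinary least squares with a binary group indicator. The function name `ols_group_effect` and the fraction values are made up for the example; in practice the response would be one column of the per-sample table, and a statistics package would probably be used instead of hand-rolled OLS.

```python
import numpy as np


def ols_group_effect(y, group):
    """Fit y = b0 + b1 * group by ordinary least squares.

    Returns the group coefficient b1, its standard error,
    the t-statistic, and the residual degrees of freedom.
    """
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept column plus the group indicator
    X = np.column_stack([np.ones(len(y)), np.asarray(group, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof  # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)  # covariance of the coefficients
    se = float(np.sqrt(cov[1, 1]))
    return float(beta[1]), se, float(beta[1] / se), dof


# Hypothetical fraction of T-cell neighbors around tumor spots, one value per sample
frac = [0.10, 0.12, 0.08, 0.30, 0.28, 0.32]
responder = [0, 0, 0, 1, 1, 1]  # 0 = non-responder, 1 = responder

effect, se, t_stat, dof = ols_group_effect(frac, responder)
print(effect, se, t_stat, dof)
```

A two-sided p-value follows from the t distribution with `dof` degrees of freedom, e.g. `2 * scipy.stats.t.sf(abs(t_stat), dof)`. Covariates such as patient or batch could be added as further columns of the design matrix; with many niche pairs tested, a multiple-testing correction would also be needed.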

Visualization idea: [image] (but instead color the heatmap by enriched/depleted)

grst avatar Dec 15 '23 12:12 grst

I'd like to CC @AnnaChristina here who's also been working on this AFAIK.

Zethson avatar Dec 15 '23 12:12 Zethson

@timtreis, here is one for "spatial perturbation analysis" already

grst avatar Dec 20 '23 13:12 grst

@grst we're discussing this now and might get back with an update here

Zethson avatar Jan 08 '24 14:01 Zethson

@Zethson we discussed this before going on holiday. We have a Zulip channel for that; I'll add you.

giovp avatar Jan 08 '24 14:01 giovp

cellcharter already has a function for differential neighborhood enrichment that does statistics on the sample level: https://github.com/CSOgroup/cellcharter/blob/396c415f706e46ce5d9b82df062d0b6509aa526e/src/cellcharter/gr/_nhood.py#L136

grst avatar May 03 '24 06:05 grst