spatialdata
Improvement to aggregation (`by_key`)
In the Xenium + Visium notebook 01
This block of code, which aggregates cell types (cells) by Visium circles (with `fractions=True`), could be simplified. Here we aggregate not into individual Visium circles, but into the area covered by all Visium circles that share the same value of a categorical variable (e.g. clone 1).
import numpy as np
import pandas as pd

cell_types_categories = xe_rep1_roi_sdata.table.obs["celltype_major"].cat.categories.tolist()
rois_fractions = {}
for _, row in landmarks_sdata["rois"].iterrows():
    # the ROI name is stored in the last column of each row
    name = row.iloc[-1]
    cells_inside = cells_in_rois_sdata_rep1[f"Shapes in ROI '{name}'"]
    indices_rep1 = cells_inside.index.tolist()
    # select the table rows of the cells that fall inside this ROI
    corresponding_rows_mask = xe_rep1_roi_sdata.table.obs["cell_id"].isin(indices_rep1)
    corresponding_rows = xe_rep1_roi_sdata.table[corresponding_rows_mask]
    cell_types = corresponding_rows.obs["celltype_major"]
    # start from zeros so that cell types absent from the ROI are still reported
    empty = pd.Series(index=cell_types_categories, data=np.zeros(len(cell_types_categories), dtype=float))
    counts = cell_types.value_counts()
    empty.loc[counts.index] = counts
    rois_fractions[name] = empty
df1_rois = pd.DataFrame(rois_fractions).transpose()
The aggregation APIs are now robust enough to consider adding a `by_key` parameter that groups the shapes in `by` by the value of a categorical column of the `by` spatial element, and then performs the aggregation.
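To make the proposal concrete, here is a minimal pandas-only sketch of the grouping that `by_key` would perform. It is not the spatialdata API: the function `aggregate_by_key`, the toy tables and the `circle_id` and `clone` columns are all hypothetical. It also assumes each cell has already been assigned to the circle that contains it, which is the step the existing aggregation would handle.

```python
import pandas as pd

# Hypothetical toy data: each cell is already assigned to the circle containing it.
cells = pd.DataFrame(
    {
        "circle_id": [0, 0, 1, 2, 2, 3],
        "celltype_major": pd.Categorical(
            ["B", "T", "T", "B", "B", "T"], categories=["B", "T", "Myeloid"]
        ),
    }
)
# Hypothetical categorical column on the `by` element (the Visium circles).
circles = pd.DataFrame(
    {"clone": pd.Categorical(["clone 1", "clone 1", "clone 2", "clone 2"])},
    index=pd.Index([0, 1, 2, 3], name="circle_id"),
)


def aggregate_by_key(cells, circles, by_key, value_key):
    """Count the categories of `value_key` per group of circles sharing `by_key`."""
    categories = list(cells[value_key].cat.categories)
    # Attach to every cell the group label of the circle it belongs to.
    labelled = cells.join(circles[by_key], on="circle_id")
    # Cross-tabulate group label against cell type.
    counts = pd.crosstab(
        labelled[by_key].astype(object), labelled[value_key].astype(object)
    )
    # Keep cell types that never occur, as the loop above does with `empty`.
    return counts.reindex(columns=categories, fill_value=0)


counts = aggregate_by_key(cells, circles, by_key="clone", value_key="celltype_major")
# Optional row-wise normalisation to obtain per-group proportions.
proportions = counts.div(counts.sum(axis=1), axis=0)
print(counts)
```

With a `by_key` argument on the aggregation itself, the manual loop over ROIs would reduce to a single call, and the output would have one row per category value instead of one row per circle.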