scanpy Small multiple plots for clusters

[ ] Additional function parameters / changed functionality / changed defaults?
[ ] New analysis tool: A simple analysis tool you have been using and are missing in sc.tools?
[x] New plotting function: A kind of plot you would like to see in sc.pl?
[ ] External tools: Do you know an existing package that should go into sc.external.*?
[ ] Other?

Hey @fidelram !

I just wrote something to create small multiples to plot cells in a clustering category. Pretty simple and very useful if you have too many clusters. What do you think of this:

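import scanpy as sc

def cluster_small_multiples(adata, clust_key, size=60, frameon=False, legend_loc=None, **kwargs):
    # Work on a copy so the temporary per-cluster columns never reach the caller's object.
    tmp = adata.copy()

    # One boolean categorical column per cluster: grey for False, the cluster's colour for True.
    # Assumes adata.uns[clust_key + '_colors'] already exists.
    for i, clust in enumerate(adata.obs[clust_key].cat.categories):
        tmp.obs[clust] = adata.obs[clust_key].isin([clust]).astype('category')
        tmp.uns[clust+'_colors'] = ['#d3d3d3', adata.uns[clust_key+'_colors'][i]]

    # Restrict each panel to the True category and colour by every per-cluster column.
    sc.pl.umap(tmp, groups=tmp.obs[clust].cat.categories[1:].values, color=adata.obs[clust_key].cat.categories.tolist(), size=size, frameon=frameon, legend_loc=legend_loc, **kwargs)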

Example output from:

test = sc.datasets.pbmc68k_reduced()
sc.pp.pca(test)
sc.pp.neighbors(test)
sc.tl.umap(test)
cluster_small_multiples(test, 'bulk_labels')

[Image: umap_bulk_lab_sm]

Could generalize this to different bases via sc.pl.scatter(). Or is this already implemented somewhere that I'm not aware of? Or maybe it's too simple to have as a small helper function?
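A minimal sketch of what a basis-independent version could look like, written against plain matplotlib rather than sc.pl.scatter() so that no AnnData copy is needed. The function name small_multiples and all of its parameters are hypothetical; it only assumes a 2D coordinate array and one label per cell.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def small_multiples(coords, labels, colors=None, ncols=4, size=5, background="#d3d3d3"):
    """Hypothetical helper: one panel per cluster, all cells in grey, the cluster on top."""
    coords = np.asarray(coords)
    labels = pd.Series(labels).astype("category")  # keeps an existing category order
    clusters = labels.cat.categories
    ncols = min(ncols, len(clusters))
    nrows = math.ceil(len(clusters) / ncols)
    fig, axes = plt.subplots(nrows, ncols, figsize=(3 * ncols, 3 * nrows), squeeze=False)
    for ax in axes.flat:
        ax.set_axis_off()  # no frame; also blanks any unused panel
    for i, (ax, clust) in enumerate(zip(axes.flat, clusters)):
        mask = (labels == clust).to_numpy()
        color = colors[i] if colors is not None else f"C{i % 10}"
        # Background cells first, then the highlighted cluster so it ends up in front.
        ax.scatter(coords[~mask, 0], coords[~mask, 1], s=size, color=background, linewidths=0)
        ax.scatter(coords[mask, 0], coords[mask, 1], s=size, color=color, linewidths=0)
        ax.set_title(str(clust))
    return fig, axes
```

With the example above this might be called as small_multiples(test.obsm['X_umap'], test.obs['bulk_labels'], colors=test.uns['bulk_labels_colors']); any other 2D entry in .obsm should work the same way.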

Dec 16 '19 18:12 LuckyMD

I need that a lot actually. One question: is it different from just sc.pl.umap(adata, groups=adata.obs.bulk_labels.cat.categories)?

Maybe a special keyword like groups='all' would be easy to use on the API side?

Dec 17 '19 04:12 gokceneraslan

Tbh, I found out about groups after writing the function and looking for a way to put the dots in front. Maybe there is a simpler way to do this...

But then the command you suggest gives an error on my own data if I don't also specify color='bulk_labels' (works for the pbmc68k, but doesn't colour anything in), and then it just puts all the labels on the same plot and doesn't create small multiples.

Dec 17 '19 10:12 LuckyMD

Just saw this by chance. As we're planning to merge with scvelo's plotting modules soonish, that would simply become scv.pl.scatter(adata, groups=[[c] for c in adata.obs['clusters'].cat.categories], color='clusters', ncols=4), simply passing a list of lists to groups (without copying a whole anndata object). Will that be sufficient?
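To make the "list of lists" concrete, here is a small sketch of how that groups argument is built from a categorical column. The toy labels are made up for illustration; the scv.pl.scatter call in the comment is the one quoted above and is not executed here.

```python
import pandas as pd

# Toy stand-in for adata.obs['clusters'] (hypothetical labels).
clusters = pd.Series(["B", "T", "NK", "T", "B"], dtype="category", name="clusters")

# One single-element list per category, i.e. one panel per cluster.
groups = [[c] for c in clusters.cat.categories]
print(groups)  # [['B'], ['NK'], ['T']]

# This is what would then be passed on, as suggested above:
# scv.pl.scatter(adata, groups=groups, color='clusters', ncols=4)
```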

Dec 17 '19 17:12 VolkerBergen

Damn... so smug that @VolkerBergen ;).

This looks great! Although maybe a keyword that triggers this might be nice for the user.

Dec 17 '19 17:12 LuckyMD

Any keyword suggestion? groups='all' as @gokceneraslan suggested?

Dec 17 '19 18:12 VolkerBergen

@LuckyMD thanks for the suggestion. I actually wrote some code a long time ago that does something similar, and I use it quite frequently. The main difference is that I always use the same color for all the clusters, as sometimes I cannot distinguish between the gray background and the cluster color. But using what @VolkerBergen suggested and then resetting the colors should do the trick.

@VolkerBergen I second the idea of groups='all' as I think this also exists for sc.pl.embedding_density

Dec 17 '19 20:12 fidelram

I actually used sc.pl.embedding_density() for this in the end, as it works quite well for this use case as well.

Dec 17 '19 20:12 LuckyMD

Tried with the groups option but got a ValueError: The truth value of a Index is ambiguous. As I didn't know how to deal with it, I just applied the function @LuckyMD posted above and it worked perfectly alright.
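One plausible reading of that error, not confirmed anywhere in this thread, is that a pandas Index (such as .cat.categories) was passed as groups and then evaluated in a boolean context. The sketch below only reproduces the pandas behaviour and shows the list conversion that might sidestep it; the category names are made up.

```python
import pandas as pd

categories = pd.Index(["B cells", "T cells"])  # e.g. what .cat.categories returns

message = None
try:
    if categories:  # an Index has no single truth value, so this raises
        pass
except ValueError as err:
    message = str(err)

# A plain Python list has a well-defined truth value.
as_list = categories.tolist()
has_groups = bool(as_list)
```

If this is indeed the cause, passing groups=adata.obs[key].cat.categories.tolist() instead of the Index itself may be enough.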

Aug 12 '21 12:08 kristianunger

How are things going now?

Sep 28 '21 09:09 wangjiawen2013

Hi there - random question, but for some reason, applying this plotting solution leads to an error when the adata object is written afterwards (I double-checked, and this doesn't happen if the scv.pl.scatter command isn't called).

--> 103 write_elem(f, "obs", adata.obs, dataset_kwargs=dataset_kwargs)

TypeError: Can't implicitly convert non-string objects to strings

Is something stored in the object that is specific to this here, that can lead to an AnnData write error? The issue relates to the .obs column, and I can certainly save the adata object if not running this plotting command

I also checked the dtypes of the obs columns, and there doesn't seem to be anything out of the ordinary there either

Any help would be appreciated, it took me some time to figure out this was causing the issue! (and it's a bit frustrating to not be able to save an object just from running a plot command)
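A possible way to narrow this down, offered as a sketch under assumptions rather than a diagnosis: a column can show an unremarkable dtype (object or category) while still holding values or categories that are not strings, which is one situation in which a string conversion during writing could fail. The helper name suspicious_obs_columns is hypothetical, and whether this is what happens here is not established in the thread.

```python
import pandas as pd


def suspicious_obs_columns(obs: pd.DataFrame) -> dict:
    """Hypothetical helper: map column name -> reason it may not serialise as strings."""
    report = {}
    for name in obs.columns:
        col = obs[name]
        if isinstance(col.dtype, pd.CategoricalDtype):
            cats = col.cat.categories
            # The column dtype is just 'category'; the categories themselves can be mixed.
            if cats.dtype == object and not all(isinstance(c, str) for c in cats):
                report[name] = "categorical with non-string categories"
        elif col.dtype == object:
            if not all(isinstance(v, str) for v in col.dropna()):
                report[name] = "object column with non-string values"
    return report
```

Comparing suspicious_obs_columns(adata.obs) before and after the plotting call might show whether a column was added or changed along the way.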

Dec 19 '23 07:12 vkartha
