scanpy
Small multiple plots for clusters
- [ ] Additional function parameters / changed functionality / changed defaults?
- [ ] New analysis tool: A simple analysis tool you have been using and are missing in sc.tools?
- [x] New plotting function: A kind of plot you would like to see in sc.pl?
- [ ] External tools: Do you know an existing package that should go into sc.external.*?
- [ ] Other?
Hey @fidelram !
I just wrote something to create small multiples to plot cells in a clustering category. Pretty simple and very useful if you have too many clusters. What do you think of this:
import scanpy as sc

def cluster_small_multiples(adata, clust_key, size=60, frameon=False, legend_loc=None, **kwargs):
    tmp = adata.copy()
    # One boolean categorical column per cluster: grey for False, the cluster's own colour for True
    for i, clust in enumerate(adata.obs[clust_key].cat.categories):
        tmp.obs[clust] = adata.obs[clust_key].isin([clust]).astype('category')
        tmp.uns[clust+'_colors'] = ['#d3d3d3', adata.uns[clust_key+'_colors'][i]]
    # One panel per cluster column; groups keeps only the True category highlighted
    sc.pl.umap(tmp, groups=tmp.obs[clust].cat.categories[1:].values, color=adata.obs[clust_key].cat.categories.tolist(), size=size, frameon=frameon, legend_loc=legend_loc, **kwargs)
Example output from:
test = sc.datasets.pbmc68k_reduced()
sc.pp.pca(test)
sc.pp.neighbors(test)
sc.tl.umap(test)
cluster_small_multiples(test, 'bulk_labels')
Could generalize this to different bases via sc.pl.scatter(). Or is this already implemented somewhere that I'm not aware of? Or maybe it's too simple to have as a small helper function?
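For anyone who wants the same idea independent of the embedding, here is a minimal sketch that works on any 2D coordinates and draws the highlighted cells after the grey ones so they end up in front. It uses plain matplotlib rather than scanpy's plotting code; the name small_multiples and its parameters are made up for illustration and are not part of scanpy.

```python
import math

import matplotlib.pyplot as plt
import numpy as np


def small_multiples(coords, labels, ncols=4, size=10,
                    background="#d3d3d3", highlight="tab:red"):
    """Draw one panel per label: all other points in grey, that label's points on top.

    Hypothetical helper for illustration; not a scanpy function.
    """
    coords = np.asarray(coords)
    labels = np.asarray(labels)
    groups = sorted(set(labels.tolist()))
    ncols = min(ncols, len(groups))
    nrows = math.ceil(len(groups) / ncols)
    fig, axes = plt.subplots(nrows, ncols, figsize=(3 * ncols, 3 * nrows), squeeze=False)
    for ax, group in zip(axes.ravel(), groups):
        mask = labels == group
        # Background first, highlighted group second, so the group is drawn in front.
        ax.scatter(coords[~mask, 0], coords[~mask, 1], s=size, c=background)
        ax.scatter(coords[mask, 0], coords[mask, 1], s=size, c=highlight)
        ax.set_title(str(group))
        ax.set_axis_off()
    # Hide unused panels when the grid has more cells than groups.
    for ax in axes.ravel()[len(groups):]:
        ax.set_visible(False)
    return fig


# Toy data standing in for an embedding and a clustering.
rng = np.random.default_rng(0)
demo_fig = small_multiples(rng.normal(size=(300, 2)), rng.choice(["a", "b", "c"], size=300))
```

With an AnnData object this would presumably be called as small_multiples(adata.obsm["X_umap"], adata.obs["bulk_labels"]), or with any other entry of .obsm for a different basis. Using a single highlight colour also sidesteps cluster colours that are hard to tell apart from the grey background.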
I need that a lot actually. One question: is it different from just sc.pl.umap(adata, groups=adata.obs.bulk_labels.cat.categories)?
Maybe a special keyword like groups='all' would be easy to use on the API side?
Tbh, I found out about groups after writing the function and looking for a way to put the dots in front. Maybe there is a simpler way to do this...
But then the command you suggest gives an error on my own data if I don't also specify color='bulk_labels' (it works for the pbmc68k data, but doesn't colour anything in), and then it just puts all the labels on the same plot and doesn't create small multiples.
Just saw this by chance. As we're planning to merge with scvelo's plotting modules soonish, that would simply become scv.pl.scatter(adata, groups=[[c] for c in adata.obs['clusters'].cat.categories], color='clusters', ncols=4), simply passing a list of lists to groups (without copying a whole anndata object). Will that be sufficient?
Damn... so smug that @VolkerBergen ;).
This looks great! Although maybe a keyword that triggers this might be nice for the user.
Any keyword suggestion? groups='all' as @gokceneraslan suggested?
@LuckyMD thanks for the suggestion. I actually wrote some code a long time ago that does something similar and that I use quite frequently. The main difference is that I always use the same color for all the clusters, as sometimes I cannot distinguish between the gray background and the cluster color. But using what @VolkerBergen suggested and then resetting the colors should do the trick.
@VolkerBergen I second the idea of groups='all', as I think this also exists for sc.pl.embedding_density.
I actually used sc.pl.embedding_density() for this in the end, as it works quite well for this use case as well.
Tried with the groups option but got a ValueError: The truth value of a Index is ambiguous.
As I didn't know how to deal with it I just applied the function @LuckyMD posted above and it worked perfectly alright.
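One possible cause of that ValueError, not confirmed anywhere in this thread: .cat.categories is a pandas Index, and pandas raises exactly this message whenever an Index is truth-tested (for example in an `if groups:` check). The sketch below reproduces the message with pandas alone and shows the conversion to a plain list that avoids it; whether the plotting code actually performs such a check is an assumption.

```python
import pandas as pd

labels = pd.Series(["B cells", "T cells", "B cells"], dtype="category")
categories = labels.cat.categories  # a pandas Index, not a list

message = None
try:
    if categories:  # truth-testing an Index raises ValueError
        pass
except ValueError as err:
    message = str(err)

print(message)

# Converting to a plain Python list avoids the ambiguous truth test.
groups = categories.tolist()
print(groups)
```

If this is what happened, passing groups=adata.obs['clusters'].cat.categories.tolist() instead of the Index itself may be enough.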
How are things going now?
Hi there - random question, but for some reason, after applying this plotting solution, an error is thrown when the adata object is written (I double checked, and this doesn't happen if this scv.pl.scatter command isn't called).
--> 103 write_elem(f, "obs", adata.obs, dataset_kwargs=dataset_kwargs)
TypeError: Can't implicitly convert non-string objects to strings
Is something stored in the object by this plotting call that can lead to an AnnData write error? The issue relates to the .obs columns, and I can certainly save the adata object if I don't run this plotting command.
I also checked the dtypes of the obs columns, and there doesn't seem to be anything out of the ordinary there either.
Any help would be appreciated; it took me some time to figure out this was causing the issue! (And it's a bit frustrating to not be able to save an object just from running a plot command.)
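In case it helps narrow this down: dtypes alone can hide the problem, because an object column (or the categories of a categorical column) may contain non-string values while still reporting a harmless-looking dtype. The following is a rough diagnostic sketch using only pandas. The helper name non_string_obs_columns is made up here, and the idea that the plotting call leaves such values behind in .obs is a guess, not something established in this thread.

```python
import pandas as pd


def non_string_obs_columns(obs):
    """Map column name -> set of non-string Python type names found in that column.

    Only object-dtype values are inspected: either the column itself or,
    for categorical columns, its categories. Hypothetical helper for illustration.
    """
    suspects = {}
    for name, column in obs.items():
        if isinstance(column.dtype, pd.CategoricalDtype):
            values = column.cat.categories
        else:
            values = column.dropna()
        if values.dtype != object:
            continue  # numeric, boolean, or dedicated string dtypes are skipped
        bad = {type(v).__name__ for v in values if not isinstance(v, str)}
        if bad:
            suspects[name] = bad
    return suspects


# Toy frame standing in for adata.obs.
obs = pd.DataFrame({
    "clusters": pd.Categorical(["a", "b", "a"]),
    "n_counts": [1.0, 2.0, 3.0],
    "mixed": pd.Series(["x", 1, ("y",)], dtype=object),
})
print(non_string_obs_columns(obs))
```

Running non_string_obs_columns(adata.obs) once before and once after the plotting call, and comparing the results, should show whether a column was added or changed in a way that could trip up writing.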