Support obsm key to color UMAP

Open picciama opened this issue 4 years ago • 1 comments

  • [x] Additional function parameters / changed functionality / changed defaults?
  • [ ] New analysis tool: A simple analysis tool you have been using and are missing in sc.tools?
  • [ ] New plotting function: A kind of plot you would like to see in sc.pl?
  • [ ] External tools: Do you know an existing package that should go into sc.external.*?
  • [ ] Other?

Quite often I need to color UMAPs by features that are stored in adata.obsm rather than adata.X because they are special, e.g. KO data with gRNAs versus endogenes/target genes, or viral genes versus endogenes.

Example use case:

  • Cluster cells based on endogenes
  • UMAP and color by a bunch of viral genes

Clustering must not include these viral genes, so they must be excluded from X. I don't want to store that many additional columns in obs, and I need these features kept in their own matrix for downstream analysis, which is why I want to use obsm.

Can we have something like this:

sc.pl.umap(adata, color='viral_genes')  # adata.obsm['viral_genes'] is a pandas.DataFrame ?

I don't think it needs to be complicated, since it only involves one additional check: if an element of the color argument is found neither in obs.columns nor in var_names, look it up among the keys of obsm and use the entire DataFrame behind that key.
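A minimal sketch of that lookup order, to make the proposal concrete. `resolve_color_keys` is a hypothetical helper, not an existing scanpy function, and the `(source, key, column)` triples it returns are an assumption made for illustration. It only relies on `obs.columns`, `var_names` and `obsm` being present, so a small stand-in object with made-up names is enough to try it out:

```python
from types import SimpleNamespace


def resolve_color_keys(adata, color):
    """Expand color keys in the order obs -> var_names -> obsm.

    Hypothetical helper, not part of scanpy. Returns a list of
    (source, key, column) triples; column is None for obs and var hits.
    """
    if isinstance(color, str):
        color = [color]

    resolved = []
    for key in color:
        if key in adata.obs.columns:
            resolved.append(("obs", key, None))
        elif key in adata.var_names:
            resolved.append(("var", key, None))
        elif key in adata.obsm:
            # Only DataFrame-like entries carry column names to plot by.
            columns = getattr(adata.obsm[key], "columns", None)
            if columns is None:
                raise TypeError(f"obsm[{key!r}] has no named columns")
            # Use the entire DataFrame: one entry per column.
            resolved.extend(("obsm", key, col) for col in columns)
        else:
            raise KeyError(f"{key!r} not found in obs, var_names or obsm")
    return resolved


# Stand-in for an AnnData object; all names below are made up.
fake = SimpleNamespace(
    obs=SimpleNamespace(columns=["leiden"]),
    var_names=["GAPDH"],
    obsm={
        "viral_genes": SimpleNamespace(columns=["E6", "E7"]),
        "X_umap": [[0.0, 1.0]],  # plain array-like, no column names
    },
)

resolved = resolve_color_keys(fake, ["leiden", "viral_genes"])
print(resolved)
```

Because obs and var_names are checked first, an obsm key that clashes with an obs column or a gene name would be shadowed. Whether that precedence is the right one is a design question for the maintainers.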

picciama, Nov 18 '20 20:11

This has been worked on here: https://github.com/theislab/anndata/pull/342

The idea is to allow any vector from the anndata object to be used for coloring, but that PR seems a bit stalled at the moment. This would also be useful for providing parameters in other places, like regress_out.

ivirshup, Nov 19 '20 05:11