using `adata.var_names` in `sc.get.obs_df`

Open zoepiran opened this issue 2 years ago • 4 comments

Calling sc.get.obs_df with keys=adata.var_names raises a ValueError, since an Index object is used here to re-order the dataframe. I think it's a trivial fix to re-cast keys locally if an Index object (e.g. adata.var_names) is passed, and it could be nice for the user :)
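A minimal sketch of the proposed re-cast, using only pandas. It assumes (this is not confirmed in the thread) that the error comes from the Index being used somewhere a plain list is expected, for example in a truthiness check. `as_key_list` is a hypothetical helper, not part of scanpy, and `var_names` is a stand-in for `adata.var_names`.

```python
import pandas as pd


def as_key_list(keys):
    """Hypothetical helper: return `keys` as a plain Python list.

    A pandas Index has no unambiguous truth value, so code such as
    `if keys:` raises ValueError for it; a list behaves as expected.
    """
    return list(keys)


# Stand-in for adata.var_names (made-up gene names).
var_names = pd.Index(["GeneA", "GeneB", "GeneC"])

# Show that truth-testing an Index raises ValueError.
try:
    bool(var_names)
    index_is_ambiguous = False
except ValueError:
    index_is_ambiguous = True

# After the re-cast, the keys are an ordinary list.
keys = as_key_list(var_names)
```

Until something like this is done inside the function, the same cast can be applied at the call site, e.g. `sc.get.obs_df(adata, keys=list(adata.var_names))`.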

zoepiran avatar May 16 '22 12:05 zoepiran

I am also currently trying to figure out how to get around this. I am getting this error when calling sc.pl.rank_genes_groups_violin with the "gene_symbols" parameter (the function works when that parameter is omitted, but the x-axis labels are then not what I want). In my case, the content of adata.uns["rank_genes_groups"]["names"] is set to adata.var_names (where the index column is separate from the column being passed to the "gene_symbols" parameter mentioned earlier). I am running anndata==0.7.8 and scanpy==1.8.2, so this issue has been around for a couple of release versions.

adkinsrs avatar May 19 '22 14:05 adkinsrs

try passing to gene_symbol a list object of the var_names you want to plot

zoepiran avatar May 19 '22 14:05 zoepiran

> try passing to gene_symbol a list object of the var_names you want to plot

The "gene_symbols" param for rank_genes_groups_violin() takes a key from adata.var though. The "gene_names" field allows for a list object, but this is for specific genes rather than the top N genes from rank_genes_groups.

https://scanpy.readthedocs.io/en/stable/generated/scanpy.pl.rank_genes_groups_violin.html#scanpy.pl.rank_genes_groups_violin

I guess I could just map the top N genes from rank_genes_groups on my own and pass that to the "gene_names" parameter. However, I feel that the function (either rank_genes_groups_violin setting "_gene_names" or sc.get.obs_df renaming the "keys" based on the "gene_symbols" param) should handle the adata.uns["rank_genes_groups"]["names"] mapping to adata.var[<gene_symbols_key>] behind the scenes. At least the "gene_symbols" param definition for rank_genes_groups_violin implies that it does.
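The manual mapping described above could look roughly like the following sketch. It uses numpy and pandas stand-ins rather than a real AnnData object: `var` plays the role of adata.var, `names` plays the role of adata.uns["rank_genes_groups"]["names"] (assumed here to be a structured array with one field per group), and `top_symbols`, the IDs, and the symbols are all made up for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for adata.var: the index holds the var_names (IDs),
# and one column holds the gene symbols.
var = pd.DataFrame(
    {"gene_symbol": ["Actb", "Gapdh", "Cd3e", "Cd19"]},
    index=["ENSG01", "ENSG02", "ENSG03", "ENSG04"],
)

# Stand-in for adata.uns["rank_genes_groups"]["names"]:
# one field per group, rows ordered from best-ranked gene down.
names = np.array(
    [("ENSG03", "ENSG04"), ("ENSG01", "ENSG02"), ("ENSG02", "ENSG01")],
    dtype=[("0", "U10"), ("1", "U10")],
)


def top_symbols(names, var, symbols_key, group, n):
    """Hypothetical helper: map the top-n ranked IDs of `group` to symbols."""
    top_ids = [str(gene_id) for gene_id in names[group][:n]]
    return var.loc[top_ids, symbols_key].tolist()


# Top 2 genes of group "0", expressed as symbols.
symbols = top_symbols(names, var, "gene_symbol", "0", 2)
```

The resulting list is the kind of value that would then be passed to the "gene_names" parameter, as suggested above.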

adkinsrs avatar May 19 '22 15:05 adkinsrs

I think my issue is just different enough that I am going to create a new ticket instead (#2258).

adkinsrs avatar May 19 '22 15:05 adkinsrs

I figured this out. Please change use_raw=False to use_raw=True:

df = sc.get.obs_df(adata, genes + ['leiden','sample'], use_raw=True)

amoyguang1 avatar Sep 30 '22 10:09 amoyguang1