scanpy
using `adata.var_names` in `sc.get.obs_df`
Calling `sc.get.obs_df` with `keys=adata.var_names` raises a `ValueError`, since here an Index object is used to re-order the dataframe.
I think it's a trivial fix to re-cast `keys` locally if an Index object is passed (e.g. `adata.var_names`), and it could be nice for the user :)
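Until something like that lands, a caller-side workaround is to do the re-cast before the call. The sketch below is only an illustration: `as_key_list` is a hypothetical helper, not part of scanpy, and the commented call assumes `adata` is an AnnData object with scanpy imported as `sc`.

```python
def as_key_list(keys):
    """Hypothetical helper: coerce any iterable of keys to a plain list."""
    if isinstance(keys, str):
        # A lone string is treated as one key, not split into characters.
        return [keys]
    # Works for pandas Index objects, tuples, arrays and generators alike.
    return list(keys)


# Intended use (untested assumption about the caller's environment):
# df = sc.get.obs_df(adata, keys=as_key_list(adata.var_names))
```

Converting on the caller's side keeps the Index object out of the re-ordering step that the report points to.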
I am also currently trying to figure out how to get around this. I am getting this error when calling `sc.pl.rank_genes_groups_violin` with the "gene_symbols" parameter (the function works when that parameter is omitted, but the x-axis labels are then not the ones I want). In my case, the content of `adata.uns["rank_genes_groups"]["names"]` is set to `adata.var_names` (where the index column is separate from what is being passed to the "gene_symbols" parameter mentioned above). I am running anndata==0.7.8 and scanpy==1.8.2, so this issue has been around for a couple of release versions.
Try passing to `gene_symbol` a list object of the `var_names` you want to plot.
The "gene_symbols" param for `rank_genes_groups_violin()` takes a key from `adata.var`, though. The "gene_names" field allows for a list object, but this is for specific genes rather than the top N genes from `rank_genes_groups`.
https://scanpy.readthedocs.io/en/stable/generated/scanpy.pl.rank_genes_groups_violin.html#scanpy.pl.rank_genes_groups_violin
I guess I could just map the top N genes from `rank_genes_groups` on my own and pass that to the "gene_names" parameter. However, I feel that the function (either `rank_genes_groups_violin` when setting `_gene_names`, or `sc.get.obs_df` when renaming the "keys" based on the "gene_symbols" param) should handle the mapping from `adata.uns["rank_genes_groups"]["names"]` to `adata.var[<gene_symbols_key>]` behind the scenes. At least the "gene_symbols" param definition for `rank_genes_groups_violin` implies that it does.
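For reference, the manual mapping mentioned above could look roughly like this. It is a sketch, not scanpy's behaviour: `top_gene_symbols` is a hypothetical helper, the ids in the demo are made-up placeholders, and the column name `"gene_symbols"` in the commented lines is an assumption about how `adata.var` is laid out.

```python
def top_gene_symbols(ranked_ids, id_to_symbol, n_genes=5):
    """Hypothetical helper: map the first n_genes ranked ids to symbols."""
    top = list(ranked_ids)[:n_genes]
    # Fall back to the original id when no symbol is known for it.
    return [id_to_symbol.get(gene_id, gene_id) for gene_id in top]


# Toy demo with placeholder ids (not real data).
ranked = ["id_a", "id_b", "id_c", "id_d"]
mapping = {"id_a": "SYM_A", "id_b": "SYM_B", "id_d": "SYM_D"}
symbols = top_gene_symbols(ranked, mapping, n_genes=3)
print(symbols)

# With a real AnnData object this might become (untested assumption):
# ranked = adata.uns["rank_genes_groups"]["names"][group]
# mapping = adata.var["gene_symbols"].to_dict()
# symbols = top_gene_symbols(ranked, mapping, n_genes=10)
# sc.pl.rank_genes_groups_violin(
#     adata, groups=[group], gene_names=symbols, gene_symbols="gene_symbols"
# )
```

Since "gene_names" is a single list, this would need to be repeated per group to reproduce the per-group top N genes.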
I think my issue is ever-so-slightly different enough that I am going to create a new ticket instead. (#2258)
I figured this out. Please change `use_raw=False` to `use_raw=True`:
df = sc.get.obs_df(adata, genes + ['leiden','sample'], use_raw=True)