When n_genes is set to a value (such as 2000) and pts=True, sc.tl.rank_genes_groups computes the fraction of cells expressing each gene, but the pts output includes all genes, not just the top 2000.

Oct 13 '20 03:10 wangjiawen2013

Can you elaborate? What do you mean by the output?

In the past we only computed up to 100 genes by default, but now we do it for all genes. You can always limit the number of genes you want to see afterwards. So maybe we should remove n_genes from the function or deprecate the parameter.
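A minimal sketch of what "limiting afterwards" could look like for pts. It assumes the layout described in this thread: result['names'] is a structured array with one field per group (genes in ranked order), and result['pts'] is a DataFrame indexed by gene name with one column per group. The mock data and the helper name top_pts are made up for illustration and are not part of scanpy.

```python
import numpy as np
import pandas as pd

# Mock of adata.uns['rank_genes_groups'] after a run with pts=True (illustrative data only).
names = np.array(
    [("GeneC", "GeneA"), ("GeneA", "GeneD"), ("GeneB", "GeneB"), ("GeneD", "GeneC")],
    dtype=[("T", "U10"), ("B", "U10")],
)
pts = pd.DataFrame(
    {"T": [0.2, 0.5, 0.9, 0.1], "B": [0.8, 0.4, 0.3, 0.6]},
    index=["GeneA", "GeneB", "GeneC", "GeneD"],
)
result = {"names": names, "pts": pts}


def top_pts(result, n_genes):
    """Hypothetical helper: pts for the n_genes top-ranked genes of each group."""
    out = {}
    for group in result["names"].dtype.names:
        top = result["names"][group][:n_genes]   # gene names in ranked order
        out[group] = result["pts"].loc[top, group]  # look up pts by gene name
    return out


top2 = top_pts(result, n_genes=2)
```

With real data, result would be adata.uns['rank_genes_groups'] and n_genes whatever cutoff you want to inspect.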

Oct 13 '20 08:10 fidelram

How can one get a DEG table with a pts column for each cluster? So that for each group there would be 4 columns: 'names', 'logfoldchanges', 'pvals_adj' and 'pts'?

Manually combining two files is not ideal:

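import pandas as pd
import scanpy as sc

# Rank genes per cell type and store the fraction of expressing cells (pts)
sc.tl.rank_genes_groups(adata, 'cell_types', method='wilcoxon', pts=True)
sc.pl.rank_genes_groups(adata, n_genes=25, sharey=False)
# File 1: names, log fold changes and adjusted p-values, one column set per group
result = adata.uns['rank_genes_groups']
groups = result['names'].dtype.names
degs_by_cluster = pd.DataFrame({group + '_' + key[:14]: result[key][group]
    for group in groups for key in ['names', 'logfoldchanges', 'pvals_adj']})
degs_by_cluster.to_csv("DEG_adata_cell_types_pct_to_sort.csv")
# File 2: pts, indexed by gene name rather than by rank
pts = pd.DataFrame(adata.uns['rank_genes_groups']['pts'])
pts.to_csv("pts_adata.csv")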

Could you help with a more efficient way to do that? @fidelram @ivirshup

Mar 14 '22 09:03 ilcink

Hello, I am facing the same problem. I would like to get the gene name, log fold change, pvals_adj, pts and pts_rest in a single output CSV file, but I have not been able to do that:

sc.tl.rank_genes_groups(adata, "leiden_0.6", method='t-test', pts=True, corr_method='benjamini-hochberg')
pd.DataFrame(adata.uns['rank_genes_groups']['names'])
result = adata.uns['rank_genes_groups']
groups = result['names'].dtype.names
df = pd.DataFrame(
    {group + '_' + key[:1]: result[key][group]
     for group in groups for key in ['names', 'logfoldchanges', 'pts', 'pts_rest', 'pvals', 'pvals_adj']})
df.to_csv("/home/Akila/integration/harmony/subset/celltype/find_markergenes.csv")

Any idea how to get everything in a single file along with pts?

Thanks,
Akila

Jun 23 '22 20:06 AkilaRanjith

Try the following code:

# Differential expression and marker genes
result = adata.uns['rank_genes_groups']
groups = result['names'].dtype.names
# Ranked statistics: one column per group and key, rows ordered by rank
df1 = pd.DataFrame({group + '_' + key: result[key][group]
    for group in groups for key in ['names', 'scores', 'logfoldchanges', 'pvals', 'pvals_adj']})
# Expression fractions: rows indexed by gene name
df2 = pd.DataFrame({group + '_' + key: result[key][group]
    for group in groups for key in ['pts', 'pts_rest']})
# For each group, look up pts and pts_rest by gene name, then place the groups side by side
pd.concat(
    [df1[[group + '_names', group + '_scores', group + '_logfoldchanges', group + '_pvals', group + '_pvals_adj']]
        .merge(df2[[group + '_pts', group + '_pts_rest']], how="left", left_on=group + '_names', right_index=True)
     for group in groups],
    axis=1).to_csv("markers.csv")
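The same join can also be written as a long-format table (one row per group and gene), which some may find easier to filter and sort than side-by-side columns. This is only a sketch under the same assumptions as the code above: the ranked statistics are structured arrays with one field per group, and 'pts' / 'pts_rest' are DataFrames indexed by gene name. The helper name markers_table and the mock values are hypothetical.

```python
import numpy as np
import pandas as pd

STAT_KEYS = ("names", "scores", "logfoldchanges", "pvals", "pvals_adj")


def markers_table(result, stat_keys=STAT_KEYS):
    """Hypothetical helper: stack per-group statistics and attach pts by gene name."""
    frames = []
    for group in result["names"].dtype.names:
        # Ranked statistics for this group (keys missing from result are skipped)
        df = pd.DataFrame({k: result[k][group] for k in stat_keys if k in result})
        df.insert(0, "group", group)
        # pts / pts_rest are indexed by gene name, so map them through the names column
        for key in ("pts", "pts_rest"):
            if key in result:
                df[key] = df["names"].map(result[key][group])
        frames.append(df)
    return pd.concat(frames, ignore_index=True)


# Mock of adata.uns['rank_genes_groups'] (illustrative values only)
result = {
    "names": np.array(
        [("GeneC", "GeneA"), ("GeneA", "GeneB"), ("GeneB", "GeneC")],
        dtype=[("T", "U10"), ("B", "U10")],
    ),
    "logfoldchanges": np.array(
        [(2.0, 1.5), (0.5, 0.1), (-1.0, -2.0)], dtype=[("T", "f8"), ("B", "f8")]
    ),
    "pvals_adj": np.array(
        [(0.001, 0.01), (0.2, 0.5), (0.03, 0.002)], dtype=[("T", "f8"), ("B", "f8")]
    ),
    "pts": pd.DataFrame(
        {"T": [0.2, 0.5, 0.9], "B": [0.8, 0.4, 0.3]},
        index=["GeneA", "GeneB", "GeneC"],
    ),
}
table = markers_table(result)
```

With real data, pass adata.uns['rank_genes_groups'] as result and write the table out with table.to_csv("markers.csv", index=False). scanpy also provides sc.get.rank_genes_groups_df, which returns a per-group DataFrame; whether it includes the pts columns may depend on the scanpy version, so check its output before relying on it.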

Aug 24 '22 08:08 wangjiawen2013

sc.tl.rank_genes_groups pts