scanpy
rank_genes_groups logfoldchange different than seurat
Hi!
I've noticed that the function rank_genes_groups
calculates logFC differently than Seurat does.
https://github.com/theislab/scanpy/blob/5800db45dde0c2060ad0a9bed8ea931bd41da936/scanpy/tools/_rank_genes_groups.py#L207-L208
https://github.com/theislab/scanpy/blob/5800db45dde0c2060ad0a9bed8ea931bd41da936/scanpy/tools/_rank_genes_groups.py#L223
Thus the equation is
log(exp(mean(values)))
while in Seurat it is https://github.com/satijalab/seurat/blob/96d07d80bc4b6513b93e9c10d8a9d57ae7016f9f/R/differential_expression.R#L175-L179
thus
log(mean(exp(values)))
I was thus wondering if this was intended, since it leads to different logFC values.
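To make the difference concrete, here is a minimal sketch of the two orderings on made-up data. It assumes the values are log1p-transformed counts; the function names, the toy counts, and the `1e-9` constant are assumptions for illustration and are not taken from either library (Seurat, for instance, uses its own pseudocount).

```python
import math
from statistics import fmean

# Small constant to avoid division by zero. The value is an assumption
# for this sketch, not a claim about either library's defaults.
EPS = 1e-9


def logfc_mean_then_exp(group, rest):
    """Average the log values first, then back-transform: log2 of expm1(mean(x)) ratios."""
    num = math.expm1(fmean(group)) + EPS
    den = math.expm1(fmean(rest)) + EPS
    return math.log2(num / den)


def logfc_exp_then_mean(group, rest):
    """Back-transform each value first, then average: log2 of mean(expm1(x)) ratios."""
    num = fmean([math.expm1(v) for v in group]) + EPS
    den = fmean([math.expm1(v) for v in rest]) + EPS
    return math.log2(num / den)


# Hypothetical raw counts for one gene: sparse but high in the group,
# uniform and moderate in the rest.
group_counts = [0, 0, 0, 100]
rest_counts = [10, 10, 10, 10]

# log1p-transform, as is common before differential expression testing.
group = [math.log1p(c) for c in group_counts]
rest = [math.log1p(c) for c in rest_counts]

lfc_mean_first = logfc_mean_then_exp(group, rest)
lfc_exp_first = logfc_exp_then_mean(group, rest)

print(f"mean then exp: {lfc_mean_first:.2f}")
print(f"exp then mean: {lfc_exp_first:.2f}")
```

On this toy input the two orderings disagree even in sign: averaging the log values first gives roughly -2.2, while averaging the back-transformed values gives roughly 1.3 (the raw means are 25 vs. 10).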
I ran into this problem too. logFC is still calculated this way, and I am not satisfied with it. When we talk about the average fold change of gene expression, we expect the fold change of the non-logged average expression. That gives people an intuitive sense of how many times more a gene is expressed in one group than in another. So the expm() step must be done before the mean() step. Swapping this order not only changes the logFC value but also loses the biological meaning.
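One way to see why the order matters, assuming the values are log1p-transformed counts $x_i = \log(1 + c_i)$ (the symbols $x_i$, $c_i$, and $n$ are introduced here only for illustration): exponentiating the mean yields a geometric-mean-like quantity, which by the AM-GM inequality never exceeds the arithmetic mean of the counts.

$$
\exp\!\Big(\frac{1}{n}\sum_{i=1}^{n} x_i\Big) - 1
= \Big(\prod_{i=1}^{n} (1 + c_i)\Big)^{1/n} - 1
\;\le\; \frac{1}{n}\sum_{i=1}^{n} c_i
$$

Equality holds only when all $c_i$ are equal, so the gap grows with the spread of expression within a group. Because the two groups can differ in spread, the resulting fold change can shift in either direction.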