Corrected differential gene expression matrix

Open nshedd opened this issue 3 years ago • 12 comments

I've been using Harmony for batch effect integration, but I would like to be able to continue downstream analysis in Seurat using the corrected data. I read this review, which rates Harmony's batch effect correction quite highly, but the authors were not able to use it for differentially expressed gene analysis. Does Harmony calculate a corrected expression matrix? Is it possible to calculate it ourselves?

nshedd avatar Mar 19 '21 13:03 nshedd

Hi, I'm not one of the developers, but Harmony does not calculate a corrected expression matrix for genes. You can try the scMerge package, as recommended in that review!

cswoboda avatar May 12 '21 21:05 cswoboda

Dear @nshedd and @cswoboda,

This is an active problem that we are pursuing and hope to have something out in the next version of Harmony. If you're interested, please let me know if you'd like to be a beta-tester!

Best, Ilya

ilyakorsunsky avatar Jun 17 '21 15:06 ilyakorsunsky

Hi, I was wondering about this too. I thought it is actually recommended to use the non-corrected values for DE analysis (https://github.com/satijalab/seurat/discussions/4000). Could you advise on this?

JJBio avatar Nov 08 '21 18:11 JJBio

@JJBio Harmony corrects the PCA embedding in order to perform integration. The underlying gene expression values stay "non-corrected": Harmony's output is only used to produce clusters and the integrated UMAP, so the values in the RNA and SCT assays should be uncorrected.
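
To make that distinction concrete, here is a minimal NumPy sketch on made-up data. It is not Harmony's algorithm: the "correction" is a naive per-batch centering of the PCA embedding, used only as a stand-in, and all variable names are hypothetical. The point it illustrates is that adjusting the embedding leaves the expression matrix itself unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy log-normalized expression: 200 cells x 50 genes, two batches.
n_cells, n_genes = 200, 50
batch = np.repeat([0, 1], n_cells // 2)
expr = rng.normal(size=(n_cells, n_genes))
expr[batch == 1] += 1.5  # additive batch shift on every gene

expr_before = expr.copy()

# PCA via SVD on gene-centered data; keep the first 10 components.
centered = expr - expr.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
pcs = u[:, :10] * s[:10]

# Naive stand-in for the correction step (ASSUMPTION: this is NOT
# Harmony's method): subtract each batch's mean from the embedding.
pcs_corrected = pcs.copy()
for b in np.unique(batch):
    pcs_corrected[batch == b] -= pcs[batch == b].mean(axis=0)

def batch_gap(embedding):
    """Largest absolute difference between the two batch means."""
    diff = embedding[batch == 0].mean(axis=0) - embedding[batch == 1].mean(axis=0)
    return np.abs(diff).max()

batch_gap_before = batch_gap(pcs)
batch_gap_after = batch_gap(pcs_corrected)

# Only the embedding was modified; the expression matrix is untouched.
expr_unchanged = np.array_equal(expr, expr_before)
```

Clustering and UMAP would then be run on `pcs_corrected`, while any per-gene analysis still reads from `expr`.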

@ilyakorsunsky I would be happy to be a beta tester, so sorry I'm responding to this late!

cswoboda avatar Nov 08 '21 19:11 cswoboda

@cswoboda thanks for your quick answer! I am aware that Harmony only corrects the PCA embedding and does not produce a corrected matrix. I was just wondering why it was suggested to use the corrected matrix for DE / downstream analysis, as I thought this was not recommended anyway. For which downstream applications is the corrected expression matrix recommended?

JJBio avatar Nov 09 '21 11:11 JJBio

@JJBio Sorry, I may be a bit confused: are you referring to the review posted? In terms of what I'm aware of as standard practice within the field, the Seurat thread you posted is the right way to go. I think the review, as well as Seurat, implies that hypothetically, if you had a robust method to correct batch effects across your samples, that batch-corrected counts matrix would be best for DEG analysis. But such a method is not readily available within the field at the moment, so the data transformations required to perform integration in both Seurat and Harmony mean that the resulting corrected count matrices are not fair comparisons between datasets. SCTransform is an example of this: because of the way its normalization and scaling work within each dataset, it doesn't really produce comparable assays between objects. A corrected expression matrix would potentially be recommended for DEGs and other downstream applications if it was robustly transformed, more accurately captured the biology, etc. But that review mostly focuses on how well those methods perform batch correction for dimensionality reduction. Hope that makes sense and that I'm answering your question!
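
As a rough illustration of that practice, the sketch below tests genes on the uncorrected values, with group labels standing in for clusters found on the integrated embedding. It is an assumption-laden toy: the data are simulated, the per-gene Welch t-statistic is a simple stand-in and not what Seurat's DE functions compute, and every name in it is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy uncorrected log-normalized expression: 120 cells x 5 genes.
n_cells, n_genes = 120, 5
# Hypothetical cluster labels, e.g. from clustering the integrated embedding.
cluster = np.repeat([0, 1], n_cells // 2)
expr = rng.normal(loc=1.0, scale=0.5, size=(n_cells, n_genes))
expr[cluster == 1, 0] += 2.0  # gene 0 is truly higher in cluster 1

def welch_t(a, b):
    """Per-gene Welch t-statistic for two groups of cells (rows = cells)."""
    va = a.var(axis=0, ddof=1) / a.shape[0]
    vb = b.var(axis=0, ddof=1) / b.shape[0]
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(va + vb)

g1 = expr[cluster == 1]
g0 = expr[cluster == 0]

# Difference in mean log-expression and test statistic, per gene.
mean_diff = g1.mean(axis=0) - g0.mean(axis=0)
t_stat = welch_t(g1, g0)

# Index of the gene with the strongest evidence of a difference.
top_gene = int(np.argmax(np.abs(t_stat)))
```

The integration step only decides which cells end up in which group; the test itself never sees a corrected value.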

cswoboda avatar Nov 09 '21 14:11 cswoboda

Thanks for your helpful answer!

JJBio avatar Nov 16 '21 13:11 JJBio

Dear Ilya, Is it possible to consider me as a beta tester too?

kayvanshabani avatar Jun 13 '22 15:06 kayvanshabani

Any updates on this issue? I'd also like to get batch-corrected expression values but don't see anything in the devel branch.

cstubben avatar Oct 07 '22 18:10 cstubben

Same!

deepikadilip avatar Apr 25 '23 20:04 deepikadilip

@ilyakorsunsky Hi, I am wondering if there is any update on this issue. We would like to use data integrated by Harmony, but we currently have no way to do that. Thank you.

farshadf avatar Apr 27 '23 18:04 farshadf

I think scAlign is our best bet: https://rnabioco.github.io/cellar/previous/2019/docs/6_alignment.html

deepikadilip avatar Apr 27 '23 19:04 deepikadilip