Corrected differential gene expression matrix
I've been using Harmony for batch effect integration, but I would like to be able to continue downstream analysis in Seurat using the corrected data. I read this review, which rates Harmony's batch effect correction quite highly, but the authors were not able to use it for differentially expressed gene analysis. Does Harmony calculate a corrected expression matrix? Is it possible to calculate it ourselves?
Hi, I'm not one of the developers, but Harmony does not calculate a corrected expression matrix for genes. You can try the scMerge package, as recommended in that review!
Dear @nshedd and @cswoboda,
This is an active problem that we are pursuing and hope to have something out in the next version of Harmony. If you're interested, please let me know if you'd like to be a beta-tester!
Best, Ilya
Hi, I was wondering about this too. I thought it was actually recommended to use the non-corrected values for DE analysis (https://github.com/satijalab/seurat/discussions/4000). Could you advise on this?
@JJBio Harmony corrects the PCA embeddings in order to perform integration. It only uses them to produce clusters and the integrated UMAP, so the underlying gene expression values in the RNA and SCT assays remain uncorrected.
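To illustrate the point above, here is a minimal Python sketch using only NumPy. It is not Harmony's algorithm: the per-batch centering in PC space is a deliberately simple stand-in, and the toy data and all variable names are assumptions made for illustration. It only shows that a correction applied to the embedding leaves the expression matrix itself untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy log-normalized expression: 6 cells x 4 genes, two batches of 3 cells.
expr = rng.normal(size=(6, 4))
batch = np.array([0, 0, 0, 1, 1, 1])
expr[batch == 1] += 2.0  # artificial batch shift on every gene

expr_before = expr.copy()

# PCA via SVD on the gene-centered matrix; keep the first 2 components.
centered = expr - expr.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
embedding = u[:, :2] * s[:2]  # cells x PCs

# Stand-in correction (NOT Harmony's method): centre each batch in PC space.
corrected = embedding.copy()
for b in np.unique(batch):
    corrected[batch == b] -= corrected[batch == b].mean(axis=0)

# The embedding has changed; the expression matrix has not.
embedding_changed = not np.allclose(embedding, corrected)
expression_unchanged = np.array_equal(expr, expr_before)
```

Clustering and UMAP would be run on `corrected`, while any gene-level analysis would still read from `expr`.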
@ilyakorsunsky I would be happy to be a beta tester, so sorry I'm responding to this late!
@cswoboda thanks for your quick answer! I am aware that Harmony only corrects the PCA embeddings and does not produce a corrected matrix. I was just wondering why using the corrected matrix for DE / downstream analysis was mentioned, as I thought this was not recommended anyway. For which downstream applications is the corrected expression matrix recommended?
@JJBio Sorry, I may be a bit confused: are you referring to the review posted above? As far as I'm aware of standard practice in the field, the Seurat thread you posted is the right way to go. I think the review, as well as Seurat, implies that if you hypothetically had a robust method to correct batch effects across your samples, that batch-corrected counts matrix would be best for DEG analysis. But such a method isn't readily available at the moment, and the data transformations required for integration in both Seurat and Harmony mean the corrected count matrices do not give fair comparisons between datasets. SCTransform is an example of this: because its normalization and scaling work within each dataset, it doesn't really produce comparable assays between objects. A corrected expression matrix would potentially be recommended for DEGs and other downstream applications if it were robustly transformed, more accurately captured the biology, etc. But that review mostly focuses on how well those methods correct batch effects for dimensionality reduction. Hope that makes sense and that I'm answering your question!
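For readers who want to see what testing on uncorrected values can look like, here is a hedged Python sketch. It assumes one particular approach that the thread itself does not prescribe: an ordinary least squares model on a single gene with batch included as a covariate. The simulated data, effect sizes, and variable names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 200
batch = rng.integers(0, 2, size=n)  # 0/1 batch label per cell
group = rng.integers(0, 2, size=n)  # 0/1 condition or cluster label

# Uncorrected expression of one gene: simulated group effect of 1.0,
# batch shift of 3.0, plus Gaussian noise.
y = 1.0 * group + 3.0 * batch + rng.normal(scale=0.5, size=n)

# Design matrix with columns: intercept, group, batch.
X = np.column_stack([np.ones(n), group, batch])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

group_effect = coef[1]  # group difference adjusted for batch
batch_effect = coef[2]  # batch shift absorbed by the covariate
```

The expression values are never modified here; the batch shift is absorbed by its own coefficient, so `group_effect` estimates the group difference on the original scale.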
Thanks for your helpful answer!
Dear Ilya, would it be possible to consider me as a beta tester too?
Any updates on this issue? I'd also like to get batch-corrected expression values but don't see anything in the devel branch.
Same!
@ilyakorsunsky Hi, I am wondering if there is any update on this issue. We would like to use data integrated by Harmony, and we currently have no way to do that. Thank you.
I think scAlign is our best bet: https://rnabioco.github.io/cellar/previous/2019/docs/6_alignment.html