
save macs2 fold enrichment signal value in summits

Open badoi opened this issue 2 years ago • 5 comments

Currently, the narrowPeak signal value at the summit isn't saved, and this quick fix allows this metric to be added to the replicate summits .rds file for downstream uses.

badoi avatar Jun 25 '22 17:06 badoi

I finished testing this version of the peak calling + reproducible peak clustering, and it completed without any errors. The change shouldn't alter the underlying peak calling: it grabs the replicate summits from the macs2 narrowPeakFile and stores the extra signalValue column.
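To illustrate the idea, here is a minimal sketch in Python (not the actual ArchR patch, which is written in R) of deriving summits from narrowPeak records while keeping the signalValue. It assumes the standard 10-column ENCODE narrowPeak layout, where column 7 is signalValue and column 10 is the summit offset from the peak start. The `Summit` type, the function name, and the example records are hypothetical and made up for illustration.

```python
from typing import Iterable, List, NamedTuple


class Summit(NamedTuple):
    chrom: str
    pos: int             # 0-based summit position (peak start + offset)
    name: str
    score: int           # narrowPeak column 5
    signal_value: float  # narrowPeak column 7


def summits_from_narrowpeak(lines: Iterable[str]) -> List[Summit]:
    """Turn narrowPeak lines into summit records that keep signalValue."""
    summits = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        offset = int(fields[9])
        if offset < 0:
            # The format uses -1 when no summit was called for the peak.
            continue
        summits.append(
            Summit(
                chrom=fields[0],
                pos=int(fields[1]) + offset,
                name=fields[3],
                score=int(fields[4]),
                signal_value=float(fields[6]),
            )
        )
    return summits


# Made-up records, for illustration only.
example = [
    "chr1\t100\t300\tpeak_1\t250\t.\t8.5\t30.1\t25.0\t120",
    "chr1\t500\t700\tpeak_2\t80\t.\t3.2\t10.4\t8.0\t60",
]
SUMMITS = summits_from_narrowpeak(example)
```

Because the summit coordinates come from the same narrowPeak records that macs2 already wrote, carrying the extra column along this way leaves the peak calls themselves unchanged.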

badoi avatar Jun 26 '22 17:06 badoi

Thanks for this suggestion. I made your commits on a different branch, dev_narrowPeak, which branches from release_1.0.2 instead of master, just to maintain consistency. The one thing I wasn't able to confirm in the MACS2 docs is that the summits score is 1/10th of the narrowPeak score, but I will test this on the tutorial data to confirm.

rcorces avatar Jun 30 '22 13:06 rcorces

That makes sense. The MACS documentation (https://github.com/macs3-project/MACS/blob/master/docs/callpeak.md#output-files) says they should be the same, but in my files the scores in summits.bed and peaks.narrowPeak differ by a factor of 10. Yes, please check, and thanks for folding this into the newest update of ArchR when it comes out.
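One way to check this on any pair of output files is to match peaks by name and look at the ratio of the two score columns. The sketch below is in Python and assumes that both files are tab-separated with the peak name in column 4 and the score in column 5, and that the names match between the two files. The function name and the example lines are hypothetical and made up for illustration, not real macs2 output.

```python
from typing import Dict, Iterable


def score_ratios(narrowpeak_lines: Iterable[str],
                 summit_lines: Iterable[str]) -> Dict[str, float]:
    """Return narrowPeak score / summits.bed score for each shared peak name."""
    narrowpeak_scores = {}
    for line in narrowpeak_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        narrowpeak_scores[fields[3]] = float(fields[4])

    ratios = {}
    for line in summit_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        name, summit_score = fields[3], float(fields[4])
        # Only compare peaks present in both files; avoid dividing by zero.
        if name in narrowpeak_scores and summit_score != 0:
            ratios[name] = narrowpeak_scores[name] / summit_score
    return ratios


# Made-up lines, for illustration only.
narrowpeak_example = [
    "chr1\t100\t300\tpeak_1\t250\t.\t8.5\t30.1\t25.0\t120",
    "chr1\t500\t700\tpeak_2\t80\t.\t3.2\t10.4\t8.0\t60",
]
summits_example = [
    "chr1\t220\t221\tpeak_1\t25.0",
    "chr1\t560\t561\tpeak_2\t8.0",
]
RATIOS = score_ratios(narrowpeak_example, summits_example)
```

If the ratios cluster tightly around a single value, the two columns differ only by a constant factor; a wide spread would point to a rounding or precision difference instead.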

badoi avatar Jun 30 '22 14:06 badoi

@badoi - The scores aren't precisely the same in my hands (same approximate values, different decimal precision), but that doesn't appear to affect the downstream reproducible peak set in any noticeable way. Can you describe the downstream uses that this change enables, just so that I can contextualize it?

rcorces avatar Jul 06 '22 21:07 rcorces

Thanks Ryan! I'm working on identifying which metrics of open chromatin are comparable across bulk ATAC, pseudo-bulk ATAC, and single-cell aggregated peak accessibility, especially for the ML sequence prediction models. The peakSignalValue is locally normalized, which might make it better than the aggregated peak accessibility matrix, so I'm adding the patch to see if that's true now that I have those metrics from pseudo-bulk peak calls. The typical user might not even notice this metric is there.

badoi avatar Jul 07 '22 02:07 badoi