ArchR
save macs2 fold enrichment signal value in summits
Currently, the narrowPeak signal value at the summit isn't saved. This quick fix adds that metric to the replicate summits .rds file for downstream use.
I tried this version of the peak calling and reproducible peak clustering, and it completed without any errors. The change shouldn't alter the underlying peak calling: it only grabs the replicate summits from the macs2 narrowPeakFile and stores the extra signalValue column.
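To illustrate the idea outside of ArchR, here is a minimal Python sketch of deriving summit records from narrowPeak lines while keeping signalValue. It is not the code in this PR. It assumes the standard 10-column ENCODE narrowPeak layout, where the last column is the summit offset from the peak start, and it assumes each summit is written as a 1-bp interval. The function name `summits_with_signal` and the example lines are made up for illustration.

```python
import io

# Assumed column order: the standard 10-field ENCODE narrowPeak layout.
NARROWPEAK_COLUMNS = [
    "chrom", "start", "end", "name", "score",
    "strand", "signalValue", "pValue", "qValue", "peak",
]


def summits_with_signal(handle):
    """Yield one summit record per narrowPeak line, keeping signalValue."""
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and optional track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        rec = dict(zip(NARROWPEAK_COLUMNS, line.split("\t")))
        offset = int(rec["peak"])  # summit offset from start; -1 means no summit
        if offset < 0:
            continue
        summit = int(rec["start"]) + offset
        yield {
            "chrom": rec["chrom"],
            "start": summit,
            "end": summit + 1,  # assumption: 1-bp summit interval
            "name": rec["name"],
            "score": float(rec["score"]),
            "signalValue": float(rec["signalValue"]),
        }


# Hypothetical example lines, not taken from real MACS2 output.
example = io.StringIO(
    "chr1\t100\t400\tpeak_1\t250\t.\t8.5\t30.1\t25.0\t120\n"
    "chr1\t900\t1200\tpeak_2\t90\t.\t4.2\t12.3\t9.0\t75\n"
)
summits = list(summits_with_signal(example))
for s in summits:
    print(s["chrom"], s["start"], s["end"], s["name"], s["signalValue"])
```

Because the summit position is computed from the narrowPeak line itself, the peak coordinates are unchanged and the only new information carried along is the extra column.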
Thanks for this suggestion.
I made your commits on a different branch, dev_narrowPeak, which branches from release_1.0.2 instead of master, just to maintain consistency. The one thing I wasn't able to confirm in the MACS2 docs is that the summits score is 1/10th of the narrowPeak score, but I will test this on the tutorial data to confirm.
That makes sense. The MACS docs (https://github.com/macs3-project/MACS/blob/master/docs/callpeak.md#output-files) say they should be the same, but in my files the scores in summits.bed and peaks.narrowPeak are off by a factor of 10. Yes, please check, and thanks for folding this into the newest update of ArchR when it comes out.
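One way to check the factor-of-10 observation is to match peaks by name and compute the ratio of the two scores. The Python sketch below assumes the scores have already been read into dictionaries keyed by peak name. The helper name `score_ratios` and the numbers are hypothetical and are not values from my files.

```python
def score_ratios(narrowpeak_scores, summit_scores):
    """Return the narrowPeak/summit score ratio for each shared peak name."""
    ratios = {}
    for name, np_score in narrowpeak_scores.items():
        summit_score = summit_scores.get(name)
        if summit_score:  # skip peaks that are missing or have a zero score
            ratios[name] = np_score / summit_score
    return ratios


# Made-up scores for illustration only.
narrowpeak_scores = {"peak_1": 250.0, "peak_2": 90.0}
summit_scores = {"peak_1": 25.04, "peak_2": 9.01}

ratios = score_ratios(narrowpeak_scores, summit_scores)
for name, ratio in sorted(ratios.items()):
    print(name, round(ratio, 2))
```

If the ratios cluster near 10 rather than exactly 10, that would be consistent with one file storing a rounded, rescaled version of the other.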
@badoi - The scores aren't precisely the same in my hands (same approximate values, different decimal precision). But that doesn't appear to affect the downstream reproducible peak set in any noticeable way. Can you describe the downstream uses that this change enables, just so that I can contextualize it?
Thanks Ryan! I'm working on identifying which metrics of open chromatin are comparable across bulk ATAC, pseudo-bulk ATAC, and single-cell aggregated peak accessibility, especially for the ML sequence prediction models. The peakSignalValue is locally normalized, which might make it better than the aggregated peak accessibility matrix, so I'm adding the patch to see if that's true now that I have those metrics from pseudo-bulk peak calls. The typical user might not even notice this metric is there.