
Q: Are 2.1.2.1 or 2.1.1.20160309 results better? Some differences noted

Open RichardCorbett opened this issue 2 years ago • 1 comments

Hi folks,

We had trouble processing some data with 2.1.1.20160309, so we ran it with 2.1.2.1 instead. In another test we are seeing what look like significant differences in results between the two versions, and we are looking for guidance about which set of results should be reported.

We have samples with paired-end reads that we processed with both versions. Here are examples of the commands used:

#2.1.1.20160309
macs2 callpeak -t A91514_2_lanes_dupsFlagged.bam -c A91520_2_lanes_dupsFlagged.bam --gsize hs -f BAMPE --name A91514_H3K4me1 --outdir out1  --broad --bdg

#2.1.2.1
macs2 callpeak -t A91514_2_lanes_dupsFlagged.bam -c A91520_2_lanes_dupsFlagged.bam --gsize hs -f BAMPE --name A91514_H3K4me1 --outdir out2  --broad --bdg

Although the peaks in A91514_H3K4me1_peaks.xls cover similar fractions of the genome in both runs, the peaks themselves differ considerably:

metric                               2.1.1.20160309   2.1.2.1
total bases in peaks                 404096581        381773429
total peaks                          238101           196949
peak bases unique to dataset         35769857         13487857
peaks completely unique to dataset   30261            468
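
For anyone wanting to reproduce this kind of comparison, here is a minimal Python sketch of how metrics like those above could be computed from two peak files. It is not necessarily how the numbers in the table were produced. It assumes BED-style, tab-separated input with 0-based half-open coordinates (for example the broadPeak files), and the function names read_peaks, merge, and compare are illustrative only.

```python
import bisect
from collections import defaultdict


def read_peaks(lines):
    """Parse BED-like lines (chrom, start, end, ...) into {chrom: [(start, end), ...]}."""
    peaks = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and common header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        peaks[chrom].append((int(start), int(end)))
    return peaks


def merge(intervals):
    """Merge overlapping or touching half-open intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def compare(a, b):
    """Summarise peak set `a` and how much of it is absent from peak set `b`."""
    stats = {"total_bases": 0, "total_peaks": 0, "unique_bases": 0, "unique_peaks": 0}
    for chrom, intervals in a.items():
        merged_a = merge(intervals)
        merged_b = merge(b.get(chrom, []))
        ends_b = [end for _, end in merged_b]  # sorted, because merged_b is disjoint

        stats["total_peaks"] += len(intervals)
        stats["total_bases"] += sum(end - start for start, end in merged_a)

        # Bases in `a` that no peak in `b` covers.
        for start, end in merged_a:
            i = bisect.bisect_right(ends_b, start)  # first b interval ending after start
            covered = 0
            while i < len(merged_b) and merged_b[i][0] < end:
                covered += min(end, merged_b[i][1]) - max(start, merged_b[i][0])
                i += 1
            stats["unique_bases"] += (end - start) - covered

        # Peaks in `a` that overlap no peak in `b` at all.
        for start, end in intervals:
            i = bisect.bisect_right(ends_b, start)
            if i >= len(merged_b) or merged_b[i][0] >= end:
                stats["unique_peaks"] += 1
    return stats
```

To use it, pass the lines of each peak file to read_peaks (for example via open(path)), then call compare(old, new) and compare(new, old) to get one column of the table each. Note that the _peaks.xls files carry a header and use 1-based start coordinates, so they would need converting before being fed to a sketch like this.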

First off, does it make sense to compare results this way? If so, is there a reason to trust one set over the other? Which would you use?

RichardCorbett avatar Jan 11 '23 22:01 RichardCorbett

Here's an IGV screenshot of some of the differences in peaks between versions.

igv_snapshot_A84624_comparison

The top BAM track is the control and the second contains the ChIP reads. The four BED tracks contain either the peaks called by each version or the result of subtracting one set of peaks from the other.

RichardCorbett avatar Jan 12 '23 17:01 RichardCorbett