deepTools
multiBamSummary bin coordinate shift
Hi, I'm using multiBamSummary (deepTools 3.3.1) to count the number of reads in each 1 kb window across my samples.
My code:
multiBamSummary bins \
  --bamfiles /public1/home/samples/*bam \
  --binSize 1000 -p 4 \
  -out readCounts.npz --outRawCounts readCounts.tab --scalingFactors scaleFactor.tab
The bin coordinates in readCounts.tab look like this:
chr2L 59000 60000
chr2L 60000 61000
chr2L 61000 62000
chr2L 62901 63901
chr2L 63901 64901
Why do the bin coordinates shift after 62000? Is there a way to keep the bin coordinates consecutive? Thanks.
That seems quite odd, I'll have to see if I can reproduce this. Can you post the BAM files somewhere? That will help me in tracking this down.
@dpryan79 This folder contains 28 BAM files, so it would be hard to upload all of them. Let me see if I can reproduce this with fewer files. I'll let you know later. Thanks.
Hi,
I have the same problem with a few BAM files. In my case I can reproduce the error with a single BAM file.
multiBamSummary version: 3.5.1
command: multiBamSummary bins -b INPUT.bam --labels LABEL -o readCounts_OUTPUT.npz --outRawCounts readCounts_OUTPUT.tab -p 8
(The same shift is observed with the --centerReads option on or off.)
After sorting, if I search for the positions where the shift happens, this is what I get:
1 154540000 154550000 134.0
1 154550000 154552331 32.0
1 154552331 154562331 66.0
1 154562331 154572331 58.0
--
2 154540000 154550000 92.0
2 154550000 154552331 17.0
2 154552331 154562331 142.0
2 154562331 154572331 75.0
--
3 154540000 154550000 288.0
3 154550000 154552331 8.0
3 154552331 154562331 99.0
3 154562331 154572331 164.0
--
4 154540000 154550000 139.0
4 154550000 154552331 21.0
4 154552331 154562331 162.0
4 154562331 154572331 308.0
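For anyone who wants to locate shifted bins like these automatically, here is a minimal Python sketch. It is not part of deepTools. The function names (`parse_bins`, `off_grid_bins`, `truncated_bins`) are made up for illustration, and it assumes the raw counts file has chromosome, start and end in the first three whitespace-separated columns, with header lines starting with `#`. The bin size of 10000 is taken from the rows shown above.

```python
from typing import Iterable, List, Tuple

Bin = Tuple[str, int, int]  # (chromosome, start, end)


def parse_bins(lines: Iterable[str]) -> List[Bin]:
    """Read chrom/start/end from each data line, skipping blanks and '#' headers."""
    bins = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        bins.append((fields[0], int(fields[1]), int(fields[2])))
    return bins


def off_grid_bins(bins: Iterable[Bin], bin_size: int) -> List[Bin]:
    """Bins whose start coordinate is not a multiple of bin_size."""
    return [b for b in bins if b[1] % bin_size != 0]


def truncated_bins(bins: Iterable[Bin], bin_size: int) -> List[Bin]:
    """Bins whose length differs from bin_size.

    The last bin of a chromosome can be legitimately shorter, so hits
    in the middle of a chromosome are the suspicious ones.
    """
    return [b for b in bins if b[2] - b[1] != bin_size]


if __name__ == "__main__":
    # Rows copied from the output above (chromosome 1).
    rows = [
        "1 154540000 154550000 134.0",
        "1 154550000 154552331 32.0",
        "1 154552331 154562331 66.0",
        "1 154562331 154572331 58.0",
    ]
    bins = parse_bins(rows)
    print("off grid: ", off_grid_bins(bins, 10000))
    print("truncated:", truncated_bins(bins, 10000))
```

To check a real file, the same functions can be fed an open file handle, e.g. `parse_bins(open("readCounts_OUTPUT.tab"))`.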
I tried subsampling the bam (tried over 500 different 10% subsamples) and was not able to reproduce it. I also tried extracting a single chromosome from the bam but then the shift disappears.
The only other hint I have is that if I remove the reads without an equals sign in them (grep '\t=\t') and then run multiBamSummary, I get a different shift:
1 162130000 162140000 102.0
1 162140000 162143985 93.0
1 162143985 162153985 103.0
1 162153985 162163985 122.0
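For reference, the grep filter described above can be approximated in Python on SAM text. This is only a sketch under assumptions: it treats the equals sign as the RNEXT field (column 7 in the SAM format, where `=` means the mate maps to the same reference as the read), and it keeps header lines, which a plain grep for '\t=\t' would drop. The function name `keep_same_reference_mates` is hypothetical.

```python
from typing import Iterable, Iterator


def keep_same_reference_mates(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield SAM header lines plus alignments whose RNEXT field is '='."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header line: keep it so the result is still a valid SAM stream.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # RNEXT is the 7th mandatory field (index 6).
        if len(fields) >= 7 and fields[6] == "=":
            yield line
```

Unlike the grep, this checks only column 7, so an `=` surrounded by tabs in any other column would not count as a match here.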
Note that if I do the binning with bamCoverage instead, there is no shift.