deepTools

multiBamSummary bin coordinate shift

Open xinl22 opened this issue 4 years ago • 3 comments

Hi, I'm using multiBamSummary (deepTools 3.3.1) to count the number of reads in each 1 kb window across my samples.

My code:
multiBamSummary bins \
--bamfiles /public1/home/samples/*bam \
--binSize 1000 -p 4 \
-out readCounts.npz --outRawCounts readCounts.tab --scalingFactors scaleFactor.tab

The bin coordinates in readCounts.tab look like this:
chr2L 59000 60000
chr2L 60000 61000
chr2L 61000 62000
chr2L 62901 63901
chr2L 63901 64901

Why do the bin coordinates shift after 62000? Is there a way to keep the bin coordinates consecutive? Thanks
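For anyone who wants to check their own output for the same symptom, here is a minimal sketch that reports every place where a bin does not start at the end of the previous bin on the same chromosome. The helper names `find_bin_gaps` and `read_bins` are hypothetical and not part of deepTools. `read_bins` assumes the tab file has chromosome, start and end in its first three columns and that any header line starts with `#`.

```python
def find_bin_gaps(rows):
    """Return (chrom, previous_end, next_start) for every non-consecutive bin pair.

    rows: iterable of (chrom, start, end) tuples, ordered by chrom and start.
    """
    gaps = []
    prev_chrom, prev_end = None, None
    for chrom, start, end in rows:
        # Only compare bins that sit on the same chromosome.
        if chrom == prev_chrom and start != prev_end:
            gaps.append((chrom, prev_end, start))
        prev_chrom, prev_end = chrom, end
    return gaps


def read_bins(path):
    """Read (chrom, start, end) from a tab-separated counts file.

    Assumption: lines starting with '#' are headers, and the first three
    columns are chromosome, start and end.
    """
    bins = []
    with open(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            chrom, start, end = line.rstrip("\n").split("\t")[:3]
            bins.append((chrom, int(start), int(end)))
    # Sort so that bins from the same chromosome are adjacent and ordered.
    bins.sort(key=lambda b: (b[0], b[1]))
    return bins


# The rows quoted in this post.
example = [
    ("chr2L", 59000, 60000),
    ("chr2L", 60000, 61000),
    ("chr2L", 61000, 62000),
    ("chr2L", 62901, 63901),
    ("chr2L", 63901, 64901),
]
print(find_bin_gaps(example))  # [('chr2L', 62000, 62901)]
```

On a real file this would be called as `find_bin_gaps(read_bins("readCounts.tab"))`. The sort inside `read_bins` matters because rows in the output are not guaranteed to be in genomic order.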

xinl22 • Dec 17 '20 07:12

That seems quite odd, I'll have to see if I can reproduce this. Can you post the BAM files somewhere? That will help me in tracking this down.

dpryan79 • Dec 18 '20 09:12

@dpryan79 This folder contains 28 BAM files, so it would be hard to upload all of them. Let me see if I can reproduce this with fewer files. I'll let you know later. Thanks.

xinl22 • Dec 19 '20 18:12

Hi, I have the same problem with only a few BAM files. In my case I can reproduce it with a single BAM file.
multiBamSummary version: 3.5.1
command: multiBamSummary bins -b INPUT.bam --labels LABEL -o readCounts_OUTPUT.npz --outRawCounts readCounts_OUTPUT.tab -p 8
(the same shift is observed with the --centerReads option on or off)

After sorting the output and searching for the positions where the shift happens, this is what I get:

1       154540000       154550000       134.0
1       154550000       154552331       32.0
1       154552331       154562331       66.0
1       154562331       154572331       58.0
--
2       154540000       154550000       92.0
2       154550000       154552331       17.0
2       154552331       154562331       142.0
2       154562331       154572331       75.0
--
3       154540000       154550000       288.0
3       154550000       154552331       8.0
3       154552331       154562331       99.0
3       154562331       154572331       164.0
--
4       154540000       154550000       139.0
4       154550000       154552331       21.0
4       154552331       154562331       162.0
4       154562331       154572331       308.0

I tried subsampling the BAM (over 500 different 10% subsamples) and was not able to reproduce it. I also tried extracting a single chromosome from the BAM, but then the shift disappears.

The only other hint I have is that if I remove the reads without an equal sign (grep '\t=\t') and then run multiBamSummary, I get a different shift:

1       162130000       162140000       102.0
1       162140000       162143985       93.0
1       162143985       162153985       103.0
1       162153985       162163985       122.0
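For reference, one possible reading of the equal-sign filter is "keep only records whose RNEXT field (column 7 of a SAM record) is `=`", meaning the mate maps to the same reference sequence. The sketch below implements that reading on SAM text lines. The function name `keep_same_reference_mates` is hypothetical, and passing header lines through is an assumption about how the filtered file was built, not something stated above.

```python
def keep_same_reference_mates(sam_lines):
    """Yield SAM header lines plus records whose RNEXT field (column 7) is '='."""
    for line in sam_lines:
        if line.startswith("@"):
            # Assumption: headers are kept so the output is still valid SAM.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 7 and fields[6] == "=":
            yield line


# Toy records for illustration only; these are not taken from the real data.
toy_sam = [
    "@SQ\tSN:1\tLN:200000000\n",
    "r1\t99\t1\t100\t60\t50M\t=\t300\t250\t*\t*\n",
    "r2\t65\t1\t100\t60\t50M\t2\t500\t0\t*\t*\n",
]
filtered = list(keep_same_reference_mates(toy_sam))
print(len(filtered))  # 2: the header line and r1
```

Unlike a plain text match, this only looks at column 7, so an equal sign appearing in another field would not keep a record.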

Note that if I do the binning with bamCoverage instead, there is no shift.

fransua • Oct 28 '21 09:10