ClipBam not clipping overlapping reads

Open blackbeerd opened this issue 2 years ago • 4 comments

Hi fgbio,

I am running ClipBam at the end of my workflow to clip overlapping reads (command and screenshot below), but when I look at the resulting BAM in IGV, I still see overlapping read pairs. Am I using it wrong? When I run this file through bamUtil clipOverlap, it seems to work correctly. Any thoughts?

java -Xmx1g -jar fgbio.jar ClipBam --clipping-mode=Hard --clip-overlapping-reads=T --input=my.bam --output=my.clipped.bam --auto-clip-attributes=True --ref=hg38.fasta --metrics=my.clipped.metrics.txt

Top = ClipBam output; bottom = bamUtil output (the first two reads with the "A" alt are paired)

image

blackbeerd avatar Mar 10 '22 00:03 blackbeerd

@blackbeerd It's hard to tell from that picture. Assuming that those reads are mate pairs, then yes this sounds like a bug. Would you be able to attach two BAMs, each with just a single read-pair, before and after running ClipBam, that exhibit the problem please?

tfenne avatar Mar 10 '22 12:03 tfenne

@tfenne Here are those reads pre and post ClipBam (these are uncompressed sams - wouldn't let me upload bams). Let me know if you need a different format. Thanks for taking a look!

pre-ClipBam.txt post-ClipBam.txt

blackbeerd avatar Mar 10 '22 19:03 blackbeerd

@blackbeerd The input reads are both mapped to the reverse strand, so unfortunately these are not FR pairs.

ClipBam says:

Clipping overlapping reads is only performed on FR read pairs

I think there could be a discussion about whether we want to loosen the FR orientation requirement for ClipBam.
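
For anyone who wants to check their own reads, here is a rough sketch of how a pair's orientation can be worked out from the SAM fields. It is a simplified illustration and not fgbio's actual implementation; the function name `pair_orientation` and its arguments are made up for this example. It assumes both reads are mapped to the same reference, uses the standard SAM flag bits 0x10 (read on reverse strand) and 0x20 (mate on reverse strand), and takes 1-based inclusive coordinates.

```python
READ_REVERSE = 0x10  # SAM flag bit: this read is mapped to the reverse strand
MATE_REVERSE = 0x20  # SAM flag bit: the mate is mapped to the reverse strand


def pair_orientation(flag, start, end, mate_start, tlen):
    """Classify a mapped pair as "FR", "RF", or "TANDEM" (FF or RR).

    Hypothetical helper for illustration only. Arguments come from one
    SAM record: FLAG, POS, alignment end, PNEXT, and TLEN.
    """
    read_rev = bool(flag & READ_REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)

    # Both reads on the same strand (FF or RR): not an FR pair.
    if read_rev == mate_rev:
        return "TANDEM"

    if read_rev:
        # Mate is the forward read; this read's 5' end is its alignment end.
        forward_5p = mate_start
        reverse_5p = end
    else:
        # This read is the forward read; the reverse mate's 5' end is
        # approximated from the template length.
        forward_5p = start
        reverse_5p = start + tlen

    # FR means the forward read's 5' end lies to the left of the reverse read's.
    return "FR" if forward_5p < reverse_5p else "RF"


# A typical proper pair (flags 99 and 147) is FR.
print(pair_orientation(99, 100, 199, 150, 150))
# Both reads on the reverse strand (flag 113), as in the reads above.
print(pair_orientation(113, 100, 199, 150, 150))
```

With a check like this, the reads attached above would come out as "TANDEM" rather than "FR", which is why the overlap clipping skips them.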

nh13 avatar Mar 10 '22 19:03 nh13

Ahhh - thanks! I thought I had read through the tool description, but I must have missed that. What's the thinking behind only clipping FR pairs? Why not clip these RR overlaps as well?

blackbeerd avatar Mar 10 '22 19:03 blackbeerd