
Behavior relative to picard?

Open: caleblareau opened this issue 6 years ago • 1 comment

Hi, I tried running gencore and Picard separately on the same paired-end sequencing run. The gencore command was:

gencore -i mix_2lines_PE.bam -o gencore_out.bam -r $fasta -s 1

The input BAM file is 738 MB. The output BAM from Picard is 395 MB, whereas the output from gencore is only 66 MB. The data come from a relatively low-complexity targeted sequencing sample, but this head-to-head test suggests that gencore is filtering far more than Picard. Indeed, when I count reads, the Picard BAM has about 6 times as many as the gencore output BAM. Any help would be greatly appreciated.

caleblareau, Nov 05 '19 18:11

Could you post the Picard command you ran? Is it possible that Picard only marked the duplicates and did not remove them? gencore drops the duplicate reads outright, so of course the BAM sizes are different.

zhujiaqi2014, Nov 28 '19 06:11
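One way to check the suggestion above is to count how many records in the Picard output carry the SAM duplicate flag (0x400). If Picard only marked duplicates, the non-duplicate count is the number to compare against gencore's output rather than the total. The following is a minimal sketch, not part of gencore or Picard: the function names, the `picard_out.bam` path mentioned afterwards, and the demo records are all made up for illustration, and `count_bam` assumes `samtools` is installed and on the PATH.

```python
import subprocess

DUP_FLAG = 0x400  # SAM FLAG bit meaning "PCR or optical duplicate"


def count_sam_records(lines):
    """Count total and duplicate-flagged records in SAM-formatted text lines."""
    total = 0
    duplicates = 0
    for line in lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        total += 1
        if flag & DUP_FLAG:
            duplicates += 1
    return {
        "total": total,
        "duplicates": duplicates,
        "non_duplicates": total - duplicates,
    }


def count_bam(path):
    """Hypothetical helper: stream a BAM through `samtools view` and count records."""
    proc = subprocess.Popen(
        ["samtools", "view", path], stdout=subprocess.PIPE, text=True
    )
    try:
        return count_sam_records(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()


# Made-up demo records: one unmarked pair and one pair flagged as duplicate.
demo = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
    "r1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tACGT\tIIII",
    "r2\t1123\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
    "r2\t1171\tchr1\t200\t60\t4M\t=\t100\t-104\tACGT\tIIII",
]
print(count_sam_records(demo))
```

Calling `count_bam("picard_out.bam")` and `count_bam("gencore_out.bam")` on the real files would give the two sets of counts to compare. A large `duplicates` count in the Picard file would be consistent with duplicates having been marked but kept.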