
Minor size differences in output bam file

Open chasingdex opened this issue 2 years ago • 1 comment

Hi @FelixKrueger, I have three BAM outputs generated from the same set of trimmed paired-end inputs, using identical parameters and identical versions of Bismark and Bowtie 2. The only thing that varied was the parallel setting: 8, 12 and 16. The total number of reads aligned to the genome is the same in all three alignment reports.

However, the BAM files differ in size (see below). What could cause these size differences?

  • Parallel setting; BAM size
  • 16; 86370183058
  • 12; 86250282377
  • 8; 86111223110

chasingdex, Dec 06 '22 23:12

This is probably a question better directed towards some compression aficionados - data simply compresses slightly differently depending on how you do it, or in your case, on which reads come first. I have seen size differences of nearly 2-fold with (read name) sorted BAM files for the same information content...

FelixKrueger, Dec 07 '22 00:12
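
To illustrate the point about record order, here is a minimal sketch using only the Python standard library. It does not touch real BAM files or Bismark output: the records are synthetic, and `make_records`, `compressed_size` and `content_digest` are hypothetical helper names invented for this example. BAM data is stored in DEFLATE-compressed (BGZF) blocks, so plain `zlib` is used here as a rough stand-in. The idea is simply that the same set of records, written in a different order, can compress to a different size while an order-independent digest of the content stays the same.

```python
import hashlib
import random
import zlib


def make_records(n, seed=0):
    """Build synthetic tab-separated records (made-up data, not real alignments)."""
    rng = random.Random(seed)
    records = []
    for i in range(n):
        chrom = f"chr{rng.randint(1, 5)}"
        pos = rng.randint(1, 1_000_000)
        seq = "".join(rng.choice("ACGT") for _ in range(50))
        records.append(f"read{i}\t{chrom}\t{pos}\t{seq}\n")
    return records


def compressed_size(records):
    """Size in bytes of the records after DEFLATE compression, in the given order."""
    return len(zlib.compress("".join(records).encode("ascii"), 6))


def content_digest(records):
    """Order-independent digest: hash the records after sorting them."""
    h = hashlib.sha256()
    for rec in sorted(records):
        h.update(rec.encode("ascii"))
    return h.hexdigest()


records = make_records(20000)

# Same records, different order (assumption: this mimics output whose
# read order changes between runs).
shuffled = records[:]
random.Random(1).shuffle(shuffled)

for label, recs in [("original order", records), ("shuffled order", shuffled)]:
    print(label, compressed_size(recs), content_digest(recs)[:12])
```

The two printed digests should match while the compressed sizes differ. If the alignment reports agree, as they do in the question, a size difference of this kind is not by itself a sign that the outputs contain different alignments.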