high number of unmapped reads in bonito bam output

Open AlineMuyle opened this issue 1 year ago • 0 comments

Hi, I am running Bonito with the following command line to study 5mC DNA methylation in the three contexts (CG, CHG and CHH):

bonito basecaller [email protected] fast5 \
  --modified-bases 5mC \
  --reference reference.fasta \
  --modified-base-model 5mC_all_context_sup_r1041_e82 \
  --alignment-threads 10 > basecalls_with_mods.bam

I am surprised because my sorted BAM file is 98.9 GB, but 96.7% of reads are unmapped (SAM flag 4 set):

samtools view -c -f 4 basecalls_with_mods.bam
5142276
samtools view -c -F 4 basecalls_with_mods.bam
174033
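
For reference, the unmapped percentage follows directly from the two counts above. This is a minimal sketch, assuming `awk` is available; the helper name `unmapped_pct` is hypothetical and not part of samtools or Bonito.

```shell
# Hypothetical helper: percentage of unmapped reads from two read counts.
# Usage: unmapped_pct <unmapped_count> <mapped_count>
unmapped_pct() {
    awk -v u="$1" -v m="$2" 'BEGIN { printf "%.1f\n", 100 * u / (u + m) }'
}

# Counts reported by `samtools view -c -f 4` (unmapped) and `-F 4` (mapped)
unmapped_pct 5142276 174033   # prints 96.7
```

The same helper can be fed the live counts, e.g. `unmapped_pct "$(samtools view -c -f 4 basecalls_with_mods.bam)" "$(samtools view -c -F 4 basecalls_with_mods.bam)"`.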

This is really annoying because when I use modbam2bed, only mapped reads are considered and so I have very low coverage to infer 5mC...

I am using a genome reference assembled from the same ONT data and it is quite complete.

Thank you for your help.

AlineMuyle, Oct 04 '22 14:10