bowtie2 icon indicating copy to clipboard operation
bowtie2 copied to clipboard

MAPQ for uniquely mapped read in end-to-end mode

Open gamerino opened this issue 2 years ago • 1 comments

Hello!

We are trying to understand MAPQ calculations in end-to-end mode. More specifically, we are interested to know why Bowtie 2 reports MAPQ = 255 for uniquely mapped reads when multiple alignments are allowed (i.e. -k 10) instead of reporting the same MAPQ that is assigned to these reads when default is used.

This is our example:

  1. Allowing multiple alignments:
  • Code
bowtie2 -k 10 -X2000 --mm --threads 15 --very-sensitive --no-mixed --no-discordant -x ${bwt2_idx} -1 ${fastq1} -2 ${fastq2}
  • Output
ERR5753543.1    83      6       29816795        255     118M    =       29816795        -118    GTACATGGTATAGGCCCGCAGAGAATGGGCAGGTGAATGTGGACTGGCATGGTCGGAGGAACTGAAACTTTACCTGCCAGAGAAGACTTAGGGAAATGAGGTGGAGGGTGTCCTCACC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFF:F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  AS:i:-5 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:14G103     YS:i:-5 YT:Z:CP
ERR5753543.1    163     6       29816795        255     118M    =       29816795        118     GTACATGGTATAGGCCCGCAGAGAATGGGCAGGTGAATGTGGACTGGCATGGTCGGAGGAACTGAAACTTTACCTGCCAGAGAAGACTTAGGGAAATGAGGTGGAGGGTGTCCTCACC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  AS:i:-5 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:14G103     YS:i:-5 YT:Z:CP
ERR5753543.2    83      4       1269269 255     124M1I25M       =       1269243 -175    GAGGCGCCCGGGGGACGGGGGGTGGGTGCGGGAACCGGTGGGGGCGCGGGACAGGCCGGGACAGGGACAGGGCCCCGGACGCGAGCCCCTGGCGCCCCCACCCGCACGTCTCCCGCCGCCCTCGCCCCCCGCGCGCCCCCGCCCCAGCTC  F,FF,,F:,:FFF:F,,FF,FFFFF:,F,FFFFFF,FF,FFFFF:F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  AS:i:-29        XN:i:0  XM:i:6  XO:i:1  XG:i:1  NM:i:7  MD:Z:9C6C2T2A3A11A110   YS:i:-35        YT:Z:CP
ERR5753543.2    163     4       1269243 255     150M    =       1269269 175     AGCCTGGGTAGGGGGAGGCGGAGGGGGAGGCGCCCCGGGGACCGGTGGAGGGAGCGGGAACCGGAGGGGGCGCGGGACAGGCCGGGACAGGGACAGGGCCCCGGACGCGAGCCCCTGGCGCCCCCCCCCCCACCGCGACCCCCCCCCCCC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,:FFFF,FFFFF:F:,,,,FF,FFFFFF,F,  AS:i:-35        XN:i:0  XM:i:10 XO:i:0  XG:i:0  NM:i:10 MD:Z:125A3G3G0T1T0C2G2G3T1G0    YS:i:-29        YT:Z:CP 
  1. Reporting only the best alignment:
  • Code
bowtie2  -X2000 --mm --threads 15 --very-sensitive --no-mixed --no-discordant -x ${bwt2_idx} -1 ${fastq1} -2 ${fastq2}
  • Output
ERR5753543.1    83      6       29816795        42      118M    =       29816795        -118    GTACATGGTATAGGCCCGCAGAGAATGGGCAGGTGAATGTGGACTGGCATGGTCGGAGGAACTGAAACTTTACCTGCCAGAGAAGACTTAGGGAAATGAGGTGGAGGGTGTCCTCACC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFF:F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  AS:i:-5 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:14G103     YS:i:-5 YT:Z:CP
ERR5753543.1    163     6       29816795        42      118M    =       29816795        118     GTACATGGTATAGGCCCGCAGAGAATGGGCAGGTGAATGTGGACTGGCATGGTCGGAGGAACTGAAACTTTACCTGCCAGAGAAGACTTAGGGAAATGAGGTGGAGGGTGTCCTCACC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  AS:i:-5 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:14G103     YS:i:-5 YT:Z:CP
ERR5753543.2    83      4       1269269 24      124M1I25M       =       1269243 -175    GAGGCGCCCGGGGGACGGGGGGTGGGTGCGGGAACCGGTGGGGGCGCGGGACAGGCCGGGACAGGGACAGGGCCCCGGACGCGAGCCCCTGGCGCCCCCACCCGCACGTCTCCCGCCGCCCTCGCCCCCCGCGCGCCCCCGCCCCAGCTC  F,FF,,F:,:FFF:F,,FF,FFFFF:,F,FFFFFF,FF,FFFFF:F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  AS:i:-29        XN:i:0  XM:i:6  XO:i:1  XG:i:1  NM:i:7  MD:Z:9C6C2T2A3A11A110   YS:i:-35        YT:Z:CP
ERR5753543.2    163     4       1269243 24      150M    =       1269269 175     AGCCTGGGTAGGGGGAGGCGGAGGGGGAGGCGCCCCGGGGACCGGTGGAGGGAGCGGGAACCGGAGGGGGCGCGGGACAGGCCGGGACAGGGACAGGGCCCCGGACGCGAGCCCCTGGCGCCCCCCCCCCCACCGCGACCCCCCCCCCCC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,:FFFF,FFFFF:F:,,,,FF,FFFFFF,F,  AS:i:-35        XN:i:0  XM:i:10 XO:i:0  XG:i:0  NM:i:10 MD:Z:125A3G3G0T1T0C2G2G3T1G0    YS:i:-29        YT:Z:CP

Thank you.

gamerino avatar Sep 16 '22 18:09 gamerino

Dear all, I have encountered a similar issue. I'm using version 2.4.1 and I observed that in my resulting sam file I have many primary alignments with an assigned MAPQ of 255. It is paired-end RNAseq data and I used the following settings: --very-sensitive -X 1000 -k 10 --phred33. Reading other threads, I have seen that the maximum MAPQ should be 42 for global alignments, but maybe it is in a posterior version, such as 2.4.3, which included a fix for MAPQ calculations. I would be very happy to receive some feedback. Kind regards, Natalia

natalia-rodilla avatar Oct 13 '23 07:10 natalia-rodilla