minimap2
missed first exon and incorrect alignment for two mouse genes
I am running minimap2 2.17-r941 with the parameters -ax splice -t30 -uf --secondary=no
In the first case, the following sequence is missing exon 1 (shown as soft-clipped) when aligned to the eml1 gene in mm10. BLAT, by contrast, was able to map exon 1. The UCSC screenshot is here: https://www.dropbox.com/s/krx655lv4lzg0s3/Screenshot%202020-08-05%2016.16.56.png?dl=0
>eml1_exon1_unaligned
GGGCCGCGGCGGCCTCGGCAGGGCCGCCAGTGTGTGGGGTGGCGGCGCGGGCCGGAGCGGGGCGCGGGGCCCGGCCCTCAGCATGGGAGGGACGGCTTTCTCCAGCTATAGCAGCCCTGTTTACGACCCACGTCCCTCTCGCTGCTGCCAGTTTCTGCAAAACGATTGACCAGCGCGTCCTTGCCTGCCAGCAGCATGGAGGTGGTCAGACCGCCATTCGCCTCTCTGGAGCAGCGCGGTCCAGATGCAGGAGGATGACATTCAAACTGCTCAAGTCAGCCGCTGGCCGACGTGGTTTCGGGAGGCTGAAACATCACGGAGGAGCCAAAACAGGCCTGTGTGCTCAAACAGGAAAAGGGGGACCCTACCAAAAGCGAGGCCCACTGGGGGGCAGGACCCCTGCCATTTGAGAAACCACCGGTCCAAACAATGGCCAACCGTGTTACCAAAGAAACCCCAGTGCCTCCCCTCCCGGCACCCTCCGGGGGCCCAGGAAAAGAAAGTAGTTTGTGCCGGTAAACCAAAAGCATCAATAGGAACCAGCTCTTCCGAAAAGAGTGTCTCCCAGGTGGCCCGGAGGGAGAGCTCTGGGGGGACTCCAAAAGGAAGCCGGGAAACCGCACGGGCTCAACCCAGTAAGCTCCTCCAGCGGCAAGAAGAACAGTGAGAGAGCAAACCCAAGGAGCCCCGCATTTCAGTCCAGAAAGAAAGGATATGTAAAAAATGTTTTTCTTCGAGGGCCGTCCGGTCCCCACCATGTACATGCCCCAAAGGACCAAGTGGATTCGTACAGTTTTGGAAGCAAAAAAGCTGAGTTACCAAACGAAAGCGGCTGAAACTGGAGGTGGGGTCTACGGGTTACCGGGGGGCCGAGACTGTCCGCAAATAACTTTGTACTTGCCTCCCGACCGGGGGAGACCGTGGTACTTCATTGCGTCTGTCGTGGTGCTTTTACAAATGTGGAGGGAGCAGCTGCAGAGGCATTACGCGGGGCCACAAACGATGACGTTCCAAGTTGCCTAGCGGTCCATCCCTGACAGGATCCACCATAGCAACGGGGAACAAAGTGGGCGGGCCACATCTAAGGATGGAAAAGCAACCTGCCAACCACATGTGCGCATCTGGGGACTCTGTGGACACTGAAACACTCTGCATGGTCCCATTGGAATAGGCCTTTTTTTTGACCGGGCCTGTCCACCTGCATCGCATTCTCAAAGGTCTAAACGGAGGAGGCCCATCTCTGTGCTGTGGGATGATTCCAAACGAATCACGTTGCCTGTCCCGTGTGGGGATTGGCAGAAAAAGAAGAGAGACTGGGCCGACGTGGAAGTTTGTTCGAACGAAAGCGGTAATTTGCGGCGGGAACTTCCACCCCCACAGACACCAATTATCATAGTCCCCACCTGCGGGAAAAGGTCCACACATCTCTACTTTTTGGACCCTAGAAGGAAAATTTCCCTTTAACAAGAAAGCAAGGGGTTGTTTTGAGAAAACAAGAGAAGCCAAAAGTTTTGTTTCTCTGCGTGGACGTTTTTTCTGGAAAATGGCGACACCATTTACTGGAGATTTCCAAGCGGGCAACATCTTGGTGTGGGGGGAAAAGGTACAAAATCGGATAAAGCTATGCAGTTTTCAAGGGGGCCCCACGAGGGCGGGCATTTTTTTGCACTTTTGTATGCTGAGAGACGGGGGACGCTGGGTGTCCCGGAGGAGGGAAGGACCGGGCGGACCTCATTTTCCCCTGGAACGGGAAAACTATCAAAAAAACTTCACAAAGCAGAGATTCCTGAGCAGTTTTTTGGCCCCAATACGGACAGTAGCCGAAGGGAAGGGCAAACGGTCCATCTTGATAGGCACTACTAGAAAACTTTTGTCCCTGCCAAGGGCAACCTTTGTCAGGGGGACTTTCACACCCATCACTCGGGGGCCCACACCGATGAGCTCTGGGGGCTAGGCCCATCCATGCCTCCAAAGCCCCTCAGTTCCTGGAACTTGTGGGG
ACCATGACAAACCATGCCCACTCTCTGGGGGACGCTGTCGGTCCACCGGGCCAGTCTGGGGACAAAAAATCATAGAGGATCCAGCTCCAGGTCCCCCTCTGGTTTTTTCATCCTTCCGGGGTCCTGTGGGTTGCAGTGGGGGGACACTGGACTGGGGAGGGTGGTTTGGTGTTTTGACACGGAGACAAAGGGGGGACTTGGTCCAACCGTCCCCCACACGGGATGGGAAATGAGCAGCTGTCCGTGGATTGCGGTTATTCTCCAGATGGGAAACTTCTTAGCAATCGGCCTCCCATGACAACTGCATCTACATATATGGAGTTTACCGGACAATGGAAGGAAGTACACACGAGTTTGGCAAGTTGCTCCCGGCCATTTCCAGCTTCATCACCCACTTGGGACTGGTCCCGTGAAACTCACAATTTCCTGGTGTCCAAATTTCCGGGGGACTACGAGATCCCTCTACTGGGTTTCCGTCCTGCCTGTAAAGCAAGGTCGTGAGTGTGGGAAAACCACAAGGGGACAATCGAGTGGGGCCCACCTATACCTGCACCTTGGGATTCCCACGTCTTTGGGAGTGTGGGCCGGAGGGCCCTCCGATGGGGGACAGACATCAACGCCCGTCTGCCGGGGCTTCACGAGAGAAAGCTCTTGTGCACAGGCGATGACCTTCGGCAAAAGTGCACCTCTTTCTCATACCCCGTGGCTCCACAGTTTCCGGGCTCCAAGCCACAATCTACAGTGGGGGACACAGCAGGCCCACGTCCCCACCAACCGTGGGACTTCCTCTGTGAAGGACAGCCACCTTAATCTCCACGGGTGGGGGAAAGACACAAGCATCATGCAGTGGGCGAGTCCATTTAGTCCCCTGTGGGAAGCCCGAGGGACGCCAGACTCGAGTTCCGCTTTGGTCCACTGTGATTTTCTGTTTTTTGTCCTACAGGGACTCTTAACAAAAACCTCAGGGAAAAACTGTCCCTCTACCAGTTTACCTTAGTTGGGGAAGCCAGTGCGTGTCACACCAGATAAAGCGGTTTGTGTGTGTCCCGCTTTTGTTATTATAGGGCAGGATAGAAATGCATGTCCGGGTTAAAGGAAGTCCCCAAGGTTTTTTACATGGCAGCAGAAGGGACTGGTGTATCCTTATAGGACACTTTTTCTATGAAACTCTTTCAAAAAATGGTCACAGGAAATGCCCTTTTAAAAAATACTGTATATAGTCTTCACTGCTTCACCTTGTTTAAGTCAGATATTTTATGATAATGAAAGTACGGAAAACTGGGGAAAACTGGGGGTCCCGTTGTTGGACTAATTGGGTCCTAAAGAGGATAAAATTATGTAAACTGATTTTGGCCCAGCTGAAATCAGGACTGCAAAATGCCCAGGCTTTCCCTTGGCCATGTATCTAAAAATCCCATAAAACCTCCCTCCCTTTGGGAGGGCTGGCAAAAAAAGGGGGGCTGTGCTTCTCCCTGGTTTAAGCAGTTTTGTTACTACAGAAACCCCCCGAAGCTGCTGTGGGTCCCTACAGGTGCTTTCCATGCTCTGGTTTAGACTGTTCCCCTGTCCTGAGGGACACAGCCAGTTTTTCTTCAGCACCTCCACAAAAACTGCACCCCCGTCCTCTGTCCATGCACCTCGATTCACGGCGAGGACATTTAAGCCACCACTCTCCTGGCGTATTGATGGCATTGATACGGTTATTGTCCCCTCGTATAGAGTTAAACAACTTACGATAAAATTTGCCCAAAGCTGGGGCCCTTGCTGTGTGTGCCTCTGGACACTGTACATTTTGTACCCAAAACCAAGTGGGTCCAAGTCGGAGAGGGGGACTCTTTCAGTATGGGAGGCCAGCCTGTGGGAGTGGCAGCGCCTTTGAGGCTCTGGGTGTAAGGACAGTTTTCCTTCCCTGGACTCTCGTGCACAGGACAGCATGCAGGCATTACAGACTGAAAACTGGTGCTCTGGCCGAGTAGAAAAAGTAGGTAGGGTCTGAAGGT
GTCGAGAAGGGCCTTAACCTGTGGTGTGGGGACAGATTGAATTGATTGTTTTACACTGGGGGGACTGTATCTCGGATCTTTTAAAATAGAGGAAATCACAAAACAGGACTTAAGGGACAGATGCTGAGATTGCTTTTTTTGTAAACTCGTTTAAGCGAGTGAGTGAGTTTGAGTTTACCTGAAACTCTGTAGCACTGGGTTGTTTCATAGTGGATGAAGGGACAGCACTGCAGACATCTCCCCTTGCCATCTCTAGCCTGCCTGTGGAAGGAAAACAAACGTGGACCTCAAGATGAAGCTGTTTTTGTTATGTATCCTTATCAAATATATATTCTATAAGGAAAATAAAAATCTGAAAAGTG
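For anyone reproducing the first case, the size of the soft clip can be read straight from the CIGAR column of the SAM record. The following is a minimal sketch only: the function name `soft_clips` and the example CIGAR string are hypothetical and are not taken from my actual output. Note that the CIGAR is in reference orientation, so for a reverse-strand alignment (FLAG 0x10) the transcript's first exon would show up as a trailing clip rather than a leading one.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H), if present, sit outside the soft clips, so drop them first.
    ops = [(n, op) for n, op in ops if op != "H"]
    leading = ops[0][0] if ops and ops[0][1] == "S" else 0
    trailing = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return leading, trailing


# Hypothetical CIGAR for illustration only: a 250 bp clip, then two exons
# separated by a 3000 bp intron (N).
print(soft_clips("250S120M3000N200M"))
```

A large leading (or, on the reverse strand, trailing) value is what "exon 1 shown as soft-clipped" looks like in the SAM record.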
In the second case, minimap2 was able to align the whole pcdha1 sequence, but it aligned the first and second exons poorly, inserting many more indels than BLAT, which aligned the sequence at 100% identity with correct exon boundaries. (In the screenshot, treat the bold "YourSeq" BLAT alignment as the correct one: https://www.dropbox.com/s/gruwqhheeelg7tm/Screenshot%202020-08-05%2016.28.06.png?dl=0)
>transcript/5185 full_length_coverage=2;length=4965
GCTAGTCCGAATCGGAACAATGGCGGATGCAGTGGCGATGGACTAACGGATTAGAAGAATTCTCCTAGCTCTGAGAGAATCCCTAATCAGAACAAAGCACTGTGCACTTGAAATGGAATTTTCCTGGGGAAGTGGCCAGGAATCCCAGCGCTTGCTTCTTTCTTTTCTGCTTCTTGCAATCTGGGAGGCAGGGAACAGCCAGATCCACTACTCCATCCCTGAGGAGGCCAAACACGGCACCTTCGTGGGCCGCATCGCGCAGGACCTGGGGCTGGAGCTGACGGAGCTGGTGCCCCGCCTGTTCAGAGTGGCGTCCAAGGACCGCGGAGACCTTCTGGAGGTAAATCTGCAGAATGGCATTTTGTTTGTGAATTCTCGGATCGACCGGGAGGAGCTGTGCGGGCGGAGCGCGGAGTGCAGCATCCACCTGGAGGTGATCGTGGACAGGCCGTTGCAGGTTTTCCACGTGGAGGTGGAGGTGAGGGACATTAACGACAACCCTCCCAGGTTCCCAACAACACAAAAGAATCTGTTCATTGCAGAATCAAGGCCACTTGACACTTGGTTTCCACTAGAGGGCGCTTCAGACGCAGATATCGGAATCAATGCTGTACTGACTTACAGACTGAGTCCAAATGATTACTTTTCTTTGGAAAAACCATCCAACGACGAACGGGTAAAAGGTCTTGGACTTGTATTACGGAAATCTTTAGACCGGGAGGAAACTCCAGAGATAATTTTAGTGCTTACTGTCACGGACGGAGGAAAGCCAGAGCTGACCGGCAGTGTTCAGTTACTCATCACTGTGCTGGATGCCAATGATAATGCTCCAGTTTTTGACAGATCTCTGTATACCGTGAAATTACCAGAAAACGTTCCAAATGGGACATTGGTAGTCAAAGTCAATGCCTCAGATTTAGACGAAGGGGTAAATGGGGATATTATGTACTCATTTTCTACAGATATTTCACCAAATGTGAAATACAAATTCCACATAGACCCTGTTAGCGGAGAGATTATTGTAAAGGGATACATTGATTTTGAAGAATGCAAATCCTATGAAATTCTCATAGAGGGAATTGACAAGGGACAACTTCCACTCTCTGGGCACTGTAAAGTCATTGTACAAGTTGAAGACATCAACGATAATGTTCCAGAATTGGAATTCAAATCTCTATCACTTCCAATACGAGAGAATTCTCCAGTGGGCACTGTCATCGCACTCATTAGTGTGTCTGATCGGGACACGGGTGTCAACGGGCAGGTGACCTGCTCCCTGACAAGTCATGTCCCCTTCAAGTTGGTGTCCACATTCAAGAATTACTATTCGCTCGTGCTGGACAGCGCCCTGGACAGAGAGACAACAGCGGACTATAAGGTGGTGGTGACAGCGCGGGATGGGGGCTCTCCCTCGCTGTGGGCCACGGCTAGCGTGTCTGTTGAGGTTGCTGACGTGAACGACAATGCACCTGTGTTCGCGCAGCCCGAATACACGGTGTTCGTGAAGGAGAACAACCCGCCTGGTGCGCACATCTTCACGGTGTCAGCGATGGATGCGGACGCACAGGAGAACGCGCTGGTGTCCTACTCGCTGGTGGAGCGGAGGGTGGGCGAGCGCTTGCTGTCGAGCTATGTGTCTGTGCACGCGGAGAGCGGCAAGGTGTTCGCGCTGCAGCCTCTGGACCATGAGGAGCTGGAGCTGCTGCGGTTCCAGGTGAGCGCGCGGGATGCTGGTGTACCTGCCCTGGGCAGCAATGTGACTCTGCAGGTGTTTGTGCTGGACGAGAATGACAACGCGCCCACACTGCTGGAACCTGAGGCAGGAGTCTCTGGTGGAATCGTGAGCCGGTTGGTGTCCAGATCAGTGGGTGCAGGCCATGTGGTGGCTAAGGTGCGCGCGGTGGATGCAGACTCTGGCTATAATGCATGGCTCTCTTATGAGCTGCAATCGTCAGAAGGCAATTCCCGTAGCCTTTTCCGCGTAGGTTTGTATACGGGCG
AGATTAGTACTACGCGCATACTGGATGAAGCAGATTCGCCACGTCAGCGCCTTCTGGTGCTGGTGAAGGACCATGGTGACCCAGCAATGATTGTTACCGCCACAGTGTTGGTGTCTCTGGTAGAGAATGGCCCGGTACCAAAGGCTCCATCGCGAGTATCCACGAGTGTCACACACTCTGAGGCGTCACTGGTGGATGTCAACGTGTACCTGATCATTGCCATCTGTGCAGTGTCCAGCCTGCTAGTGCTCACGCTGCTGCTGTACACAGCGCTGCGCTGTTCCACTGTCCCCAGTGAGAGCGTGTGCGGGCCTCCAAAACCGGTAATGGTGTGCTCCAGTGCAGTGGGGAGCTGGTCATACTCCCAACAAAGGAGGCAAAGGGTGTGCTCTGGGGAGTACCCACCTAAGACCGACCTCATGGCCTTCAGCCCCAGTTTATCTGATTCAAGGGACAGAGAGGATCAATTGCAGTCTGCAGAGGATTCCTCTGGAAAGCCCCGGCAGCCCAACCCTGACTGGCGCTACTCTGCCTCGCTAAGAGCAGGCATGCACAGCTCTGTGCACCTGGAGGAGGCTGGCATTCTACGGGCTGGTCCAGGAGGGCCTGATCAGCAGTGGCCAACAGTATCCAGTGCAACACCAGAACCTGAGGCAGGAGAGGTGTCCCCTCCGGTGGGCGCCGGTGTCAACAGCAACAGCTGGACCTTTAAATACGGACCAGGCAACCCCAAACAGTCCGGTCCCGGTGAGTTGCCAGACAAATTCATTATCCCAGGATCTCCTGCAATCATCTCCATCCGGCAGGAGCCTGCTAACAACCAAATTGACAAAAGCGATTTTATAACCTTCGGCAAAAAGGAGGAGACCAAGAAAAAGAAGAAAAAGAAGAAGGGTAACAAGACCCAGGAGAAAAAAGAGAAAGGGAACAGCACGACGGACAACAGTGACCAGTGAGGCCACCAAATGGAAACAAGCCACTTAGCCAGTTTTTGTAATAATGGCAAATCTCTCCCATGTAGCAACTCCCCGCTCCTTTCTCCTATGACATGAGCCCTCAGAAATCTGCAGAAAGTTCCCTGTGTCTGTCTTGATCGCATTTAACAGGTTTTGTCTTAAAAAGCTTTCCTAAGTCTGGTGTTAACTCTCTCTCTCTCCACTCTGGCTTGTTTTCAGAACCTAAAAAGCAGACCCAGGTTTCCTTTCTCCCCCGCCGCAAAGGAGAAGCTTCCCAGCCCCGCCAGTGAGAGTTGGACTCTCTGCCCTGTGCTTCAAGCATCCTGTCTTGATGATATTTGCAGGGCAGGCTGAAAGGTATTCAGGTTGAGCAGTTGGGTGTTTGTGGTCACTGGGTATGTGTGGCTACCAAGAGTGTTGGAGAGCCTGGTATTGGCTGGGATGGTCCAGATTAGACTAGTTAACACAAGGAGGGCTGGGGCTCAAAGGCACATCAACACCGGGAGTCTTTCATCTGGAGGGGGAAAATGTGAAACTTACAAGGACCAGACTTTCTCAATCTCTCAACTAGACATATGATGGCCATCCTCTAACAGACAAAACCATCCCCACCGGCAAAGCTTTAGGAGCCCCTCAAGTGTGCTGGCTATAACATCACTGTATTCAAAACCTGCAGTATGCACACGAGCCAGCAGTTCAAGCGTTTAACAAGAGGGTCGGCCAGGGCAACAGAAGCAGATCTGATGTGTTTCCTGTACACGTCCTTGTGCTCACGCTATTAAAAATTCTTTTGCACACAATGTTTATGAAAAGGTCTCATCCTTTTCCAACAACACATATGCAAAAGCAAAAGAAAAACCCAAGACCTCACTTTATGCTGTTTGTTGTTTGATAGATTTATTAAAAAGAAAAGAGAAAGTCGATAGCTATAAATCTTTAAAGGAATATGATGAATACAATCCCCCCAACCTTCCCTCAAAAGAGAATCCAGTCTACAGCCATTTGAAATGATCGTTGCTGCTACAGAAGTGCTTTAAGAGAATT
GCCTGGAACATCTGTATTATCCCGGCCACCTGCCAATCACAGCTTTACTCTTTCAGGTCACTCTGGGGCTGCCTCTTGCATGTATTAATTACTAAATAGAAGGATCTTTCTCTCTTTTCTAAGAAAAATGATGTGCACTTTGATTACACAACCTTCTCTAACCCACGATATCAAGACCCAGAAACTGAAGAAAAATCTTGTTTTCTCATGCATACAGTGAGCAGACTTTTCATTCCTCTGGTTCTGTGGTTGTCTCGGTGTGCTAGCCTACACCTCCACCTTGTTTAGCTTTCCTTTTCTAGAACACTCTGAATTGCTAACCTTACTAACACCTATGATGTTACCTGAATCAATCTCCCATATGTATGCTGTATGCTATTATAAGACTCCTGAGATATACTTACTCTGTGCTTGTGTATGTGAATGTTAATGCAACTATTACCTAGAGTGAACTTTAAGCTTTATTGTTGAATGTAAGCCCATTATATTTCCTTTTGTACACCTGTGGAAAAGTGGAATAGTGTTTTTTTTAAAACCATTGTTAATCAGCTTTTGTGTATGAAAGACACAGTAAAATTTCTTTCTTAAATCAAGATGCTGGTGATTCAAGGAATTTTATTTATGGTCAGCCAAGGGCTGTCTCTTGCCAAGAATCCTGCTGGCAAGGGAAAATGGATAAAGCTGGTTTTTTTTTTTCCTAGTAAGAATTCTGGAATAAATACTGAAGAAAGTCCCTGAGGGTATGCAAGCACAAAATTGTACCAATCTGACCTCTTTGAATTTGCAGACTGCTTTGAAATGCTATCCGGAATATCAGCTTGTAGAAAGTAATAAAATTTACTGTTACCATAAATAAGACATTTTAAGTTTATTGTGCACAACTTAGATGTTTGATTAATTATATTATCTACTTAAAAGCATATAAAAGAGGTAGGAGTCTGTTTTAAAAGGCATAAAAAATCTCT
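To put a number on "more indels" in the second case, the insertion, deletion, and intron operations in the CIGAR string can be tallied. This is a rough sketch under the same assumptions as before: `cigar_summary` is a hypothetical helper and the CIGAR string is invented for illustration, not copied from the real alignment.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_summary(cigar):
    """Count indel events, indel bases, and introns (N) in a CIGAR string."""
    events = Counter()  # number of operations of each type
    bases = Counter()   # total length of each operation type
    for n, op in CIGAR_OP.findall(cigar):
        events[op] += 1
        bases[op] += int(n)
    return {
        "insertions": events["I"],
        "deletions": events["D"],
        "indel_bases": bases["I"] + bases["D"],
        "introns": events["N"],
    }


# Hypothetical CIGAR for illustration only.
print(cigar_summary("100M2I50M1D30M5000N200M"))
```

Comparing these counts against the BLAT alignment, which has no indels at 100% identity, would show how far the two alignments diverge in the first two exons.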
Your insight is appreciated! -Liz