minigraph
Mappings not full-length, alignment extension problem?
Hi,
Read bases before the query start and after the query end reported in minigraph mappings sometimes perfectly match the segment (S) sequence. Here is an example where I converted GAF to BAM and treated the bases after the query end as soft-clipped to visualize them in IGV:
Below is the read from the IGV image. It aligns full-length with minimap2 against GRCh38 but gets "clipped" with minigraph:
minigraph --vc -k 15 -w 10 -cx lr GRCh38-90c.r518.gfa.gz read.fa
>79c62919-2910-4b7d-92d3-78016cd6679b
TGTACTTCGTTCAGTTACGTATTATAGGGGTGACCAGGGCCGGTTGAGCACTCACGTGTGGCTATCTCTGTGCTCTGTGGCAGGTGACGGATGGTGGCACCACTCAGCAAAGATATCTTCACCTTCGACACCATGTTCTCCACCAACTACTCACACAGAGGAGAACTACCGCAAGCGAGGGACCTGGTGTACCAGTCCACTGTGAGGTGAGTGCCTGGGGGTGGCGGGGTGACAGCGGGGGAAGGGCGGAGGGATGGGGAGTGGGGCAGAGAGCAGTTCTCCAGCCCTTTCACATCAATCCCTCGGTGCTCTTGGGCTTGGAAGGCAGAGACTGGGGGCTCCCCTTGTAAGAGGTGGGACCCATTCCTGGAGGACATCCCCTGGAGGGAAGAGGCAGGAGAGGGCCAGGCCACCGCTCTTCTGACTGGCCTCCTCTGCAGCTCTCCAGCAGTCCTGGAAGGTGGGTATTCCAAACTCCATTTATTTCCAAGTGAGATACTGAGGCCCAGAGAGGGAGCAATGGGCCTAAAGTCACACAGGCAACAATGACACAGCTGGGCGGTGAAGCAGTTCTGCCCGGCTTGGAAGCTCAAACACCACATTCTACTGTTTTTGTGTGCTCCAAGGCCATAAGGCCCCGAGCTTAGAGAATGACCCCAAGCAACTGGACTCCCAGGTCAGAAGCAGCAGGGTGGGAAGGAGCAGTGCTCAAGTCGGAGATGCTTGATTTTCCTACCCCATCCCTGTTTTCCAGGAGGCAGGTGGCAGAATCAAGCTGACTCTGATCTTCCAGAGCCTGCTGTTCTCTATGTGTGTGGCTCTATCTCCTTCCCACAAACTCCCCTACAAGATCCTAGTCCTGTTTTATCCCTTTGACCCTTAACACACACAGGTCCCTCTTCCTCAAGGGCCTTTCTCCCTTCGGTCATCTGGTAAACTCCTCATCAGCCTTCAGAACCCCACTCCAGCAATTCCACCCATGGGAGCCCTCACCTTAGCTGAACTGATCCCCCAAAGCTGGTCGGCTGCAACTGCCCCAGTACCCACCCCTCCTCAGAGGGAATCTCACCTGCATTCATGAAGAGTGCACAGCTTTGCACACACACTCTTCCACGCCACACCGCCAAATGCTTTGCCAGGTGCCCAGCCTGAAAGTAGTGGTCTGCTCCAAAGCTGGAGGGGGACCGTGCTGTGGGGTACCGTGAGGGGGACTGCAGCCTCGGCAAGCCCACATGATGTGCACTGTGGCACCATGCTATGTGGATGTCACAGCTAGTGCCTGGTACTGTGTTGTCTTTAGACTGAGCCTGGAGGAGGGGGAAGCCGTACTCTCCAGCACGGGCTGTGGGTCTCTGTGACCGTGAGCTCCAGGAGGACAAGGCACATGCTCCTCCTTCACCTTGCATCCCCAGCTCTCAGCCCAGGGCCTGGCACACAGTGGGTACATAATGAAATTTGCAAAATGAGTGGACAACTCAGGCTACACTCCTCAGCCCTTCAGGCCTGTTTTCCTATTGGGTGAAATTAGAAAGCGGGACAAGAAGATTGTAAAGGGTAAGTCCAACCCTGCATCAAGAATCTGACCCAGGCATCCTCCTCTGCCTGCTTTTGTAAAAATGGGAAGGTGAGTGTGCTCCCCAGGCGGCCCGGCTGGGTCCTCGTCTCTCTGGGCTCTGTGCCAGGACCATGCTGAGAATGTGGACGTGAAGGTGTGTGAAAGAAGGTGTGTGAACAGCCGCACAAACTCCTTCATCCCCCTAGTTATGATTTCTCCTGAACCTGCTTCCAAAAAGAAATTATGGTGCCTTAAAATATTAAAGCACACCTAAAACAAGCCAGTTACAATCAAGATGAAAAAGGAGATAAAAATTCATTTTGGAGAAAATAGCTATTCTGGAAACCACAGATGGTTTGTAGTAAGTGAACTCCCAAATAATCTCCACGACTCCCAGAGGAAGCCCTAATGCCACCCAGCCCACCCCTGCAGGTCACTGTGTCTC
AGAGAATTCTTTCCTCACCAAGCCACCCGCTGCCACCAGCAAGGCCTCTAGGACTCCTTGGTTCTGATTTCTAGCTAAGAAAGAAACCCAGGGACCATCTAGAACATTCCTGAGCACCACCCTCTCCTGCTTCCTGGAGCCTCAAGGGCAAGCCCTGCTACAGAAGACCCAGATGAAAATCCCAGCTCAGTCTCTGCCTACTGTATGACCTGGGATGAGAGCCTTAACCTCTCTCTCTCTCTGCTTCAGGTCCTTCGCCTGTAGAGCAGACTTGACATGACAGGCTTTGTACAATTGTGGAATTTCAGTGAAATTTATGTCCAACAGCCTGAGTTCAAATCCTGCCTCTTCCACATCCTGGCATTAGTGAAATCTGGGGCTGGTAACCCCACAGTTGAGTTTCCTGACCTCCAAAGTAACATGACAGCACTAAGAGGCTAGCTGCAAGTTGTGAGGAGCTGATGGCGTGCATTCAAGTGAAGTGCTGGTTTCCGGCCTCTAAGTGGTACATGTTTGCTGCCATTGCCTAGAACAACCTCTCCTTCTCCCACCATAAAACAAAACAAAAAAAATTCACCTTTTTACAAATCTCATAAAGCCTCCTTTTATGCGCTTTCTGTTTTGATCTGCACAAACAACCCCATGAATGCAGGCAGGTATTATTGTTATTAATCATTTTCTCTAGCCCTGTTCTTACTACAGAAGAGGTCAAAGTTCAGCAGGCCCAACAGTCTCCCTCTGGAGATGACATAGACTTGACCAGACCTTTTCAGTTGTGGAAGCCAAGGCCCAGAGAGGGAATGAGGCTTGCCTGAGGTCACACAGCGGCCCTGACTTAAGCCTGATATCCAGGGAGCCAGTGCTCTTTTCTTCCCCACATGGGGGCCAATCTCCCTGAGCCTCAGTAACAGGGGCGGTCTCCCAAGCAAAGAACACCCCAGGGAGCTGAGCCGTCCTGTTTGGTGACCACCAAGGGAGAGGGAAGGAAGGAGCAGACCCTCTGGAGCCTAACGGGATCACGGAGGATTGTCAGCCCTCCCAACCAGCCATGCTGGGGGCGACTGCACCTCTTCAGGCTGCAGAGCCCCATCAGGGCTGGCGACGGGCGCGGGACTCCGCATTTACATATAAATGAAGATACACCCCAAATTAATGCATCTCAACAGAGAAGGAGTCCCAGCTCCACAGCTGTCTGAGCCGTTTCCTCTTCAGCTGGGGAAAGTAGGGAGAGAATGAGACTGAGAAGATTACCAAGAATTGGCTGAGAGCAATTTCGCGGGTAGCCTAGAGGAGCCCCAAGGACTCCTGGAGCTTTTTGTCCCACTGATGGGGCCCTGGAGGGATGGAAGTCGGGTGGATCATCAGGACTTCTGCCCCCAGGGGGTGCAGGTGTGTATCCCTGGTTGTATGACCTTGGACAGATCACTGCCTCTCTGGATAGAGCGTCCCAGAGGCACATCTATCAAATGCTTTTCTAGGTCTTAGCCCCAGAGCCCAGAACTCCCAGGTTGGAATTGACAAGAATGTATTCCCTCCAAGGCTATTTAGTTGAGCACCTACTAGGACCTTGGCTTGGTGGAAGGTGCCCAGGGATACAGGGGGAATGAGAAAAGCACGGCTTTGCAATCTAGTCGGAGAGACATGGCAAAATGTAGATGAATACATCATTCCACATTCTGCAAAGTGATTCTAAACCATGGGCGTGTAATGAAGGGGCAAGTGGGGTGCAGTACCGAGGAAGGAATGAGATGCTGGGTGCCATTTAGATGGGTGGTCAGGCAGGGCCATCGAGGAAGGTGGCACTTGGGTGGAGAACTGAGGGCAGGAAGGAGCCCTCTGTGAAGGGAGGAGGCAGATGTGGCCGACAGTTATCCCACCCTGCCCACTTCCTCTTACCCCTCGCTCTTCCTGGCAGTGCCAGCATCCTTGGTGCTTCCACAGGGCTTTCTCAGGGCACTGGAGACCACTCAGCCTGCACATGAGAAAGGTGAGAAGTGCCG
GGGAGTTGACGCCACCCCGGGAGCAGCCTTGACCAATGACTGTTTATTGGGTGCATACCCAGCCCCCTCGTCCCTCCAGGGGGACAGTTCAGATGCATGCTCAGGAGGACAAAGTACAAGTTCCTCCTTCACCTCGTATGCCCAGCTCTCAGCCCAGGGCCTGGCACACAGTGGGTGCATAATGAAATGTGCAAAATGAGCGGGCAGTTCTGGGTATGCTCCTTGGCCCCTTGGGGGTGTTTCCTAATAAATTAAAATTAGAAAATGGGATAAGAAGGGGGACATTGAGAAACTATGGGATGAATGCTGTCAACTCTCTCCCGGAGTTTCTCAGTGACCCCCCAGAATAACCTTGAGAAGGCACCCATGTTGGCTTCCTTCCCTTCTAGCTCATTCCATCAGCTTCCTGGGACCCCCTTCTCCCACATAAACCCTTTCCTCCTAAATCCTTGTTTCAAGGTCTACTCTTCATTCCACCCTTCTAGGTTCTTGGCTGGGTCCCCAGTAGCAAAAGACGAATTCACAAGACAAAAGGATACAAGTCTAGTTTGTATACGTTTTAGGTGATGCAGGAGATTTTATAAGGAAATGAAGACCCACAGAAGTGGTTCTAGCTGAGTGTTTTGCTGGGTTGGATGAAGGGTGGGGAGTCATGGGAAAGTGTGAAGGATAGAAGGACCTGAGCTAAGGGCAGTAAACTGGGGACACTCAGCAGGGTGGCTTGTTCAGATCCCTTGGAAGTGAAGATGCTGCCTTCCTCCAGGTACAGAGAGGGAACCTCACGTGAAGATCTTCATGACCTGCTTCAGGGAAAGGTCAGAGAGTCCTTCCTGCGCCTGCCATTTCTCAAATTCCATCTGCTTAAAATATTCAATATACCAAGGTGCCAAATTTGAGGTGGCGTGTCCTGGAACCCCCTCCCATGGAATGTGGCCATTCCAGAGGGCAGCTGTGACCTCCTTGGAGGTCAAGTGAATCACAGTGAGGCTTCAACCACTCCGCCCACACACATGCTGTGTCTTCCATTTCCTGAAGCACCCTGAGGCCCTTGTCCCATTCGATCTGCTCAGGAAGCGATGAGGAACACTGGACAGATTTCACTGGCCTTAATTCACTGGTGCAGGGAATCTAGATTCGAGGGGCTCAGTGTATTCCTGCCCCCGACCTGCTATTGCAGATCCCTCCAACACCTTGGCCAATGCTCTTTGCCAACAAGGTCCTACTTGTCCTTTCAAGTTCAGCTTAAAGGTTGCTTCCTACAGGAAGTCAGCCCCAACTGCTCAAGCTCACCGACTGATGCTATGTCTTCAGAGTCTGCCCGCCCCCCCACTAGACTGGAAGCCCCATGAGGACAGGGACTAAGGTCTGATGTTTTTCACCACCATCCTCTCCCCAACCCCCAGGACAGCACCCAACCCTGGGCATGTGGTAGATGATTTGTAAATATCTGTGGGGTGAATGAGTTCGTGGTGAACATGGTGGTGGAGTCCTTTCAATGCAACACTTGCTGCTGGGGTGGATGAACAAGACACCTTAGGAAAGGAGCATCAGGCCCCCGTTCCTTCCCCCAACACTCTCCACATCATTCATTCATCACAAAGGAATTGAGTAGTTACCATATATCAGGGACAATGATGGAGATAAAGTGGTGAACAAGACATAATCAGACCTGGAGGTTTCTGATCTAGTGGGTGAGGAGAGACACACCTAAAAATGCCTTTAAAATAATAAGGAAAAACAGTTACAG
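To spot such mappings without going through IGV, the number of unaligned read bases on each side can be read off the first GAF columns (query name, query length, query start, query end). Below is a minimal sketch, assuming standard tab-separated GAF with a 0-based start and an open end; the function name `unaligned_ends` and the example line are made up for illustration and are not actual minigraph output:

```python
def unaligned_ends(gaf_line):
    """Return (query name, unaligned bases before query start,
    unaligned bases after query end) for one GAF record."""
    fields = gaf_line.rstrip("\n").split("\t")
    qname = fields[0]
    qlen = int(fields[1])    # column 2: query length
    qstart = int(fields[2])  # column 3: query start (0-based, closed)
    qend = int(fields[3])    # column 4: query end (0-based, open)
    return qname, qstart, qlen - qend


# Hypothetical GAF record, values invented for illustration only.
example = "read1\t1000\t50\t900\t+\t>s1>s2\t5000\t100\t950\t800\t850\t60"
name, left, right = unaligned_ends(example)
print(name, left, right)
```

Non-zero values on either side mark reads worth checking against the segment sequence.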
Thanks, Tobias