
Mappings not full-length, alignment extension problem?

Open tobiasrausch opened this issue 1 year ago • 0 comments

Hi,

The read bases before the query start and after the query end reported in minigraph mappings sometimes match the segment (S) sequence perfectly. Here is an example where I converted GAF to BAM and treated the bases after the query end as soft-clipped so that I could visualize them in IGV:

softclips
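
For reference, here is a minimal sketch of how the unaligned read ends can be counted from a GAF record. It is not the conversion script used for the image above. It assumes the standard tab-separated GAF columns (query name, query length, query start, query end, ...) with a 0-based start and an exclusive end; the function name `unaligned_ends` and the example record are hypothetical.

```python
from typing import NamedTuple


class Clip(NamedTuple):
    name: str   # query (read) name
    left: int   # unaligned bases before the query start
    right: int  # unaligned bases after the query end


def unaligned_ends(gaf_line: str) -> Clip:
    """Count the read bases left unaligned at each end of one GAF record.

    Assumes GAF columns 1-4 are query name, query length, query start
    (0-based, inclusive) and query end (0-based, exclusive).
    """
    fields = gaf_line.rstrip("\n").split("\t")
    name = fields[0]
    qlen, qstart, qend = int(fields[1]), int(fields[2]), int(fields[3])
    return Clip(name, qstart, qlen - qend)


# Hypothetical record: a 1000 bp read whose mapping stops at query position 900.
example = "read1\t1000\t0\t900\t+\t>s1>s2\t5000\t100\t1000\t880\t900\t60"
clip = unaligned_ends(example)
print(clip.name, clip.left, clip.right)
```

The counts are in the orientation of the original read. When writing a BAM record for a reverse-strand mapping, the two soft-clip lengths would need to be swapped.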

Below are the minigraph command and the read from the IGV image. The read aligns full-length with minimap2 against GRCh38 but gets "clipped" with minigraph:

minigraph --vc -k 15 -w 10 -cx lr GRCh38-90c.r518.gfa.gz read.fa

>79c62919-2910-4b7d-92d3-78016cd6679b
TGTACTTCGTTCAGTTACGTATTATAGGGGTGACCAGGGCCGGTTGAGCACTCACGTGTGGCTATCTCTGTGCTCTGTGGCAGGTGACGGATGGTGGCACCACTCAGCAAAGATATCTTCACCTTCGACACCATGTTCTCCACCAACTACTCACACAGAGGAGAACTACCGCAAGCGAGGGACCTGGTGTACCAGTCCACTGTGAGGTGAGTGCCTGGGGGTGGCGGGGTGACAGCGGGGGAAGGGCGGAGGGATGGGGAGTGGGGCAGAGAGCAGTTCTCCAGCCCTTTCACATCAATCCCTCGGTGCTCTTGGGCTTGGAAGGCAGAGACTGGGGGCTCCCCTTGTAAGAGGTGGGACCCATTCCTGGAGGACATCCCCTGGAGGGAAGAGGCAGGAGAGGGCCAGGCCACCGCTCTTCTGACTGGCCTCCTCTGCAGCTCTCCAGCAGTCCTGGAAGGTGGGTATTCCAAACTCCATTTATTTCCAAGTGAGATACTGAGGCCCAGAGAGGGAGCAATGGGCCTAAAGTCACACAGGCAACAATGACACAGCTGGGCGGTGAAGCAGTTCTGCCCGGCTTGGAAGCTCAAACACCACATTCTACTGTTTTTGTGTGCTCCAAGGCCATAAGGCCCCGAGCTTAGAGAATGACCCCAAGCAACTGGACTCCCAGGTCAGAAGCAGCAGGGTGGGAAGGAGCAGTGCTCAAGTCGGAGATGCTTGATTTTCCTACCCCATCCCTGTTTTCCAGGAGGCAGGTGGCAGAATCAAGCTGACTCTGATCTTCCAGAGCCTGCTGTTCTCTATGTGTGTGGCTCTATCTCCTTCCCACAAACTCCCCTACAAGATCCTAGTCCTGTTTTATCCCTTTGACCCTTAACACACACAGGTCCCTCTTCCTCAAGGGCCTTTCTCCCTTCGGTCATCTGGTAAACTCCTCATCAGCCTTCAGAACCCCACTCCAGCAATTCCACCCATGGGAGCCCTCACCTTAGCTGAACTGATCCCCCAAAGCTGGTCGGCTGCAACTGCCCCAGTACCCACCCCTCCTCAGAGGGAATCTCACCTGCATTCATGAAGAGTGCACAGCTTTGCACACACACTCTTCCACGCCACACCGCCAAATGCTTTGCCAGGTGCCCAGCCTGAAAGTAGTGGTCTGCTCCAAAGCTGGAGGGGGACCGTGCTGTGGGGTACCGTGAGGGGGACTGCAGCCTCGGCAAGCCCACATGATGTGCACTGTGGCACCATGCTATGTGGATGTCACAGCTAGTGCCTGGTACTGTGTTGTCTTTAGACTGAGCCTGGAGGAGGGGGAAGCCGTACTCTCCAGCACGGGCTGTGGGTCTCTGTGACCGTGAGCTCCAGGAGGACAAGGCACATGCTCCTCCTTCACCTTGCATCCCCAGCTCTCAGCCCAGGGCCTGGCACACAGTGGGTACATAATGAAATTTGCAAAATGAGTGGACAACTCAGGCTACACTCCTCAGCCCTTCAGGCCTGTTTTCCTATTGGGTGAAATTAGAAAGCGGGACAAGAAGATTGTAAAGGGTAAGTCCAACCCTGCATCAAGAATCTGACCCAGGCATCCTCCTCTGCCTGCTTTTGTAAAAATGGGAAGGTGAGTGTGCTCCCCAGGCGGCCCGGCTGGGTCCTCGTCTCTCTGGGCTCTGTGCCAGGACCATGCTGAGAATGTGGACGTGAAGGTGTGTGAAAGAAGGTGTGTGAACAGCCGCACAAACTCCTTCATCCCCCTAGTTATGATTTCTCCTGAACCTGCTTCCAAAAAGAAATTATGGTGCCTTAAAATATTAAAGCACACCTAAAACAAGCCAGTTACAATCAAGATGAAAAAGGAGATAAAAATTCATTTTGGAGAAAATAGCTATTCTGGAAACCACAGATGGTTTGTAGTAAGTGAACTCCCAAATAATCTCCACGACTCCCAGAGGAAGCCCTAATGCCACCCAGCCCACCCCTGCAGGTCACTGTGTCTC
AGAGAATTCTTTCCTCACCAAGCCACCCGCTGCCACCAGCAAGGCCTCTAGGACTCCTTGGTTCTGATTTCTAGCTAAGAAAGAAACCCAGGGACCATCTAGAACATTCCTGAGCACCACCCTCTCCTGCTTCCTGGAGCCTCAAGGGCAAGCCCTGCTACAGAAGACCCAGATGAAAATCCCAGCTCAGTCTCTGCCTACTGTATGACCTGGGATGAGAGCCTTAACCTCTCTCTCTCTCTGCTTCAGGTCCTTCGCCTGTAGAGCAGACTTGACATGACAGGCTTTGTACAATTGTGGAATTTCAGTGAAATTTATGTCCAACAGCCTGAGTTCAAATCCTGCCTCTTCCACATCCTGGCATTAGTGAAATCTGGGGCTGGTAACCCCACAGTTGAGTTTCCTGACCTCCAAAGTAACATGACAGCACTAAGAGGCTAGCTGCAAGTTGTGAGGAGCTGATGGCGTGCATTCAAGTGAAGTGCTGGTTTCCGGCCTCTAAGTGGTACATGTTTGCTGCCATTGCCTAGAACAACCTCTCCTTCTCCCACCATAAAACAAAACAAAAAAAATTCACCTTTTTACAAATCTCATAAAGCCTCCTTTTATGCGCTTTCTGTTTTGATCTGCACAAACAACCCCATGAATGCAGGCAGGTATTATTGTTATTAATCATTTTCTCTAGCCCTGTTCTTACTACAGAAGAGGTCAAAGTTCAGCAGGCCCAACAGTCTCCCTCTGGAGATGACATAGACTTGACCAGACCTTTTCAGTTGTGGAAGCCAAGGCCCAGAGAGGGAATGAGGCTTGCCTGAGGTCACACAGCGGCCCTGACTTAAGCCTGATATCCAGGGAGCCAGTGCTCTTTTCTTCCCCACATGGGGGCCAATCTCCCTGAGCCTCAGTAACAGGGGCGGTCTCCCAAGCAAAGAACACCCCAGGGAGCTGAGCCGTCCTGTTTGGTGACCACCAAGGGAGAGGGAAGGAAGGAGCAGACCCTCTGGAGCCTAACGGGATCACGGAGGATTGTCAGCCCTCCCAACCAGCCATGCTGGGGGCGACTGCACCTCTTCAGGCTGCAGAGCCCCATCAGGGCTGGCGACGGGCGCGGGACTCCGCATTTACATATAAATGAAGATACACCCCAAATTAATGCATCTCAACAGAGAAGGAGTCCCAGCTCCACAGCTGTCTGAGCCGTTTCCTCTTCAGCTGGGGAAAGTAGGGAGAGAATGAGACTGAGAAGATTACCAAGAATTGGCTGAGAGCAATTTCGCGGGTAGCCTAGAGGAGCCCCAAGGACTCCTGGAGCTTTTTGTCCCACTGATGGGGCCCTGGAGGGATGGAAGTCGGGTGGATCATCAGGACTTCTGCCCCCAGGGGGTGCAGGTGTGTATCCCTGGTTGTATGACCTTGGACAGATCACTGCCTCTCTGGATAGAGCGTCCCAGAGGCACATCTATCAAATGCTTTTCTAGGTCTTAGCCCCAGAGCCCAGAACTCCCAGGTTGGAATTGACAAGAATGTATTCCCTCCAAGGCTATTTAGTTGAGCACCTACTAGGACCTTGGCTTGGTGGAAGGTGCCCAGGGATACAGGGGGAATGAGAAAAGCACGGCTTTGCAATCTAGTCGGAGAGACATGGCAAAATGTAGATGAATACATCATTCCACATTCTGCAAAGTGATTCTAAACCATGGGCGTGTAATGAAGGGGCAAGTGGGGTGCAGTACCGAGGAAGGAATGAGATGCTGGGTGCCATTTAGATGGGTGGTCAGGCAGGGCCATCGAGGAAGGTGGCACTTGGGTGGAGAACTGAGGGCAGGAAGGAGCCCTCTGTGAAGGGAGGAGGCAGATGTGGCCGACAGTTATCCCACCCTGCCCACTTCCTCTTACCCCTCGCTCTTCCTGGCAGTGCCAGCATCCTTGGTGCTTCCACAGGGCTTTCTCAGGGCACTGGAGACCACTCAGCCTGCACATGAGAAAGGTGAGAAGTGCCG
GGGAGTTGACGCCACCCCGGGAGCAGCCTTGACCAATGACTGTTTATTGGGTGCATACCCAGCCCCCTCGTCCCTCCAGGGGGACAGTTCAGATGCATGCTCAGGAGGACAAAGTACAAGTTCCTCCTTCACCTCGTATGCCCAGCTCTCAGCCCAGGGCCTGGCACACAGTGGGTGCATAATGAAATGTGCAAAATGAGCGGGCAGTTCTGGGTATGCTCCTTGGCCCCTTGGGGGTGTTTCCTAATAAATTAAAATTAGAAAATGGGATAAGAAGGGGGACATTGAGAAACTATGGGATGAATGCTGTCAACTCTCTCCCGGAGTTTCTCAGTGACCCCCCAGAATAACCTTGAGAAGGCACCCATGTTGGCTTCCTTCCCTTCTAGCTCATTCCATCAGCTTCCTGGGACCCCCTTCTCCCACATAAACCCTTTCCTCCTAAATCCTTGTTTCAAGGTCTACTCTTCATTCCACCCTTCTAGGTTCTTGGCTGGGTCCCCAGTAGCAAAAGACGAATTCACAAGACAAAAGGATACAAGTCTAGTTTGTATACGTTTTAGGTGATGCAGGAGATTTTATAAGGAAATGAAGACCCACAGAAGTGGTTCTAGCTGAGTGTTTTGCTGGGTTGGATGAAGGGTGGGGAGTCATGGGAAAGTGTGAAGGATAGAAGGACCTGAGCTAAGGGCAGTAAACTGGGGACACTCAGCAGGGTGGCTTGTTCAGATCCCTTGGAAGTGAAGATGCTGCCTTCCTCCAGGTACAGAGAGGGAACCTCACGTGAAGATCTTCATGACCTGCTTCAGGGAAAGGTCAGAGAGTCCTTCCTGCGCCTGCCATTTCTCAAATTCCATCTGCTTAAAATATTCAATATACCAAGGTGCCAAATTTGAGGTGGCGTGTCCTGGAACCCCCTCCCATGGAATGTGGCCATTCCAGAGGGCAGCTGTGACCTCCTTGGAGGTCAAGTGAATCACAGTGAGGCTTCAACCACTCCGCCCACACACATGCTGTGTCTTCCATTTCCTGAAGCACCCTGAGGCCCTTGTCCCATTCGATCTGCTCAGGAAGCGATGAGGAACACTGGACAGATTTCACTGGCCTTAATTCACTGGTGCAGGGAATCTAGATTCGAGGGGCTCAGTGTATTCCTGCCCCCGACCTGCTATTGCAGATCCCTCCAACACCTTGGCCAATGCTCTTTGCCAACAAGGTCCTACTTGTCCTTTCAAGTTCAGCTTAAAGGTTGCTTCCTACAGGAAGTCAGCCCCAACTGCTCAAGCTCACCGACTGATGCTATGTCTTCAGAGTCTGCCCGCCCCCCCACTAGACTGGAAGCCCCATGAGGACAGGGACTAAGGTCTGATGTTTTTCACCACCATCCTCTCCCCAACCCCCAGGACAGCACCCAACCCTGGGCATGTGGTAGATGATTTGTAAATATCTGTGGGGTGAATGAGTTCGTGGTGAACATGGTGGTGGAGTCCTTTCAATGCAACACTTGCTGCTGGGGTGGATGAACAAGACACCTTAGGAAAGGAGCATCAGGCCCCCGTTCCTTCCCCCAACACTCTCCACATCATTCATTCATCACAAAGGAATTGAGTAGTTACCATATATCAGGGACAATGATGGAGATAAAGTGGTGAACAAGACATAATCAGACCTGGAGGTTTCTGATCTAGTGGGTGAGGAGAGACACACCTAAAAATGCCTTTAAAATAATAAGGAAAAACAGTTACAG

Thanks, Tobias

tobiasrausch, Jan 25 '23 09:01