
Weird self alignment

Open gt1 opened this issue 6 years ago • 1 comment

Hi,

I see the following somewhat weird data (SAM format) coming out of minimap2:

L0/46/0_12879	256	L0/46/0_12879	1	0	8963M116I28M116D203M110I9M1I15M1D22M110D3412M	*	0	0	*	*	tp:A:S	cm:i:15	s1:i:126	NM:i:454	ms:i:24372	AS:i:24744	nn:i:0
L0/46/0_12879	256	L0/46/0_12879	1	0	8963M116D28M116I203M110D9M1D15M1I22M110I3412M	*	0	0	*	*	tp:A:S	cm:i:15	s1:i:127	NM:i:454	ms:i:24372	AS:i:24744	nn:i:0

This is weird for two reasons:

  1. This describes an alignment of a read against itself running from end to end, but it is clearly not the optimal alignment between the two regions specified.
  2. There are two versions.
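
Both observations can be checked directly from the two CIGAR strings. The following is a minimal Python sketch (the helper names are illustrative, not part of minimap2): it confirms that each alignment spans all 12879 bases on both the query and the reference side, that the second record is the first with insertions and deletions swapped, and that the indel bases add up to the reported NM:i:454, so the M columns contain no mismatches.

```python
import re

# CIGAR strings copied from the two SAM records above.
CIGAR_A = "8963M116I28M116D203M110I9M1I15M1D22M110D3412M"
CIGAR_B = "8963M116D28M116I203M110D9M1D15M1I22M110I3412M"


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def consumed(cigar):
    """Return (query_bases, reference_bases) consumed by the alignment."""
    ops = parse_cigar(cigar)
    query = sum(n for n, op in ops if op in "MIS=X")
    ref = sum(n for n, op in ops if op in "MDN=X")
    return query, ref


def swap_indels(cigar):
    """Exchange I and D, i.e. view the alignment from the other sequence."""
    swap = {"I": "D", "D": "I"}
    return "".join(f"{n}{swap.get(op, op)}" for n, op in parse_cigar(cigar))


def indel_bases(cigar):
    """Total number of inserted plus deleted bases."""
    return sum(n for n, op in parse_cigar(cigar) if op in "ID")


print(consumed(CIGAR_A))                 # query and reference span
print(swap_indels(CIGAR_A) == CIGAR_B)   # are the records mirror images?
print(indel_bases(CIGAR_A))              # compare with NM:i:454
```

An optimal self alignment of this read would be a single 12879M operation with NM:i:0, which is why the records above look suspicious.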

minimap2 was run as

minimap2/minimap2 -ax ava-pb dup.fasta dup.fasta

where dup.fasta contains a single read (synthetic E. coli, though I think this should not matter).

The single read in dup.fasta is

>L0/46/0_12879
AGTCCTGCGGAAAGCGCCAGGGCGAGGATTGCTGTGGCAGGTTTACGTAATTGCATATCCAACTCCTTTATCTCTCTGCG
TTAAGAACGCACTGGAATACCCGTTGTGAGTGTTTTGTGTTGTTACGTCTGCAACTTTATTGTGCAGTGTGTGCCTGTTA
GGGAAGGTGCGAATAAGCTGGGGAAATTCTTCTCGGCTGACTCAGTCATTTCATTTCTTCATGTTTGAGCGATTTTTTCT
CCCGTAAATGCCTTGAATCAGCCTATTTAGACCGTTTCTTCGCCATTTAAGGCGTTATCCCCAGTTTTTAGTGAGATCTC
TCCCACTGACGTATCATTTGGTCCGCCCGAAGACAGGTTGGGCCAGCGTGAATAACATCGCCAGTTGGTTATCGTTTTTC
AGCAACCCCTTCGGTATCTGGCTTTCACGTAAGCCGAACTGTCGCTTGATGATGCGAAATGGGTGCTCCACCCCTGGCCC
GGATGCTGGGCTTTCATGTATTCGATGTTGATGGCCGTTTTGTTCTTGCGTGGAATGCTGTTTCAAGGTACTACCTTGCC
GGGGCCGCTCGGCGATCAGCCAGTCCAATATCCACCTCGGCCAGCTCCTCGCGCTGTGGCGCCCCTTGGTAGCCGGCATC
GGCTTAGACAAATTGCTCCTCTCCATGCAGCCAGATTACCCAGCTGATTGAAGGTCATGCTCGTTGGCCGCGAGTGGTGA
CCAGGCTGTGGGTCAGGCCACTCTTGGCATCGACACCAATGTGGGCCTTCATGCCAAAGTGCCACTGATTGCCTTTCTTG
GTCTGATGCATCTCCGGACTCGCGTTGCTGCTCTTTGTTCTTGGTCGAGCTGGGTGCCTCAATGATGGTGGCATCGACCA
AGGTGCCTTGAGTCATCATGACGCCTGCCTTCGGACCAGCCAGTCGATTGATGGTCTTGAACAATTGGCGGGCCAGTTGG
ATGCTGCTCCAGCAGGTGGCGGAAATTCATGATGGTGGTGCGGTCCGGCAAGGCGCTATCCAGGGATAACCGGGCAAACA
GACGCATGGAGGCGATTTCGTACAGAGCATCTTCCATCGCGCCATCGCTCAGGTTGTATCCAATGCTGCATGCAGTGAAT
GCGTAGCATGGTTTCCAGCGGAAAAGGTCGCCGGTCACATTACCAGCCTTGGGGTAAAACGGCTCGCTGACTTCCACCAT
GTGTTTTGCCATGGCAGAATCTGCTCCATGCGGGACAAGAAAATCTCTTTTCTGGTCTGAACGGCGCTTACTGCTGAATT
CACTGTCGGCGAAGGTAAGTTGATGACTCATGATGAACCCTGTTACTATGGCTCCAGATGACAAACATGATCTCATATCA
GGGACTTGTTCGCACCTTCCTTAGGTAACATTTAGTTTGGCTAAATGTAAAGATATTGCTGTTTTATTGTTTGTTTTTGC
GAGATGCGCCGCACCATTCCGAAGCAAAATTCTTAAAATGCACTCTTTTAGTGCTACCGCTGGATTACTGTGGTGCAACT
AGGTTGTACTGATGCTGTTTCAGGGTTGCCTTGTATAACAAAGCAATAGATCGTGCCAAAGTTGGATAGGAAATATGTTA
TCCGGATAATGCACTGATGCCGCATCCGGTGAGCGTGGCCGAAATATGGGATGTATTCCGGCACGATAAGAAGGGATTAT
TTACGTCGCTGACGGCAGACTCATCAACACAGCAGCAAAACCAAAACAATGCCGTCAGCACCCACAGTCGGACCAGTTGC
CGAGTACGTGCGTGATGGTGTGAGTTACCGGTGGTCGGCGTACGTTAGTGGTTAACACCTCGCGGGTGAACTGCGGGATC
ATCGCCTGAATTTCTCACCCTGCGGGCCAATCACCGCCGTAATGCCGTTGTTGGTGCTGCGCAACAGTGGGCAGCGCCAG
CTCCAGCGCACGCATTCGCGCCATCTGGAAGTGTTGCCATGGACCAATAGGTTTACCAAACCACGCATCCGTTGGAGATA
GTCAGCAGATAGTCGGTATCCGGGCGGAAGTTATCGCGCACTTGCTCGCCGAGAATGATCTCGTAGCAAATAGCCGCAGT
AAGCTCAATACCATTTGCCGACAGCGGCGGCTGGATATATGGCCCACGGCTGAACGACGACATCGGCAGATCAAAGAACG
GATGCTAACGGACGCAGAATCGACTCAGCGGGACAAACTCGCCAAACGGCACCAGATGGTTTTTGTTATAGCGATCGGCT
GAGTTCGTAGCTGTACGGGCGCACCTTTACCCAGCGTGATGATGGTCGTTGTAGGTATCGTAGCGGTTCTGCTTATTGAT
GACGCGCCTCGACAATCCCGGTTACCAGCGAGCTACCTTTATCACGCAACTCACCGTCCAGTTGCTTTGAGGAACGGTTG
CTGGTTAATTTCCAGATGCGGTTGATCGCCGACTCCGGCCAGATAATCAACGATGATTTGGCTCCATCAGCGGTGCCGTT
GACGTTGTAGTAAATCTTCAGCGTATTAAGAAGCTGGCCTTCGTCCCATTTCAGCGATTGCGGAATATCGCCCTGAACCA
TCGAAACCTGAATGGTTTTCTCCGGTTGTGGGGTAAACCCACTGGATGAACGTCAGACGGGAAGGGAGACGGCAAACATG
CACGACGGCCACCACCAGCTGGACGCCAGTTGCGTTTGACCAACGCCAGTGCCAGCAGGCCACTAACCATCATCAGCAGG
AAGTTAATGGCTTCCACGCCCATTATCGGTGCCAGCCCTTTTAACGGGACCATCAATCTGGCTATAGTCCGAACTTGTAA
CCACGGGTGAAGCCGGTCAAGTACCCAACCGCGCACGAAACTCGGTCACTTGCCAGAGGGCAGGGGGCGGCAATCGCTAC
GCGCAGCCAGGTGGTTTTCGGCCACAGACGCGACAGCACGCCAGCAAACAGTCCGGTATACAGCGACAAATACGCCGCCA
GCTGCACCACCAGGAAGATGTTAACCGGGCCAGACATTCCGCCAAAGGTCGCGATGCTGACATAGACCCAGTTAATACCG
CTGCCAAAGAGGCCAAATCCCCAGCAAAAGGCCAATAGCGGCAGACTGGAGTGGACGGCGGTTAAAGGTCAACGCCTGGC
AAGCCCCATCAGCGAAATAATCCGCCGCAGGCCAGACGTTCGTAAGGAGAGAAGGCCCATGCGCTTCCGCAGGCACCGAA
TAATAACGCCAGCAGCAGGCGAATGCGCTGGCGTTTCAATTACATGAGGCAAAAGCCATGTAAGTATATCTATCCAGTTT
CGGTTTATTCATCCAGCTTCGGCTGGGGTGAGTATCCGGGATTTTGACATGAACCTGAATAATACGCCGACTGTCGGCCG
TCGCCACTTTGAACTGGATAACCGTCGATGTCGATAGTTTCGCCACGCGCCGGAAAGATCCCAAATGCCTGCATCACCAG
ACCACCGATAGTCGTCGACTTCTTCATCGCTAAAGTGGGTGCCGAACGCTTCGTTGAAGTCTTCAATGGAAGCCCAGTGC
GCGTACGGTCCAGGTATGACGACTCAGCTGACGGAAGTCGATATCATCTTCTTCGTCATACTCGTCTTCTAATACTCACG
CAACAATCAGTTCCAGGATGTCTTCAATGGTCACCAGACCGGAAACCCCATCGAATTCGTCAATAACGTATCGCCATGTG
GTAACAGCTGAGAGCGAAACTCTTTCAGCATCCGGCTACGCGCTTACTTTCAGGAAGCGACAACCGCCTGACGTAACACT
TTGTCCATGCTGAAGGCTTCAGCATCGCTGCGCATAAACGGCAGCAAGTTCGTTTCGCCATCAGAATCCCTATCAATGTG
ATCTTTGTCTTCGCTAATCATCCGGGAAGACGTAGAGTGGGCGGACTCGTATAGATGACATCAAGACATTCGGTCCACGC
GTCTGGTTGCGTTTCAGGGTAATCATGCCTGGGAGCGGGGGATCATGATGTCGCGAGACGCGTTGGTCTGCGATGTCCAT
CACCCCTCTCGAGCATATCGCGCGTATCTTCGTCGATATAGGTCGTTCTGCCCGGAATCACGGATCAGCGCCAGCGTTCG
TCACGGTTTTTCGGTTCCACCGTTGGATAAAGTTGGCTGCAGTAACAGGGAGAAAAATCCCCTTTCTTGTTGCTTATCGT
GTCACTACTGTGTGAATTGTCGTCGCTCATGGCGTGTATGGGTTCTCATGTTAGTTAATCAAAACGCCGTCGTTAATCAC
CAACGGCGGGGACGTCTGCCAGTCAAATGCCTGGCAATGTATTCTTTCTCGGCAATGTACGGATCCTCATAGCCCAGAGC
AAGCATAATCTCTGTTTCGAGGGCTTCCATTTCTGTCTGCTTCGTCATCTTCGATGTGATCGCTAACCTAACAAATGCAG
ACTGCCGTGCACCACCATATGCGCCCAGTCGCGCCTCCAGTGGTTTGCCTTGCGTCCTGAGCTTCCTTCTCAACCACTGT
ACGGCAGATAACCAGATCGCCCAGTAGCGACATATTCCAGTGCCAGGCGGCACTTCAAACGGGAAGGAGAGCACGTTGGT
CGGCTTATCCTTACCGCGATAGGTCAGATTCAGACTGTGGCTTTCGGCGGTATCGACCACGCTGAATCGTCACTTCCGAT
TCTCCTGAAACTGCGGGATCACCGCATTCAGCCATGTCTGAAACTGGCTCTCTTCGCGGTAACCCGGAATTATCTTCACA
TGCCAGTGCTAAATCGAGGATCACCTGACTCATTTTTGTTCCTCTGTTCTTCGCGCTTGCTTCTGCTGCCAGCGCCGCTT
TTCGTTTTTGTCTCGGCTTCTTCCCATGGCTTCATAGGCGTTAACGATACTGCGCCACCACAGGGTGACGAACCACGTCT
TCGCTGTGGAAGAAGTTAAAGACTGATCTCTTCGAACATCGGCCAGCACTTCGATGGCGTGACGTAAGCCTGATTTAGTA
TTACGCGGCCAGGTCCGATCTGTGTGACGTCGCCGGTGGATAACCGCTTTTGAGTTAAAACCGATACGGGTCAGGAACAT
CTTCATCTGTTCGATGGTGGTGTTGCTGGCTCTCATCGAGAATGATTAAACGCGTCGTTCAGCGTACGACCACGCATATA
GGCCATGCGGTGCGACTTCAACTAACGTTGCCGCTCAATCAGTTTCTCGACTTTCTCAAAGCCCAGCATTTCAAACAGCG
CGTCGTACAGCGGGCGCAGATACGGGTCTACTTTCTGGCTTAAATCGCCAGGCGAGGAAGCCCAGTTTTCACCGGCTTCT
ACTGCCGGACGAGTCAGCAGAATACGGCGAATTTCCTCGACGCTCCAGGGCATCAACTGCCGCAGCCACTGCCAGGTAGG
TTTTACCCGTACCCGCCGGGCCAACGCCGAAGGTAATGTCATGGGTCGAGAATATTGGCGATGTACTGCGCCTAGGTTTG
CGTGCGCGGCTTAAAATTACGCCGCGTTTGGTTTTGATATTGACCGCTTTGCCGTACTCCGGCACGCTCTCCGCGCTGTC
TGCTCCAGGACCACGCGCTTCTGTAATTCGACAACGGTAGGCAATCTCGTTCCGGTTCGATATCCTGAATCTGACCCGCG
CATCGGGGTCAGTGATCGACATACAGGCTACGCCAGAATGTCTGACCGCAGCGGGACGCAAATCGGACGGCCGTGTCAGT
TTAAAGTGGTTATCGCGGCGATTGATCTCGATGCCGAGACGGCGTTCGAGCTGCTTGATGTTGTCATCAAACGGGCCGCA
CAGGCTCAACAGACGCGCATTGTCTGCTGGCTCCAGGGTTGATTTCGCGAGTGTCTATGTTCAAACCGTCCTCTTATCTG
TATGCCGCCGGAAGCTGAACATTCACCGGCCTATAAGGAAATTATTCACGCCACAGGAAAAAGGCGCAAGCGATTGCAAT
ATAAGATGGGGATAAAGAGAGAAAAAACAAGGCCCGACCGGAACGGCAGGCCTGAGAATTACGGCTGATAATAACCCACG
CCAAAGGTCGTTTTCTTTGACGGGTACGGGCAATCACTGATTCCCGGTGTTTCTGCCACGCGCCAGACCCATTTCATCTT
CAGTACGCACCACTTTACCGCGCAGAGAGTTCGGGTAGACGTCGGTAATTTCTACATCGACGAATTTACCGATCATATCC
GGCGTGCCTTCGAAGTTGACCACGCGGTTATTTTCCGTACGCCCGGAAAGCTCGCATGATGCTCTTCACGCGATGTACCT
TCCTACCAGATATACGCGGGTGGTGCCGAGCATCCGGCGGCTCCACGCCATCGCTTGCTGATTAATTGCGCTCTTGCATG
AATATACAGACGCTGCTTCTTCTCTTCTTCCGGAACATCATCAACCATATCGGCGGCCTGGATGTACCCGGACGTGCAGA
TAAGATAAAGTGTAGCTCATGTCGAAATTGACGTCGGCAATCAGCTTCATCGTTTTTCTCGAAGTCTTCGGTGGTTTCGC
CATGGGAAGCCAACGTATGAAATCAGAACTGATCTGAATATCTGGACGCGCCGCACGCAGTTTACGGATGATCGCTTTGT
ACTCCAGCGCCGTAATGGGTACGGCCCATCAGGTTCAGAATGCAGATCGGAACCGCTCTGTACCGGCAGATGCAGGAAGC
TCACCAGCTCCGGCGTGTCGCGATACACTTCGATGATATCGCTCGGTGAATTCGATACGGATGGCTCGGTGGTAAAGCGA
ATACGATCGATCCCGTCGATCGCAGCAACCAGACGCAGCAGATCGGCAAACGATCCGGTGGTGCCGGTCGTAGTTTTCAC
CACGCCAGGCGTTCACGTTCTGACTCGTAGCAGGTTGACTTCACGCACGCCCTGAGCCGCAAGCTGGTGCATATCTCAAA
CAGAATATCGTCGGACGGACGGCTGACCTCTTCACCACGGGTGTGAAGGCACCACGCAGTAGGTGCAATATTTATTGCAG
CCTTCCATGATGGAGACAAACGCGGTCGGCCCTTCGGCGCGCGGTTCCGGTAGGACGGTCAAACTATCTCGATTTCCGGG
AAGCTGATATCTACAACCGGGCTGCGGTCGCCACGCACGGAGTTGATCATCTGCCGGCAGACGGTGCAGCGTGTTGCGGC
CCAAAAATAATATCGACATCAGTGGGCGCGCTGGCGAATGTGCTCGCCTTCTTGCGATGCCACGCAGCCACCGACGCCGA
TAATCAGGTCTGGATTCTTCTCTTTTAACAGTTTCCAGCGACCTCAACTGATGGAAGACTTTTTCCTGAGCCTTCTCGCG
GGTTGAGCAGGTGATTCAGCAGCAGCACATCCGCTTTCTGTCCGCCTACGTCGGTCAGTTGATAGCCGTGGGTGGCATCC
ACAGATCGGCCATCTTCGAATGAATCGTACTCGTTCATCTGACAGCCCCAGGTTTTAATATGGAGTTTTTTGGGTCATCG
ACTTGCTCTTGCGAAATAGTAGCCAGGAATGCAGGGCGTCATAGTGTAATGCTTTGCTGACCGTTGTGACCAGTATGAGC
GTTATCAGCCCTTAGGGGTAAAAATCCTGTAAACTTAAAGCAGTATTGCTAACAGGATGATTGACCATGACAAATCAACC
AACGGAAATTGCCATTGTCGGCGGAGGAATGGTCGGCGGCGCACTGGCGCTGGGGCTGGCACAGCACGGATTTGCGGTAA
CGGTGATCGGAGCACGCAGAACCAGCGCCGTTTGTCGCTGATAGCCAACGGACGTCGGATCTCGGCGATCAGCGCGGCTT
CGGTATACATTGCTTAAAGGGTTAGGGTCTGGGATGCAGTACAGGCTATGCGTTGCCATCCTTACCGCAGACTGGAAACG
TGGGGAGTGGGAAACGGCGCATGTGGTGTTTGACGCCGCTTGAACTTAAGCTACCGCTGCTTGGCTATATGGTGGAAAAC
ACTGTCCTGCAACAGGCGTTGTGGCAGGCGCTGGAAGCCGCATCCGAAAGTAACGTTATCGTCGTGCCAGGCTCGCTGAT
TGCGCTGCATCGCCATGATGATCTTCAGGAGCTGGAGACTGAAAGGCGGGAAGTGATTTCGCGCGAAGCTGGTGATTGGT
ACCGACGGCGCAAATTCGCAGGTGCGGCAGATGGCGGGAATTGGCGTTCATGACATGGCAGTATGGCGCAGTCGTGCATG
GTTGATTAGCGTCCAGTGCGAGAACGATCCCGGCGACAGCACCTGGCAGCAATTTACTCCGGACGGGACCGCGTGCGTTT
CTGCCGTTGTTTGATAACTGGCATACGCTTAGGGATTGGGTATGTGACTCTGCCCGGCGTCGTGTATGTCGCCAGTTGCA
GAATATGAGTATGCGCACAGCTCCAGAGCGGAAATCGCGAAGCATTTCCCGTCGCGTTCTGGGTTACGTTACACCGCTTG
TCCGCTGGTGCGTTTCCGCTGACGCGTCGCCATGCGTAGCAGTACAGTGCAGCAGGGCTTGCGCTGGTGGGCGATGCCGC
GCATACCATCCATCCGCTGGCGGGGCAGGGAGTGAATCTTGGTTATTCGTGATGTCGATGCCCTGACTTGATGTTCTGGT
CAACGCCCGCAGCTACGGCGAAGCGTGGGCCAGTTATCACTGTCCTGCAAGCGGTACCAGATGCGGCGCATGGCGGATAA
CTTCATTATGCAAAGCGGTATGGATCTGTTTATGCACGGATTCAGCAATAATCTGCCACCACTGCGTTTTATGCGTAATC
TACGGGTTAATGGCGGCGGAGCGTGCTGGCGTGTTGAAACGTCAGGCGCTGAAATATGCGTTAAGGGTTGTAGCCTTACA
ACATTGCCGGGATGACGTGCCTAACCGTAGGTCGGATAAGACGCGGCAGCGTCGCATCCGACATTGAAGGATAAGACGTG
TCAACGATCGCATTCGACATTGAATGAACGCAGAAAAGCAAAAAGGCTCGCCAGAAGCGAGCTTTTTTAATGTGGCTGGG
GTACGAGGATTCGAACCTCGGAATGCCGGAATCAGAATCCCGTGCCTTACCGCTTGGCGATACCCCAACTGGGTGCACTT
AACTAAGGTAAGCGTCTTGACATAAATTGGCTGGGGTAGCGAGGATTCGAACCTCGGAATGCCGGAATCAGAATCCGGTG
CCTTACACGCTTGGCGATACCCCAACAAATTGGTTTTGAATTTGCCGAACATATTCGATACATTCAGAATTTGGTGGCTA
CGACGGGATTCGAACCTGTGACCCCATCATTATGAGTGATGTGCATCTAACCAACTGACGCTATCGTAGCCAGATTGTTT
CTTCGATGGCTGGGGTACCTGGATTCGAACCAGGGAATGCCGGTATCAAAAACCGGGTGCCTTACCGCTTGGCGATACCC
CAATAACCGGGTCGGTGAACCGCTTACTCGAAGAAGATGGCTGGGGTACCTGGATTCGAACAGGGAATGCCCGGTATCAA
AAACCGGTGCCTTACCGCTTGGCGATACCCCATCCGGTACAACGCTTTCGTGGTGAATGGTGCGGAGAGGCGAGACTTGG
AACTCGCACACCTTGCGGGCGCCAGAACCTAAATCTGGTGCGTCTACTCAATTTCGCCACTCCCGCAAAAAAAAGATGTG
TGGCTACGACGGGATTCGAACTGTGCACGCCCACCATTATGAGTGATGTGCTCTAACCAACTGAGCTACGTAGCCATCTT
TTTTTTCGCGATACCTTATCGGCGTTGCGGGGGCGCGATTATGCGTCGTAGAGCCTTAGCAGTCGTCAACCGTCTTTTTC
AAGGAAAATTGCTCGAAAGTGACTGTTTGGTTAGGTTGGAACAGCGTGGCGCTATATTCGTCAATTATTGTTTACTTTGT
GTTTGTTTCCAACCCTACAGCCCATTCTTTTGTCATACAGGATGAAATTCGGAATTTAACAATAGTGGTGGTGAAATTAA
TCTATGAAATACTGGCCTACAGTGGATGAGTTGTCAAACAGTGATGTGGCAAACCCGGAACATTTCCTTACTGCATATCC
AGAATCAACAAGCTACCTCAATAACTGTAAACAGCCCCGGATTTCACCGGGGCTGTTTCGCATTTCTTACTTATACGCCG
ACTGAGTGAACCACCAACCGCGCGACCAGACGGATCGTCCATTTTCTTGAACGCTTTCATCCCATTCGACTCGCTTTAGC
GGTAAGAACAAGCGACGGAAGCGGACGCCCGGCACGCACTCAGCGGCGCTCGGAAGCGGGAATAGTCTTCAAAGATCTCC
CGATACAAGTACGCTTCTTTAGAGGTTGGCGGTGTTGTACGGGAAAGCGGAAGCGCGGCAGTTTCGCAGTTGCTGATCAG
AAACCTTGCTGCGCAGCCACTTCTTTCAGGGTGTCGATCCATACTGTAACCAGACGCCATCGGGAGAACTGCTCTTTCTG
CCGCCAGGCCAGCTTGCAGGCAGATACGCTTCAAAACATTCACGCAGGATGTGTTTTTCCATTTTGCCGTTACCGCACAT
TTTATCCTGTGGGTTAATACGCATCAGCCACATCAAGGAATTTGTTTGTCGAGGAACGGAACGCGTGCTTCCACGCACCG
CAGGCTGACTCGCTTTGTTGGCACGCGCGCAGTCATACATATGCAAGGGCCAGCAGTTTACGCACCGTCCTCCTCATGCA
GTTCTTTGGCATTCGGGGCTTTGTGGAAGTAAAGATAACCGCCGAACACTTTCATCAGCAACCTTCACCGGACAGCACCA
TTTTAATGCCCATCGCCTTTGATCTTACGCGACATTAAATACATCGGTGTTGAAGCGCGAATAGTGGTCACATCATAAGG
TTTCGATAGTGGTAAATCACGTACGCGGATGGCATCCAGACCTTCCTGTACAGTGAAGTGAATTTCGTGATGCACCGTGC
CCAGATGGTTTGCCACTTCCTGGGCTGCTTTCAGATCCGGTGAACCCGGCAGACCTACCAGCAAAGGTAGTGTAACTGCC
GGCCACCAGGGCTTCAGAGCGTTCCTGATCTTGCCACGCGACGGGCGCGTATTTCTTGGTGATAGCGGAAATAATTGAGG
AATCCAGACCACCAGAAAGCAGCACACGCGTAAGGCACACAGACATCAGATGGCCTTTTAACTGAATCTTCCAGTGCCGA
AGCACTCGTTTTTGTCAGGTCACGTTATCTTTCACCGCTATCGTAGTCGAACCAAGTCGGCGATGATAGTAAGAACGGAT
TTCGCCGTCCTGCGCTCCACAAATAGCTCCCCGCCGGGAACTCTTTAATCGTGCGGCAAACTGGCACCAGCGCTTTCATT
TCTGAGGCCACATACAGCTGACCGTGTTCGTCATACCCCATACACAGTGGGATGATCCCCAGATGCGTCGCGACCAATCA
GGTAGGCATCTTTTTCGCTCGTCGTACAGTGCAAAGGCAAACATGCCCTGCAAGTCGTCGAGAAATTCCGGCCCTTTCTT
CCTGATACAGCGCGAGGATCACTTCACAGTCAGACCCGGTCTGGAACTGGTAACGAATCGCCATATTCGGCGCGCGAATG
CCTGGTGGTTGTAGATTTCACCGTTTACCTGCCAGTACGTGGGTTTTTTGTTAGGTTGTATGAGAGGTTGCGCCCCCGCG
TTAACGTCAACAATTGACAACCGTTCGTGGGCGAGAATGGCGTTATCGCTGGCATAAATACCGTGACCAGTCCGGGCCAC
GATGACGCATGCAGGCGTGACAGCTCGAGGGGCTTTCTTACGCAGCTAAGTGCGTCTGTTTTGATATCAGAATAGCGCCA
AAAATTGAACACATAACCTTCTCCGTTAACCTGGTATTTGTTGCTTGTTGTGTTTGCTTGTTTAAAAAAATGCCGCAAAG
CAGCACTGTGCGCAGTCCGATTTGGATGGGTGAAAAAATAAAGAAAAAGTAATTGGATAGACTCTTGTGGATTTGGTGCA
TAAAAAGGTCTGGTGTGAGGATATATTTATTGATTGAATCGATAATTTTTAGCGGGTTTTATTGAATGTTATATTTTACT
TGGGGGCCAAATTTGCTGACAAAGTGCGAGTTTGTTCATGCCGGAATGCGGCGTGAACGCCTTATCCGGCCACAAAAGGC
ATGAAAATTCAATATATTAGCAGGAGCTGCGTAGGCCGTGATAAGCGAGCGCCATCAGGCAGTTTGGCGTTTAGTCATCA
GAGCCAACCACGTCCGCAGACGTGGTTGCTATTCGAAACGTCGATTTCAGCGACTGACCGGGTAAATCCAGCTGGGGCGA
AAAGGCATACCTGTCGATATCGTCGAGCGACGAAACACCAGAATGCACCAGAATCGTCTCCAGACCTGCCTGGAAGCCGG
CCAGAATACGGTACGCAGGTTATCGCCGACAATCACCGTTTCTATCCGAATGCGCCTGCATTATGGTTTAATGCTGCGCG
GATGATCCACGGGCTGGGCTTACAAACATAAGAACGGTTCTGCGCCCGGAAGATTTTCTCAATCCCTGCACAACAACGCG
CGCACAAGCGGGATAAAAACCGCGCGCCGTGGGTGATCCGGATTGGTGGCGATAAAACGTGCACCGTTAGCGACGAAATA
GGCTGCTTTATGCAATCATGTCCCAGTTGTAGGAACCGCGTTTCGCCCAACAATCACGAAAATCAAGGGTTCACATCGGT
AATAGTGAAACCGAGCTTTGTACAGTTCATGAATCAGTGCGCCTTCGCCCACCACATACGCTTTTCTTGCCTTCCTGGCG
ACTGGAGGAATCGTGCAGTCGCCATCGGCAGAGGTATAAAACACGCTGTGCAGGTACATCGACACCTGCGGTGGCAAAGC
GGTTCGCCACGATCTTGCCCAGTCTGCGAAGGATAGTTGGTCTAGCGAACAGCAGCGGCAGGCCTTTATCCATAATCCC

Best, German

gt1, Aug 02 '17 07:08

It is not recommended to generate CIGAR for read overlapping because 1) generating CIGAR for every overlap is very slow; 2) it is usually not necessary to have CIGAR for read overlapping. SAM is also the wrong format for read overlapping. No read overlapper outputs SAM.

In your example, minimap/minimap2 ignores anchors that have the same position on query and target when the read name is the same, so you shouldn't see the perfect alignment. However, with CIGAR on, alignment extension may still produce a nearly perfect alignment from different seeds. This is a problem with minimap2, which I will try to fix at some point. Thanks.
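
The anchor rule described above can be illustrated with a small Python sketch. This is not minimap2's actual code: the `Anchor` tuple, the function name and the example positions are all hypothetical, chosen only to show that anchors on the main diagonal of a self comparison are dropped while anchors at different positions (for example from repeated sequence within the read) survive and can still seed an alignment.

```python
from collections import namedtuple

# Hypothetical anchor record: a seed match between a query and a target position.
Anchor = namedtuple("Anchor", ["qname", "qpos", "tname", "tpos"])


def drop_identity_anchors(anchors):
    """Discard anchors where a read matches itself at the same position."""
    return [
        a for a in anchors
        if not (a.qname == a.tname and a.qpos == a.tpos)
    ]


# Made-up example positions, for illustration only.
read = "L0/46/0_12879"
anchors = [
    Anchor(read, 100, read, 100),    # on the diagonal: dropped
    Anchor(read, 9000, read, 9116),  # off the diagonal: kept
    Anchor(read, 9116, read, 9000),  # mirror of the previous one: kept
]

kept = drop_identity_anchors(anchors)
print(len(kept))
```

In this toy setup the surviving off-diagonal anchors come in mirrored pairs, which is at least consistent with the two mirrored records in the report, though the thread does not confirm that as the cause.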

lh3, Aug 02 '17 13:08