abPOA
Homopolymer indels not consistently aligned
Hi, I am trying to get a reasonable alignment in a region which has some tandem repeats, flanked by non-repetitive sequence. I can get good (enough) results in the tandem region using these parameters:
abpoa \
-n 10 \
--progressive \
--amb-strand \
-b 1000 \
-r 1 \
However, in the (mostly non-repetitive) flanking region there is a long homopolymer, where I get this result:
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGTCTGGGCAACATAGTGAGACATTGTCTCTAC------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGTCTGGGCAACATAGTGAGACATTGTCTCTAC------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTACA-------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTACA-------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTACA-------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------AAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC--------------AAAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------------AAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC---------------AAAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------AAAAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCAGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCAGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCAGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCAGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTA-------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AAAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------ACAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------ACAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------ACAAAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCT--------------------AC-AAAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC------------------------AAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC----------------------AAAAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
TTCAAGACCAGCCTGGGCAACATAGTGAGACATTGTCTCTAC------------------------AAAAAAAAAAAAAAAAAAAACACAAAATTAGTCGGGTGTGGTGGTGCC
It seems to arbitrarily assign different paths to the same AC prefix. Do you think this can be resolved with parameter choices, or is this an unavoidable aspect of POA?
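In case it helps to quantify the inconsistency, here is a minimal Python sketch that groups MSA rows by their ungapped sequence and reports any sequence that received more than one gap placement. The function name `inconsistent_placements` and the shortened example rows are hypothetical, made up for illustration; they are not abPOA output and not part of abPOA itself.

```python
from collections import defaultdict


def inconsistent_placements(rows):
    """Group gapped MSA rows by their ungapped sequence.

    Returns a dict mapping each ungapped sequence to the sorted list of
    distinct gapped rows it was given, keeping only sequences that were
    aligned in more than one way.
    """
    groups = defaultdict(set)
    for row in rows:
        # Removing the gap characters recovers the original read sequence.
        groups[row.replace("-", "")].add(row)
    return {seq: sorted(gapped) for seq, gapped in groups.items() if len(gapped) > 1}


# Hypothetical, shortened rows modelled on the pattern in the alignment above:
# rows 1 and 2 hold the same sequence, but the "AC" sits on either side of the gap.
rows = [
    "TCTCTAC----AAAAAACAC",
    "TCTCT----ACAAAAAACAC",
    "TCTCTAC----AAAAAACAC",
    "TCTCTA-----AAAAAACAC",
]

for seq, placements in inconsistent_placements(rows).items():
    print(f"{seq}: {len(placements)} distinct gap placements")
    for placement in placements:
        print("   ", placement)
```

Running this on the real MSA rows (one gapped sequence per list element) would show whether identical reads are actually being threaded through different paths, or whether the rows differ in sequence as well.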
Thanks