minimap2
An option to output SEQ field for secondary alignment
Hi!
I have been using minimap2 as a part of the Flye pipeline. I am using secondary alignments during consensus/polishing to account for possible duplications in disjointigs and improve the base quality of long repeats. Storing SEQ field for secondary alignments in SAM/BAM files makes the alignment file parsing much easier, since all separate alignments could be processed independently.
The existing -Y option solves the problem, but it also forces supplementary alignments to use soft clipping, which in some datasets dramatically increases the alignment file size. I therefore added a new option, --secondary-seq, that enables output of SEQ for secondary alignments and uses hard clipping for both supplementary and secondary alignments.
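To illustrate why a per-record SEQ simplifies parsing, here is a minimal Python sketch of a downstream check. It relies only on the SAM format itself: FLAG bit 0x100 marks a secondary alignment, 0x800 marks a supplementary one, SEQ is the tenth column with `*` meaning "not stored", and hard-clipped bases (CIGAR `H`) are absent from SEQ while soft-clipped ones (`S`) are present. The function names and the example records are hypothetical and are not part of minimap2 or Flye.

```python
import re

SECONDARY = 0x100       # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800   # SAM FLAG bit: supplementary alignment

# CIGAR operations that consume query bases AND appear in SEQ.
# Hard clips (H) are deliberately excluded: those bases are not stored.
_SEQ_OPS = set("MIS=X")
_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def seq_length_from_cigar(cigar):
    """Number of bases SEQ should contain for this CIGAR string."""
    return sum(int(n) for n, op in _CIGAR_RE.findall(cigar) if op in _SEQ_OPS)


def parse_sam_record(line):
    """Split one SAM alignment line into the fields used below."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    return {
        "qname": fields[0],
        "flag": flag,
        "cigar": fields[5],
        "seq": fields[9],
        "is_secondary": bool(flag & SECONDARY),
        "is_supplementary": bool(flag & SUPPLEMENTARY),
    }


def is_self_contained(record):
    """True if the record carries its own SEQ, consistent with its CIGAR."""
    if record["seq"] == "*":
        return False
    return len(record["seq"]) == seq_length_from_cigar(record["cigar"])


# Made-up records for illustration only (not real minimap2 output).
primary = parse_sam_record("read1\t0\tctg1\t100\t60\t2S6M\t*\t0\t0\tACGTACGT\t*")
secondary_with_seq = parse_sam_record("read1\t256\tctg2\t500\t0\t2H6M\t*\t0\t0\tGTACGT\t*")
secondary_without_seq = parse_sam_record("read1\t256\tctg3\t900\t0\t2S6M\t*\t0\t0\t*\t*")
```

A secondary record written with hard clipping and SEQ passes `is_self_contained` and can be processed without looking up the primary record of the same read; a secondary record with `*` in SEQ cannot.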
The option has been extensively tested as part of the Flye pipeline for almost a year. However, I have not tested it outside of a genome assembly setting.
Best, Mikhail
It would be really great if this patch could be applied to enable building flye smoothly.
Hi all,
I will be happy to bring this pull request up to date with master once you confirm that you are interested in adopting it. The conflict is due to new command-line options, so it is an easy fix.
@fenderglass, I don't know how the pull request process works, but I just posted the same request here. It would be great if sequences for secondary mappings could be turned on. Thanks!
I just saw this PR. I made a program that tries to add SEQ to secondary alignments as a post-processing step, though this PR would be interesting too! https://github.com/cmdcolin/secondary_rewriter