GenomeWorks
[cudamapper] Filtering for self mappings, identical overlaps, and highly-similar overlaps
This PR brings in filtering strategies that mimic those of minimap2.
- [x] Self mappings: minimap2 does not output self mappings (where an overlap between a read and itself covers more than some very high fraction of that read's length). This can be disabled with a command line flag (`-S`).
- [x] Identical overlaps: remove identical overlaps within some number of indices.
- [x] Highly-similar overlaps: remove overlaps that have a percent similarity (defined as the reciprocal overlap) higher than some percentage threshold.
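
The three filters above can be sketched roughly as follows. This is a minimal host-side illustration, not the cudamapper implementation: the `Overlap` and `FilterParams` types, all field names, the default thresholds, and the `keep_self_mappings` switch are hypothetical. It also assumes that "reciprocal overlap" means the intersection length divided by the length of the longer interval, and that two overlaps are only compared when they share the same query and target.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified overlap record (not the cudamapper Overlap type).
struct Overlap
{
    std::int32_t query_id;
    std::int32_t target_id;
    std::int32_t query_start;
    std::int32_t query_end;
    std::int32_t target_start;
    std::int32_t target_end;
    std::int32_t query_length; // full length of the query read
};

// Hypothetical filter settings; the defaults are placeholders, not values from the PR.
struct FilterParams
{
    bool keep_self_mappings     = false; // true disables the self-mapping filter
    double self_fraction        = 0.9;   // "very high fraction" of the read length
    std::size_t window          = 3;     // how many following entries to compare against
    double similarity_threshold = 0.9;   // reciprocal-overlap cutoff
};

// Assumed definition: intersection length divided by the longer interval's length.
double reciprocal_overlap(std::int32_t s1, std::int32_t e1, std::int32_t s2, std::int32_t e2)
{
    const std::int32_t intersection = std::min(e1, e2) - std::max(s1, s2);
    if (intersection <= 0)
    {
        return 0.0;
    }
    const std::int32_t longer = std::max(e1 - s1, e2 - s2);
    return static_cast<double>(intersection) / static_cast<double>(longer);
}

// A read overlapping itself over more than `fraction` of its own length.
bool is_self_mapping(const Overlap& o, double fraction)
{
    return o.query_id == o.target_id &&
           (o.query_end - o.query_start) > fraction * o.query_length;
}

bool same_pair(const Overlap& a, const Overlap& b)
{
    return a.query_id == b.query_id && a.target_id == b.target_id;
}

bool is_identical(const Overlap& a, const Overlap& b)
{
    return same_pair(a, b) &&
           a.query_start == b.query_start && a.query_end == b.query_end &&
           a.target_start == b.target_start && a.target_end == b.target_end;
}

// Highly similar: both the query and the target intervals exceed the cutoff.
bool is_highly_similar(const Overlap& a, const Overlap& b, double threshold)
{
    return same_pair(a, b) &&
           reciprocal_overlap(a.query_start, a.query_end, b.query_start, b.query_end) > threshold &&
           reciprocal_overlap(a.target_start, a.target_end, b.target_start, b.target_end) > threshold;
}

std::vector<Overlap> filter_overlaps(const std::vector<Overlap>& in, const FilterParams& p)
{
    std::vector<bool> drop(in.size(), false);

    // Pass 1: self mappings.
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (!p.keep_self_mappings && is_self_mapping(in[i], p.self_fraction))
        {
            drop[i] = true;
        }
    }

    // Pass 2: identical and highly-similar overlaps within a window of indices.
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (drop[i])
        {
            continue;
        }
        const std::size_t last = std::min(in.size(), i + 1 + p.window);
        for (std::size_t j = i + 1; j < last; ++j)
        {
            if (!drop[j] && (is_identical(in[i], in[j]) ||
                             is_highly_similar(in[i], in[j], p.similarity_threshold)))
            {
                drop[j] = true; // keep the earlier overlap, drop the later one
            }
        }
    }

    std::vector<Overlap> kept;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (!drop[i])
        {
            kept.push_back(in[i]);
        }
    }
    return kept;
}
```

Because each overlap is only compared with the next `window` entries, the sketch relies on the input being ordered so that near-duplicates end up close together, for example sorted by query, target, and start position.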