BandageNG
Minimap2 support
It's really great that Bandage gets a new life - thank you for taking that up.
As a possible enhancement, I suggest minimap2 support. Currently you can map the assembly using BLAST and display those super useful rainbow plots, but BLAST is getting a bit tired these days.
If you add this support, it would be useful to allow both mapping on the fly (like currently for BLAST), as well as the use of a preexisting PAF file written by minimap2.
Hello
Yes, we thought about this. Fortunately, there is a workaround already! Convert PAF into BED and load BED.
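For anyone trying the workaround, here is a minimal sketch of the PAF-to-BED step in Python. It assumes a plain BED6 layout with the target (graph node) name in the first column and the query name in the fourth; which BED columns BandageNG actually uses is not stated in this thread, so treat the column mapping as an assumption. The function names are hypothetical.

```python
def paf_line_to_bed(line: str) -> str:
    """Turn one PAF record into a BED6 line describing the target interval."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("a PAF record has at least 12 tab-separated columns")
    qname, strand, tname = fields[0], fields[4], fields[5]
    tstart, tend, mapq = int(fields[7]), int(fields[8]), int(fields[11])
    # PAF target coordinates are 0-based and half-open, like BED,
    # so they can be copied without adjustment.
    # Assumption: mapping quality is reused as the BED score column.
    return "\t".join([tname, str(tstart), str(tend), qname, str(mapq), strand])


def paf_to_bed(paf_lines):
    """Convert an iterable of PAF lines, skipping blank ones."""
    return [paf_line_to_bed(line) for line in paf_lines if line.strip()]
```

Writing the returned lines to a file gives something that can be loaded as BED. Note that the query coordinates are dropped in this layout.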
Can you do rainbow plots from BED? If so, that is a reasonable temporary solution until you can implement native minimap2 support.
Just adding my vote:
Considering that you need to do some significant logistics to make this work, I would also love to have a fully integrated mapper that works for long sequences.
When using Bandage as a browser, it's convenient to map contigs or "Regions of Interest" on the fly, and converting GFA to FASTA, switching to the command line, then doing PAF->BED for each query is a bit cumbersome.
@paoloczi The rainbow scheme does not make much sense for BED as there is no "query" there, yes.
@rlorigro Is BLAST not sensitive enough, or what are the problems that could be better solved using minimap2 in your case?
In my experience, BLAST tends to be very slow, and it often breaks up mappings in undesirable ways. minimap2 offers much better control.
Where it fails is particularly in longer mappings/alignments. Sometimes it will just never finish running. Long mappings are useful if you want to map a nanopore read to an assembly, for example, or if you want to find a repetitive region of interest using a reference excerpt, which needs to be long to map accurately or to span duplicates of a gene.
Here is the teaser. While the UI shows BLAST, it's only the UI. In reality it is minimap2 under the hood, aligning C4 sequences to the C4A/C4B graph from https://zenodo.org/record/6617246

The second one is a bit more extreme as we're aligning the whole MHC. I would not try this with BLAST :
Still, I would suggest specialized graph alignment tools like GraphAligner, PathRacer or SPAligner for sequence-to-graph or HMM-to-graph alignment :)
out of curiosity, did you use minimap as an API or by calling it as an executable and using IO?
The binary is executed and the final PAF is parsed. Essentially, the existing BLAST support was heavily refactored and generalized to allow different ways of obtaining "query hits", regardless of how they are obtained :)
One needs to know, though, that the hit-combining and path-building approaches in Bandage are essentially brute-force :(
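To illustrate the "execute the binary, parse the PAF" approach described above, here is a rough Python sketch. It is not BandageNG's code (which is C++); the `Hit` type and the function names are hypothetical. The only minimap2 behaviour relied on is that `minimap2 <target.fa> <query.fa>` writes PAF to standard output.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class Hit:
    """One query-to-node hit, reduced to the fields needed for display."""
    query: str
    q_start: int
    q_end: int
    strand: str
    node: str
    n_start: int
    n_end: int
    mapq: int


def parse_paf(text: str) -> list[Hit]:
    """Parse PAF text (12 mandatory columns, optional tags ignored)."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        f = line.split("\t")
        hits.append(Hit(f[0], int(f[2]), int(f[3]), f[4],
                        f[5], int(f[7]), int(f[8]), int(f[11])))
    return hits


def run_minimap2(nodes_fasta: str, query_fasta: str,
                 binary: str = "minimap2") -> list[Hit]:
    """Run minimap2 as an external process and parse its PAF output."""
    proc = subprocess.run([binary, nodes_fasta, query_fasta],
                          capture_output=True, text=True, check=True)
    return parse_paf(proc.stdout)
```

Keeping the parser separate from the process call means the same `Hit` records can come from an on-the-fly run or from a preexisting PAF file.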
For minimap, isn't that more trivial, because you have a "primary" alignment and you can just follow all the supplementaries in order of their query coordinates?
Well, not quite :) Overall the problem is similar to the one we have in graph alignment: we have a set of "seed alignments" to the nodes of the graph and we need to chain them properly. Note that some seeds could be missed and some could be misplaced (think about alignment through a repetitive region of the graph, where multiple hits of the same query region are possible, etc.).
So, if a proper path is required, other tools have to be used: the ones that do proper graph alignment. After all, we do support GAF loading these days, as well as other formats.
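To make the chaining problem concrete, here is a toy Python sketch. It is a simple dynamic-programming chainer, not the brute-force approach Bandage uses and not what a real graph aligner does. It assumes forward-strand hits only, requires consecutive hits to sit on nodes joined by a graph edge, and does not tolerate missed seeds. The hit tuples and the `edges` set are hypothetical inputs.

```python
def chain_hits(hits, edges):
    """Pick the chain of hits covering the most query bases.

    hits  : list of (q_start, q_end, node) tuples
    edges : set of (from_node, to_node) pairs present in the graph
    """
    order = sorted(range(len(hits)), key=lambda i: (hits[i][0], hits[i][1]))
    best = {}  # hit index -> best chain score ending at that hit
    prev = {}  # hit index -> previous hit index in that chain
    for pos, j in enumerate(order):
        q_start, q_end, node = hits[j]
        best[j] = q_end - q_start
        prev[j] = None
        for i in order[:pos]:
            _, p_end, p_node = hits[i]
            # A predecessor must end before this hit starts on the query
            # and must be connected to this hit's node in the graph.
            if p_end <= q_start and (p_node, node) in edges:
                score = best[i] + (q_end - q_start)
                if score > best[j]:
                    best[j] = score
                    prev[j] = i
    if not best:
        return []
    end = max(best, key=best.get)
    chain = []
    while end is not None:
        chain.append(hits[end])
        end = prev[end]
    return chain[::-1]
```

A misplaced hit (for example one on a repeat copy with no edge from the previous node) simply never joins the chain here. A missed seed, however, breaks the chain, which is one reason dedicated graph aligners are the better choice when a proper path is required.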
Hmm ok, well I agree that chaining should not really be in the scope of bandage
It won't matter for assembly graphs, but aligning to the paths will be much more sensitive in the case of variation graphs. The alignments can be made against the path sequences of the graph and then injected into the node space.
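As a rough illustration of "injecting" a path alignment into node space, the Python sketch below projects an interval on a path sequence onto the nodes the path walks through. It assumes forward-oriented nodes and no overlaps between consecutive nodes (typical for variation graphs; assembly graphs with overlaps would need extra handling). The function name and inputs are hypothetical.

```python
def inject(path, start, end):
    """Project the path interval [start, end) onto node coordinates.

    path : list of (node_name, node_length) in path order
    Returns a list of (node_name, node_start, node_end) pieces.
    """
    pieces = []
    offset = 0  # path coordinate at which the current node begins
    for node, length in path:
        lo = max(start, offset)
        hi = min(end, offset + length)
        if lo < hi:
            # Shift from path coordinates to coordinates within the node.
            pieces.append((node, lo - offset, hi - offset))
        offset += length
    return pieces
```

For example, with a path of nodes of length 100, 50 and 200, an alignment covering path positions 80 to 170 touches the end of the first node, all of the second and the start of the third.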
@ekg Actually it does matter if we need to span a complex repeat :) E.g. in PathRacer we have an option of seeding from paths, not nodes, and it makes a huge difference in complex repetitive regions.
I also thought about adding alignment to the paths. But I'd likely rewrite chaining into something a bit more efficient (currently there is a limit of 6 nodes in the "query path" to limit the combinatorial explosion of the brute-force approach :) )
@ekg FWIW https://github.com/asl/BandageNG/pull/114 implements alignment to the paths, query paths are automatically built out of the,