Memory usage
Hi, spoa is a fantastic tool, but the memory usage is a bit high. Is there any way to adjust the parameters to make it smaller?
Hi Ilia, unfortunately the memory complexity is quadratic (O(graph_length * sequence_length)). We might add banded alignment, which should reduce memory consumption. If your input is huge, you could try slicing it into chunks.
Best regards, Robert
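To get a rough feel for the numbers, here is a back-of-the-envelope sketch of the O(graph_length * sequence_length) estimate and of what a band would save. The function names, the `band` parameter and the 4 bytes per cell are assumptions for illustration only; this is not spoa's actual memory layout, and a real aligner may keep several matrices (for example for affine gaps).

```python
def dp_cells(graph_len, seq_len, band=None):
    """Number of DP cells for a full or (hypothetically) banded matrix."""
    rows, cols = graph_len + 1, seq_len + 1
    if band is None:
        return rows * cols                   # full matrix: quadratic
    return rows * min(cols, 2 * band + 1)    # band: at most 2*band+1 cells per row


def dp_bytes(graph_len, seq_len, band=None, bytes_per_cell=4):
    """Approximate memory for one score matrix; bytes_per_cell is an assumption."""
    return dp_cells(graph_len, seq_len, band) * bytes_per_cell


full = dp_bytes(10_000, 10_000)
banded = dp_bytes(10_000, 10_000, band=100)
print(f"full:   {full / 1e6:.1f} MB")
print(f"banded: {banded / 1e6:.1f} MB")
```

The same arithmetic shows why chunking helps: halving both the graph length and the sequence length cuts the cell count of a single matrix to roughly a quarter.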
Thanks, banded alignment would be very cool to have.
It should be possible to use the dozeu x-drop aligner to do this. That would resolve the quadratic memory issue.
Alignment would have to be run in phases, because x-drop extension has to start where there is a solid match with the target graph. The cycle would be to scan for the first hit, align until the x-drop criterion terminates the extension, then scan again for the next hit.
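The scan / extend / scan cycle could look roughly like the sketch below. It is heavily simplified and rests on stated assumptions: it works on a plain string target instead of a graph, the extension is ungapped, seeding is a naive exact k-mer lookup, and it does not use dozeu's API at all. All names and scoring values (`find_seed`, `xdrop_extend`, `phased_align`, `match`, `mismatch`, `xdrop`) are hypothetical.

```python
def find_seed(query, target, q_start, k):
    """Return (qi, ti) for the first exact k-mer hit with qi >= q_start, else None."""
    for qi in range(q_start, len(query) - k + 1):
        ti = target.find(query[qi:qi + k])
        if ti != -1:
            return qi, ti
    return None


def xdrop_extend(query, target, qi, ti, match=1, mismatch=-2, xdrop=4):
    """Ungapped extension to the right; stop once the score is more than xdrop below the best."""
    score = best = 0
    best_len = steps = 0
    while qi + steps < len(query) and ti + steps < len(target):
        score += match if query[qi + steps] == target[ti + steps] else mismatch
        steps += 1
        if score > best:
            best, best_len = score, steps
        elif best - score > xdrop:
            break                      # x-drop criterion: give up on this extension
    return best_len, best


def phased_align(query, target, k=4, xdrop=4):
    """Repeat: scan for a seed, extend until x-drop stops, scan again after the segment."""
    segments = []
    q_pos = 0
    while True:
        seed = find_seed(query, target, q_pos, k)
        if seed is None:
            break
        qi, ti = seed
        length, score = xdrop_extend(query, target, qi, ti, xdrop=xdrop)
        segments.append((qi, ti, length, score))
        q_pos = qi + length            # a seed guarantees length >= k, so this advances
    return segments


query = "ACGTACGT" + "TTTTTTTT" + "GGCCGGCC"
target = "ACGTACGT" + "AAAA" + "GGCCGGCC"
print(phased_align(query, target))
```

Only the cells touched by each extension are ever evaluated, which is where the memory saving would come from; the unaligned stretch between the two segments is simply skipped in this sketch.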
@ekg, does this approach guarantee optimal alignments?
@rvaser no, I don't think we can guarantee optimality without evaluating the full matrix, and this will only evaluate a subset that falls within the limits of the x-drop and scoring parameters. Furthermore, deciding where to start the process is heuristic, and would have to be based on some kind of seeding.
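A small illustration of why restricting the matrix gives up the optimality guarantee. As an assumption for simplicity, this uses a fixed band and unit-cost edit distance on two plain strings rather than x-drop on a graph, but the effect is the same: if the best path leaves the evaluated region, the reported result is worse than the true optimum. The function name and `band` parameter are hypothetical.

```python
import math


def edit_distance(a, b, band=None):
    """Unit-cost global edit distance; with band set, only cells with |i - j| <= band are filled."""
    n, m = len(a), len(b)
    inf = math.inf
    # Row 0: j insertions, but only inside the band.
    prev = [j if (band is None or j <= band) else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        lo = 0 if band is None else max(0, i - band)
        hi = m if band is None else min(m, i + band)
        cur = [inf] * (m + 1)          # cells outside the band stay unreachable
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i             # i deletions
                continue
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != b[j - 1]),  # match / substitution
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
            )
        prev = cur
    return prev[m]


a = "TACGTGCA"   # "T" + core
b = "ACGTGCAT"   # core + "T"
print(edit_distance(a, b))           # full matrix
print(edit_distance(a, b, band=1))   # band wide enough to contain the optimal path
print(edit_distance(a, b, band=0))   # band too narrow: forced onto the main diagonal
```

Here the full matrix and `band=1` both find the two-edit solution (drop the leading `T`, add the trailing `T`), while `band=0` can only substitute along the diagonal and reports 8.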
Thanks for the clarification, I'll think about whether the scope of changes is worth exploring.