swarm
Adapt to AVX2
Adapt SWARM to AVX2 and the 256-bit registers available in the new Intel Haswell CPUs that became available in June 2013. This should allow 32-way SIMD parallelisation.
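For reference, the 32-way figure follows from dividing the register width by the width of one alignment score. The snippet below is only a back-of-the-envelope sketch: `simd_lanes` is a hypothetical helper, not part of Swarm, and the 8-bit and 16-bit score widths are assumptions made for illustration.

```python
def simd_lanes(register_bits: int, score_bits: int) -> int:
    """Return how many scores of `score_bits` bits fit in one SIMD register."""
    if register_bits % score_bits != 0:
        raise ValueError("score width must divide the register width")
    return register_bits // score_bits


# Register widths in bits: SSE2 registers are 128-bit, AVX2 registers are 256-bit.
# The score widths (8 and 16 bits) are assumed here for illustration.
for name, register_bits in [("SSE2", 128), ("AVX2", 256)]:
    for score_bits in (8, 16):
        lanes = simd_lanes(register_bits, score_bits)
        print(f"{name}: {lanes} lanes of {score_bits}-bit scores")
```

With 8-bit scores, a 256-bit register holds 32 values, twice as many as a 128-bit register. Any real speed-up would depend on how much of the total run time is spent computing alignments.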
Considering the relatively low number of computers in use with AVX2 and the rather limited time now spent on computing alignments, the performance gain from AVX2 would currently be rather limited.
Or use https://github.com/RonnySoak/libssa
Parasail implements SIMD parallelisation, including using AVX2. https://github.com/jeffdaily/parasail/
It's also very fast: https://github.com/jeffdaily/parasail/blob/2967c065de02fc2dc7e44050b4fc6ecfa064b2dc/performance.md
Thanks @colinbrislawn, I didn't know about parasail.
Parasail looks interesting and flexible. A master's student of mine has been working on another solution, which I think is generally faster (https://github.com/RonnySoak/libssa). His master's thesis will soon be available. We hope to integrate his code in Swarm and VSEARCH.
Edlib is a very fast Levenshtein distance library, suitable for long sequences (> 1 Mb).
What are we going to do about this issue?
Considering the relatively low number of computers in use with AVX2 and the rather limited time now spent on computing alignments, the performance gain from AVX2 would currently be rather limited.
Now that AVX2-capable CPUs are more common, do you think it would be interesting to implement this? It is difficult for me to assess the amount of work necessary, but I imagine that work could be reused to speed up vsearch's search function.
Recent paper on pairwise alignment. This paper introduces Scrooge and compares it to Edlib and to GenASM.
My first thought was, 'How does this compare to the wavefront alignment algorithm (WFA2-lib)?' and they are already discussing that in this issue!
🌊 🚀🧬
Indeed, Wavefront is a very interesting alternative. There is even a GPU version now: WFA-GPU (Aguado-Puig et al. 2023).