cobra
Question regarding overlap-based assemblers
Thank you for the great tool! I am using MEGAHIT+VIBRANT+COBRA and get about 35 high-quality (HQ) phages from each of my metagenomes after validation by CheckV. I just tested the new overlap-based assembler PenguiN (https://github.com/soedinglab/plass), and for the same samples PenguiN+VIBRANT gives about 180 HQ phages, so about 5 times more.
I am wondering how you would compare the two tools, considering that both rely on overlap-based assembly. As I understand it, the large difference arises because COBRA often stops when an extension is conflicted, whereas PenguiN uses a Bayesian rule to pick the best extension among the several possibilities. I do not understand how such a liberal approach can avoid misassemblies, though.
Do you have an opinion on that?
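To make the contrast concrete, here is a toy sketch of the two policies as I understand them. It is not the actual logic of COBRA or PenguiN: the `Candidate` type, its `score` field, and both function names are hypothetical, and the score only stands in for whatever evidence a real tool would weigh.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A possible extension of a contig end (hypothetical toy type)."""
    name: str
    score: float  # assumed stand-in for the support behind this extension


def extend_conservative(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Extend only when the choice is unambiguous; stop on any conflict."""
    if len(candidates) == 1:
        return candidates[0]
    return None  # zero candidates, or several competing ones


def extend_best_scoring(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Always commit to the highest-scoring candidate, even when several compete."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.score)


# Two competing extensions with nearly equal support.
conflict = [Candidate("contig_A", 0.55), Candidate("contig_B", 0.45)]

print(extend_conservative(conflict))   # None: the extension stops here
print(extend_best_scoring(conflict))   # the contig_A candidate is chosen
```

With nearly tied scores the second policy still commits to `contig_A`, which is exactly the situation where I would expect a risk of misassembly.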