Roary for metagenomes
I've read in the documentation that "Roary is not intended for meta-genomics or for comparing extremely diverse sets of genomes". Can you please explain why? What are the drawbacks of using it on genes called from metagenomic data?
Roary is fast because it expects lots of very similar proteins and uses cd-hit to collapse them before doing anything expensive. After that it falls back to an all-vs-all blastp on whatever is left. Metagenomes have lots of diverse genes, so cd-hit removes very few of them. Say you have N left: the blastp step then scales as roughly N x N, and for a metagenome-sized N it will effectively never finish. Consider other tools like proteinortho or MMseqs2, or use cd-hit directly.
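To put rough numbers on the N x N point, here is a small sketch. The protein counts are made up purely for illustration (they are not measurements from Roary), and the function name is my own. It only counts query-subject pairs and ignores any heuristics blastp uses to skip work.

```python
def all_vs_all_comparisons(n_proteins: int) -> int:
    """Number of query-subject pairs in an all-vs-all search (N x N)."""
    return n_proteins * n_proteins


# Hypothetical sizes, chosen only to show the scaling.
isolate_set = 10_000        # proteins left after pre-clustering similar isolates
metagenome_set = 1_000_000  # diverse proteins from a metagenome

ratio = all_vs_all_comparisons(metagenome_set) / all_vs_all_comparisons(isolate_set)
print(f"{metagenome_set // isolate_set}x more proteins -> {ratio:.0f}x more comparisons")
```

So under these assumed sizes, 100 times more proteins means about 10,000 times more pairwise comparisons, which is why the pre-clustering step matters so much.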
Thanks Torsten! So is it just a matter of run time? I'm actually planning to run Roary on reconstructed bins of the same species (does pangenome analysis across different species even make sense?), which I assume will have a similar number of genes to an isolate genome.