Roary
Missing capability of "synteny" + "splitting paralogs" ?
We recently had a bioinformatics group meeting, and it became apparent that Roary may be missing some functionality with respect to its use of synteny and its treatment of paralogs.
The case in question is S. pyogenes, which has 100+ gene duplications and is also recombinant and rearranged. By default Roary produces too many clusters, possibly because of "synteny enforcement". If you use -s it removes synteny (?) but also forces paralogs into a single cluster.
I think what was wanted was a way to keep paralogs separate while still using synteny?
Does this make sense at all?
P.S. One way I thought of was to pre-process the GFF files so that each CDS is put into its own contig, thereby removing any synteny (the suspected cause of the too many clusters) without forcing the use of -s.
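To make the pre-processing idea concrete, here is a minimal sketch in Python. It assumes Prokka-style GFF3 input with the contig sequences embedded after a ##FASTA line; the function name split_cds_into_contigs and the "_cdsN" contig naming are made up for illustration, and non-CDS features are simply dropped. Whether Roary then clusters the genes as hoped is untested.

```python
def split_cds_into_contigs(gff_text):
    """Return GFF3 text in which every CDS sits alone on its own contig.

    Assumes Prokka-style GFF3 with sequences embedded after '##FASTA'.
    """
    annotation, _, fasta = gff_text.partition("##FASTA")

    # Parse the embedded FASTA into {contig_id: sequence}.
    contigs = {}
    current = None
    for line in fasta.splitlines():
        line = line.strip()
        if line.startswith(">"):
            current = line[1:].split()[0]
            contigs[current] = []
        elif line and current is not None:
            contigs[current].append(line)
    contigs = {name: "".join(parts) for name, parts in contigs.items()}

    regions, features, sequences = [], [], []
    for line in annotation.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # non-CDS features are dropped in this sketch
        seqid, start, end = cols[0], int(cols[3]), int(cols[4])
        new_id = f"{seqid}_cds{len(features) + 1}"  # hypothetical naming scheme
        subseq = contigs[seqid][start - 1:end]  # GFF coordinates are 1-based, inclusive
        # The CDS now spans its whole contig; the strand column is kept as-is,
        # which stays valid because the forward-strand sequence is copied.
        cols[0], cols[3], cols[4] = new_id, "1", str(len(subseq))
        regions.append(f"##sequence-region {new_id} 1 {len(subseq)}")
        features.append("\t".join(cols))
        sequences.append(f">{new_id}\n{subseq}")

    lines = ["##gff-version 3", *regions, *features, "##FASTA", *sequences]
    return "\n".join(lines) + "\n"
```

You would read each input GFF into a string, pass it through this function, write the result to a new file, and run Roary on the rewritten files without -s.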
As a suggestion, you could try get_homologues for this. I used to use it, and it works in this respect. Any other suggestions?
@felipelira what is get_homologues?
@tseemann I think felipelira was referring to this. https://github.com/eead-csic-compbio/get_homologues