
Missing capability of "synteny" + "splitting paralogs" ?

Open tseemann opened this issue 6 years ago • 3 comments

We had a recent bioinformatics group meeting, and it became apparent that Roary may be missing some functionality in how it uses synteny and how it treats paralogs.

The case in question is S. pyogenes, which has 100+ gene duplications and is also recombinant and rearranged. By default Roary produces too many clusters, possibly because of "synteny enforcement". Using -s removes synteny (?) but also forces paralogs into a single cluster.

I think what was wanted was a way to keep paralogs separate and use synteny still?

Does this make sense at all?

P.S. One approach I thought of was to pre-process the GFF files so that each CDS is placed on its own contig, thereby removing any synteny signal (the suspected cause of the excess clusters) without having to use -s.
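A rough sketch of what that pre-processing could look like, assuming Prokka-style GFF3 input with the nucleotide sequence embedded after a ##FASTA line. The function name split_cds_into_contigs and the "_cdsN" contig naming are made up for illustration, non-CDS features are simply dropped, and whether Roary accepts the result or produces fewer clusters from it is untested.

```python
def split_cds_into_contigs(gff_text: str) -> str:
    """Return GFF3 text in which every CDS sits alone on its own contig.

    Hypothetical helper: assumes a GFF3 file with an embedded ##FASTA
    section (as written by Prokka). Non-CDS features are discarded.
    """
    head, _, fasta = gff_text.partition("##FASTA")

    # Parse the embedded FASTA section into {contig_name: sequence}.
    chunks = {}
    name = None
    for line in fasta.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif line and name is not None:
            chunks[name].append(line)
    seqs = {k: "".join(v) for k, v in chunks.items()}

    features = []  # rewritten GFF feature lines
    records = []   # (new_contig_name, subsequence) pairs
    for line in head.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        start, end = int(cols[3]), int(cols[4])  # GFF is 1-based, inclusive
        new_name = f"{cols[0]}_cds{len(records) + 1}"
        # Cut the forward-strand slice; the strand column is kept unchanged,
        # so a minus-strand CDS is still read as minus on its new contig.
        sub = seqs[cols[0]][start - 1:end]
        cols[0], cols[3], cols[4] = new_name, "1", str(len(sub))
        features.append("\t".join(cols))
        records.append((new_name, sub))

    out = ["##gff-version 3"]
    out += [f"##sequence-region {n} 1 {len(s)}" for n, s in records]
    out += features
    out.append("##FASTA")
    for n, s in records:
        out.append(f">{n}")
        out.append(s)
    return "\n".join(out) + "\n"
```

To use it, read each input GFF into a string, pass it through split_cds_into_contigs, and write the returned text to a new file that is then given to Roary in place of the original.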

tseemann, Apr 24 '18 04:04

As a suggestion, you could try get_homologues. I have used it, and it works for this purpose. Any other suggestions?

felipelira, Jul 27 '18 10:07

@felipelira what is get_homologues?

tseemann, Aug 25 '18 07:08

@tseemann I think felipelira was referring to this: https://github.com/eead-csic-compbio/get_homologues

cwbcm, Nov 22 '19 20:11