Roary
Core genome alignment failure
Hi @tseemann, I see you posting often on this GitHub, so I thought I'd ask!
I'm having some issues getting a core genome alignment of around 300 full-length E. coli sequences with Roary.
The program runs to completion and writes all of the expected files, but the core genome alignment often comes out blank, following this warning:
--------------------- WARNING ---------------------
MSG: Got a sequence without letters. Could not guess alphabet
The summary statistics show:
Core genes (99% <= strains <= 100%) 0
Soft core genes (95% <= strains < 99%) 69
Shell genes (15% <= strains < 95%) 11410
Cloud genes (0% <= strains < 15%) 99008
Total genes (0% <= strains <= 100%) 110487
I've checked gene_presence_absence.Rtab and it looks OK to me, and I can see no obvious contamination.
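For anyone wanting to repeat that check, here is a minimal sketch of one way to scan the Rtab for odd samples. It assumes the file is a tab-separated 0/1 matrix with one gene cluster per row and one sample per column. The function names, the 20% tolerance, and the toy matrix are all made up for illustration and are not part of Roary.

```python
import csv
import io
import statistics


def genes_per_sample(rtab_handle):
    """Count how many gene clusters are marked present (1) for each sample."""
    reader = csv.reader(rtab_handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column is assumed to hold the gene cluster name
    counts = dict.fromkeys(samples, 0)
    for row in reader:
        for sample, value in zip(samples, row[1:]):
            counts[sample] += int(value)
    return counts


def flag_outliers(counts, tolerance=0.2):
    """Return samples whose gene count differs from the median by more than
    `tolerance` (a fraction of the median). The default is an arbitrary choice."""
    median = statistics.median(counts.values())
    return sorted(s for s, n in counts.items() if abs(n - median) > tolerance * median)


# Toy matrix (invented data): sample D carries far fewer genes than the rest.
toy_rtab = (
    "Gene\tA\tB\tC\tD\n"
    "g1\t1\t1\t1\t1\n"
    "g2\t1\t1\t1\t0\n"
    "g3\t1\t1\t1\t0\n"
    "g4\t1\t1\t0\t0\n"
    "g5\t1\t1\t1\t0\n"
)
toy_counts = genes_per_sample(io.StringIO(toy_rtab))
toy_outliers = flag_outliers(toy_counts)
print(toy_counts)
print(toy_outliers)
```

On a real run you would pass an open file handle for gene_presence_absence.Rtab instead of the toy string. Samples with a gene count far from the median are worth a second look for fragmented assemblies or a different species.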
Could you speculate what the issue may be?
Best wishes, Steve
Small update: I reran this after QC'ing the input files much more carefully.
Now:
Core genes (99% <= strains <= 100%) 126
Soft core genes (95% <= strains < 99%) 231
Shell genes (15% <= strains < 95%) 3745
Cloud genes (0% <= strains < 15%) 16947
Total genes (0% <= strains <= 100%) 21049
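To make the category boundaries in these tables concrete, here is a small sketch that applies the thresholds printed in the summary to a single gene. It only mirrors the percentages shown above; `classify` is a hypothetical helper, and Roary's own implementation may round or count differently.

```python
def classify(n_present, n_total):
    """Assign a gene to a category from the number of strains carrying it.

    Uses integer arithmetic (n_present * 100 vs. threshold * n_total) so the
    boundaries are not affected by floating-point rounding.
    """
    scaled = n_present * 100
    if scaled >= 99 * n_total:
        return "core"
    if scaled >= 95 * n_total:
        return "soft core"
    if scaled >= 15 * n_total:
        return "shell"
    return "cloud"


# With 300 strains, a gene has to be present in at least 297 of them to be core.
examples = {n: classify(n, 300) for n in (300, 297, 296, 285, 284, 45, 44)}
print(examples)
```

Under these thresholds, with about 300 genomes a gene missing from just four of them already drops out of the core, which is why a few poor-quality inputs can shrink the core gene count so sharply.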