Flye
Missing & misassembled plasmids
Howdy,
My name is Per, and I would first like to thank you for developing Flye! I have been using Flye to assemble an E. coli isolate from ONT data, and I think it is having two issues assembling the plasmids from this isolate properly. I would greatly appreciate some help troubleshooting them!
Issue No. 1: Not assembling small (<~2 kb) plasmids.
The final assembly info (Table 1) shows that Flye is assembling 2 plasmids and the main bacterial chromosome.
#seq_name | length | cov. | circ. | repeat | mult. | alt_group | graph_path |
---|---|---|---|---|---|---|---|
contig_1 | 5072618 | 137 | Y | N | 1 | * | 1 |
contig_3 | 122045 | 205 | Y | N | 2 | * | 3 |
contig_2 | 8128 | 871 | Y | Y | 7 | * | 2 |
Table 1. Final assembly info for isolate.
However, based on NanoPlots (Fig 1) and gels produced from plasmid preps, we expect there to be 3 smaller plasmids.
Fig 1. NanoPlot of reads used as input for isolate assembly
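In case it helps anyone triaging a similar assembly, here is a minimal sketch of how the rows in Table 1 could be screened for contigs that are both circular and flagged as repeats. It assumes the tab-separated layout of Flye's `assembly_info.txt` with the same column names as Table 1; the function names `parse_assembly_info` and `suspicious_contigs` are my own, not part of Flye.

```python
import csv
import io

# The rows from Table 1, tab-separated as assumed for assembly_info.txt.
ASSEMBLY_INFO = (
    "#seq_name\tlength\tcov.\tcirc.\trepeat\tmult.\talt_group\tgraph_path\n"
    "contig_1\t5072618\t137\tY\tN\t1\t*\t1\n"
    "contig_3\t122045\t205\tY\tN\t2\t*\t3\n"
    "contig_2\t8128\t871\tY\tY\t7\t*\t2\n"
)


def parse_assembly_info(text):
    """Parse the table into a list of dicts keyed by the header names."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


def suspicious_contigs(rows):
    """Return names of contigs marked circular AND repeat (hypothetical filter)."""
    return [
        row["#seq_name"]
        for row in rows
        if row["circ."] == "Y" and row["repeat"] == "Y"
    ]


rows = parse_assembly_info(ASSEMBLY_INFO)
flagged = suspicious_contigs(rows)
print(flagged)
```

With the values from Table 1, only contig_2 meets both conditions, which is the contig discussed under Issue No. 2.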
Issue No. 2: Misassembly of a plasmid
While Flye assembled an 8 kb contig, there isn't a corresponding peak in the NanoPlot (Fig 1). Further examination of the alignment profile of the reads (Fig 2) and the low number of supplementary alignments suggests to me that this is a repeat that has been duplicated in tandem and then circularized.
Fig 2. IGV screenshot of reads used for assembly mapped back to contig_2. I sorted alignments by tag (SA) with MAPQ set to 0.
Only a few reads span this 8 kb region (Fig 3), so I suspect that this sequence exists as a duplication somewhere else in this E. coli genome, but not as an 8 kb plasmid.
Fig 3. IGV screenshot of reads used for assembly mapped back to contig_2. I sorted alignments by length with MAPQ set to 30.
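For anyone who wants to repeat the spanning-read check from Fig 3 without IGV, a rough sketch follows. It assumes the reads were mapped back to the assembly and written as PAF (for example by minimap2), and it counts reads whose alignment covers contig_2 nearly end to end at MAPQ >= 30. The `spanning_reads` helper, the 100 bp margin, and the three example records are all made up for illustration; they are not taken from my data.

```python
def spanning_reads(paf_lines, contig, margin=100, min_mapq=30):
    """Return names of reads whose alignment covers `contig` end to end.

    Uses the standard PAF columns: 0 query name, 5 target name,
    6 target length, 7 target start, 8 target end, 11 mapping quality.
    """
    names = set()
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or truncated records
        tname = fields[5]
        tlen, tstart, tend = int(fields[6]), int(fields[7]), int(fields[8])
        mapq = int(fields[11])
        if tname != contig or mapq < min_mapq:
            continue
        # A read "spans" the contig if it starts and ends within `margin` of the ends.
        if tstart <= margin and tend >= tlen - margin:
            names.add(fields[0])
    return names


# Hypothetical PAF records, invented only to exercise the function.
EXAMPLE_PAF = [
    "read_a\t9000\t0\t8500\t+\tcontig_2\t8128\t20\t8100\t8000\t8200\t60",
    "read_b\t3000\t0\t3000\t+\tcontig_2\t8128\t1000\t4000\t2900\t3000\t60",
    "read_c\t9000\t0\t8500\t-\tcontig_2\t8128\t10\t8120\t8000\t8200\t0",
]

spanning = spanning_reads(EXAMPLE_PAF, "contig_2")
print(sorted(spanning))
```

Because contig_2 is reported as circular, a read that wraps around the arbitrary start of the contig would be split into two alignments and missed by this simple filter, so the count should be read as a lower bound.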
Logs
Here are logs for 2 separate attempts to assemble this genome. I saw that you previously addressed this issue with another user by suggesting the use of the `--meta` option, so I tried that, but it did not help.
Best, Per