Jovian_archive
Enhancement: scaffold ends are sometimes of poor quality
Problem:
- The 5' and 3' ends of scaffolds are often of poor quality. Sometimes the RNA/DNA appears to fold back on itself (a hairpin), producing a reverse-complement sequence, possibly as an artifact of an enzymatic step. Whatever the cause, that sequence doesn't belong in the scaffold.
- At high read coverage the assembly breaks down into multiple smaller contigs, although downsampling the same reads does yield the complete genome in one contig.
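
To make the hairpin symptom concrete, here is a minimal sketch of how one might flag it: it looks for k-mers near a scaffold end whose reverse complement also occurs in the same terminal window. The function names, the window size, and the k-mer length are illustrative assumptions, not part of the pipeline, and exact k-mer matching will miss fold-backs that contain sequencing errors.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def foldback_kmers(region, k=12):
    """Return sorted (i, j) pairs with i < j where region[j:j+k] is the
    reverse complement of region[i:i+k] (a possible hairpin stem)."""
    positions = {}
    for i in range(len(region) - k + 1):
        positions.setdefault(region[i:i + k], []).append(i)

    hits = []
    for kmer, starts in positions.items():
        rc_starts = positions.get(reverse_complement(kmer), [])
        for i in starts:
            for j in rc_starts:
                if i < j:  # report each pair once, skip self-matches
                    hits.append((i, j))
    return sorted(hits)


def scaffold_end_foldbacks(scaffold, window=100, k=12):
    """Check both scaffold ends; coordinates are relative to each window.
    window and k are arbitrary illustrative defaults."""
    return {
        "5prime": foldback_kmers(scaffold[:window], k),
        "3prime": foldback_kmers(scaffold[-window:], k),
    }
```

Scaffolds with hits at either end would then be candidates for a closer look in the assembly graph rather than for automatic trimming.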
Possible solutions:
In general, we must first identify the problem by inspecting the assembly graph in Bandage or AGB (https://github.com/almiheenko/AGB). If this proves useful, we might add AGB to the report toolkit? It's an online assembly graph viewer.
- Find a well-matching full-genome reference, identify the reads belonging to the erroneous 5' and 3' regions or gaps, and update the scaffold with them?
- Artificially normalize your coverage via bbtools/bbnorm (http://seqanswers.com/forums/showthread.php?t=49763)?
- Optimize the SPAdes parameters?
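
As a rough illustration of the downsampling observation above, the sketch below subsamples FASTQ records uniformly at random to an approximate target coverage. Note that this is plain random subsampling, not the k-mer-depth normalization that BBNorm performs, and that the function names, the fixed seed, and the target values are assumptions made only for this example.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-tuples of lines (header, sequence, '+', quality)."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:  # end of input (or a truncated final record)
            return
        yield tuple(line.rstrip("\n") for line in record)


def keep_fraction(total_bases, genome_size, target_coverage):
    """Fraction of reads to keep so the expected coverage is about target_coverage."""
    current_coverage = total_bases / genome_size
    return min(1.0, target_coverage / current_coverage)


def subsample(records, fraction, seed=42):
    """Keep each record independently with probability `fraction`.
    A fixed seed makes the selection reproducible."""
    rng = random.Random(seed)
    return [record for record in records if rng.random() < fraction]
```

For paired-end data, running `subsample` with the same seed and fraction on both files (with records in the same order) should keep the same pairs, since the keep/drop decisions depend only on the record index.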