
Enhancement: scaffold ends are sometimes of poor quality

Open DennisSchmitz opened this issue 5 years ago • 0 comments

Problem:

  1. The 5' and 3' ends of scaffolds are often of poor quality. Sometimes the RNA/DNA appears to fold back on itself (hairpin), generating a reverse-complement sequence (possibly an artifact of an enzymatic step). Either way, that sequence doesn't "belong" in the scaffold.
  2. At high read coverage, the assembly breaks down into multiple smaller contigs. When downsampling, however, you do get the complete genome in one contig.
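
To make problem 1 easier to screen for, here is a minimal Python sketch that flags scaffold ends containing a k-mer whose reverse complement also occurs elsewhere in the same scaffold, which is what a hairpin/fold-back artifact would look like. The function name `foldback_kmers` and the defaults `end_len=50` and `k=15` are assumptions chosen for illustration, not part of the pipeline; genuine inverted repeats in a genome would also be flagged, so hits need manual inspection.

```python
# Hypothetical helper: names and thresholds are illustrative assumptions.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def foldback_kmers(scaffold: str, end_len: int = 50, k: int = 15) -> dict:
    """Find k-mers in the first/last `end_len` bases whose reverse
    complement occurs at another position in the scaffold.

    Returns {"5prime": [(kmer_start, match_start), ...], "3prime": [...]}.
    """
    seq = scaffold.upper()
    n = len(seq)
    hits = {"5prime": [], "3prime": []}
    if n < k:
        return hits
    windows = {
        "5prime": range(0, max(0, min(end_len, n) - k + 1)),
        "3prime": range(max(0, n - end_len), n - k + 1),
    }
    for name, starts in windows.items():
        for i in starts:
            rc = reverse_complement(seq[i:i + k])
            j = seq.find(rc)
            while j != -1:
                if j != i:  # skip a palindromic k-mer matching itself
                    hits[name].append((i, j))
                    break
                j = seq.find(rc, j + 1)
    return hits
```

Calling `foldback_kmers(str(record.seq))` on each scaffold and keeping those with a non-empty `"5prime"` or `"3prime"` list would give a shortlist of ends worth checking in the assembly graph.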

Possible solutions:

In general, we must first identify the problem by inspecting the assembly graph with Bandage or AGB (https://github.com/almiheenko/AGB). If this proves useful, we might add AGB to the report toolkit? It's an online assembly graph viewer.

  • Find a well-matching full-genome reference, identify the reads belonging to the 5' and 3' gap/erroneous region, and update the scaffold with them?
  • Artificially normalize the coverage via BBTools/BBNorm (http://seqanswers.com/forums/showthread.php?t=49763)?
  • Optimize the SPAdes parameters?
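
As a quick way to test whether lower coverage really rescues the assembly before committing to BBNorm, the downsampling mentioned in problem 2 could be sketched in Python as below. Note this is plain uniform random subsampling of read pairs, not BBNorm's k-mer-depth-based normalization; the function names `read_fastq` and `subsample_pairs` and the `fraction`/`seed` parameters are assumptions for illustration.

```python
import random
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, separator, qualities


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records from an iterable of lines."""
    it = iter(lines)
    for header in it:
        seq, sep, qual = next(it), next(it), next(it)
        yield (header.rstrip("\n"), seq.rstrip("\n"),
               sep.rstrip("\n"), qual.rstrip("\n"))


def subsample_pairs(
    pairs: Iterable[Tuple[Record, Record]],
    fraction: float,
    seed: int = 0,
) -> Iterator[Tuple[Record, Record]]:
    """Keep each read pair with probability `fraction`.

    Both mates are kept or dropped together so the pairing stays intact.
    A fixed seed makes the subsample reproducible.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    for pair in pairs:
        if rng.random() < fraction:
            yield pair
```

For paired files, something like `subsample_pairs(zip(read_fastq(r1), read_fastq(r2)), 0.1)` with `r1` and `r2` as open file handles would keep roughly 10% of the pairs, which can then be written out and assembled again for comparison.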

DennisSchmitz, Oct 18 '19 14:10