NOVOPlasty

General questions

Open jhcaddisfly opened this issue 5 years ago • 5 comments

I ran NOVOPlasty to assemble a mitogenome from adapter-trimmed Illumina reads and got the following results: Contigs_1_PH2.fasta

What does it mean? Can I use the contig for annotation and further analysis? I am just wondering why the contig is not circular. Also, is there a way to do some kind of quality check afterwards? Is backmapping the reads to the mitogenome necessary to confirm per-base coverage? Thanks,

Assembly 1 finished: Contigs are automatically merged in Merged_contigs file------------

Contig 01 : 17031 bp

Total contigs : 1
Largest contig : 17031 bp
Smallest contig : 17031 bp
Average insert size : 270 bp

-----------------------------------------Input data metrics-----------------------------------------

Total reads : 488365208
Aligned reads : 3298238
Assembled reads : 2942940
Average organelle coverage : 29049

jhcaddisfly, Jan 31 '20 14:01

Can you run it again with extended log option set to 1 and send me that file? Then I can see if it is possible to extend.

ndierckx, Jan 31 '20 15:01

I sent the extended log file to you via email. Thanks,

j.

jhcaddisfly, Feb 02 '20 20:02

Hi,

Sorry, I was away. That's a big log :) It didn't circularize because you have a long repetitive region and both ends of the contig fall inside it, so the assembly looks as good as complete. The ends do overlap, but NOVOPlasty won't circularize automatically because short reads cannot tell how many repeat copies there are. This region is usually the control region rather than a coding region, so having it fully resolved is not that important. It may already be complete, but to know for sure you would need long reads.
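As a rough illustration of why the overlap is ambiguous, here is a minimal Python sketch (not NOVOPlasty's own code) that lists every length at which the end of a contig matches its start. The function name `end_overlaps`, the `min_len`/`max_len` parameters and the toy sequence are assumptions made for this example. When both ends sit in a tandem repeat, several overlap lengths match, one per possible repeat count, which is why short reads alone cannot settle it.

```python
def end_overlaps(seq, min_len=6, max_len=500):
    """Return every length k for which the last k bases equal the first k bases."""
    seq = seq.upper()
    max_len = min(max_len, len(seq) - 1)  # never compare the contig with itself
    return [k for k in range(min_len, max_len + 1) if seq[-k:] == seq[:k]]


if __name__ == "__main__":
    # Toy contig (hypothetical): a 6 bp repeat unit at both ends of a unique core.
    unit = "ATTAGC"
    core = "GGCCTGCGTCCGTGCCTGGC"
    contig = unit * 3 + core + unit * 3
    # More than one matching length means the repeat count is ambiguous.
    print(end_overlaps(contig, min_len=6, max_len=30))
```

On a real contig you would pass the sequence read from the FASTA file and raise `max_len` to cover the suspected repeat region.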

ndierckx, Feb 10 '20 16:02

Hi,

Thank you very much for your reply. I have some further questions.

  1. "you have a long repetitive region, both sides end up in that region": How could you see this from the log file?
  2. Actually, I have long reads now. Would mapping the long reads to the short reads help to complete the mt genome?

Thanks

jhcaddisfly, Feb 20 '20 16:02

  1. You can see it in the contigs: just check both ends in the FASTA file. Sometimes you can also see it in the merged file, but only when the assembler is able to jump over that region and continue the assembly.

  2. Yeah, of course. I would map them to the current assembly and see if any reads overlap both the end and the start of the Illumina assembly.
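
For the long-read check in point 2, here is a minimal Python sketch. It assumes the long reads were mapped to the Illumina contig with an aligner that writes PAF (minimap2 does by default; that tool choice is an assumption, not something stated in this thread). The function name `bridging_reads` and the `edge` tolerance are made up for the example. It reports reads that have one alignment reaching the contig end and a separate alignment reaching the contig start, which are the reads worth inspecting to count the repeats.

```python
from collections import defaultdict


def bridging_reads(paf_lines, edge=200):
    """Names of reads with separate alignments near the start and the end of one target."""
    start_hits = defaultdict(set)  # (read, target) -> indices of alignments near target start
    end_hits = defaultdict(set)    # (read, target) -> indices of alignments near target end
    for i, line in enumerate(paf_lines):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # PAF has 12 mandatory columns; skip anything shorter
        read, target = fields[0], fields[5]
        tlen, tstart, tend = int(fields[6]), int(fields[7]), int(fields[8])
        if tstart <= edge:
            start_hits[(read, target)].add(i)
        if tlen - tend <= edge:
            end_hits[(read, target)].add(i)
    bridging = set()
    for key, starts in start_hits.items():
        ends = end_hits.get(key, set())
        # Require two different alignment records, one per contig end.
        if ends and len(starts | ends) >= 2:
            bridging.add(key[0])
    return sorted(bridging)
```

This ignores strand and read coordinates, so treat the hits as candidates to look at by eye rather than as proof of circularity.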

ndierckx, Feb 21 '20 16:02