Contig result is mirrored

Open reinator opened this issue 5 years ago • 14 comments

In recent versions, the contig result shows a mirroring problem, as in this image: image

Clearly, the selected region mirrors the other region, and this is confirmed when we separate the regions and assemble them. image

Please fix this. NOVOPlasty is a very good tool and I look forward to citing it in my research.

Best regards.

reinator avatar Aug 09 '18 17:08 reinator

Hi,

I don't really understand the problem. Which version do you use? And can you send me the log file?

ndierckx avatar Aug 09 '18 17:08 ndierckx

I use version 2.7.1. I will run version 2.7.2 to see if the problem continues.

But just to explain, imagine that the result should be: ACGTAAAAAACCCCCCC

Instead, NOVOPlasty is returning: ACGTAAAAAACCCCCCCGGGGGGGTTTTTTACGT

where the appended part (GGGGGGGTTTTTTACGT) is clearly mirrored: it is the reverse complement of the expected sequence.
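To make the toy example concrete, here is a minimal Python sketch that checks whether the extra part of the returned contig is the reverse complement of the expected sequence. The names `reverse_complement`, `expected`, `returned` and `extra` are only illustrative; they are not part of NOVOPlasty.

```python
# Translation table for the four DNA bases (assumes an A/C/G/T-only sequence).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


expected = "ACGTAAAAAACCCCCCC"
returned = "ACGTAAAAAACCCCCCCGGGGGGGTTTTTTACGT"

# Everything the assembler appended after the expected sequence.
extra = returned[len(expected):]

# The contig is "mirrored" if that extra part is the reverse complement
# of the expected sequence.
is_mirrored = returned.startswith(expected) and extra == reverse_complement(expected)

print(extra, is_mirrored)  # GGGGGGGTTTTTTACGT True
```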

I have already seen situations like this in other runs with other samples, and since the problem was still there in version 2.7.1, I decided to create this issue.

reinator avatar Aug 09 '18 20:08 reinator

Are you sure it is an assembly mistake? These genetic structures can occur in mitochondrial genomes.

ndierckx avatar Aug 09 '18 22:08 ndierckx

Yes, I'm sure. I chose the option to save the assembled reads, and when I use them in another assembler, such as SPAdes, this assembly mistake does not happen.

reinator avatar Aug 10 '18 14:08 reinator

The example that I gave in the last post is a small one, but the case shown in the figures that I sent involves 10000 nucleotides.

reinator avatar Aug 10 '18 14:08 reinator

I never saw this problem before, so could you send me the extended log to check if it is a mistake? Maybe you will have to run again (with that option set to 1).

ndierckx avatar Aug 10 '18 14:08 ndierckx

I ran again with the extended log enabled, which is attached. log_extended_mito_k39.txt

reinator avatar Aug 29 '18 19:08 reinator

Sorry, I didn't have time to reply. I will take a closer look tomorrow. I think there is indeed a mistake in the repetitive region, but it just repeats everything (not mirrored?).

ndierckx avatar Sep 06 '18 01:09 ndierckx

It repeats and generates the reverse complement of the inversion, which I called mirroring. If you look at the images that I first sent, the idea of mirroring becomes clearer.

If you need further information regarding this issue, I could send you the contig, via email. Thanks in advance.

reinator avatar Sep 06 '18 16:09 reinator

I can get your contig out of the extended log, so maybe you want to remove it.

I found the problem, but it is not really an error of NOVOPlasty. It is more like a sequencing artifact. I don't know the cause, but I saw it once before. At a certain point almost all reads stop, as in a linear sequence. Only a few reads can create an extension, and those reads lead to the mirroring. Either the mitochondrial sequence is linear, or in some way all sequences were cut at that point (but usually the sample is randomly sheared). The few reads that extend were probably there because of recombination during the PCR.

Could you run again with this seed:

TGTTTATTATTCCTTAAGTATCTTAAGATAAGGCTGGGTAATTATGTAAGTAAAATCATCAATAGTTAAGGTCAAGGTAGGTTGTTTATTCAAGAGAAGATAAACTAACGACCTAAAATATACTTTCACCTTAAGTTTTAGTATTCTATAGTTTAAATTTCATAG

ndierckx avatar Sep 08 '18 16:09 ndierckx

Hi, sorry for the issue. Turns out that the mirrored results were correct. The mitogenome that I was assembling was dimerized, hence the "mirroring".

Thanks for the attention.

reinator avatar Nov 28 '18 16:11 reinator

How did you find out?

ndierckx avatar Nov 28 '18 16:11 ndierckx

I used SPAdes and MEGAHIT with the reads returned by NOVOPlasty. In most cases, only the monomer part of the genome was assembled (which made me suspicious about the way NOVOPlasty was assembling the reads), but then I also noticed the mirroring in one contig assembled by MEGAHIT. When I mapped the reads to the contig, I saw that some reads mapped to the region where the mirroring originates (which I later learned was the result of a dimerizing process).

There is a paper where the author describes a dimerized mitogenome (https://link.springer.com/article/10.1007/s00239-007-9037-5).
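For anyone who wants a quick first screen for this kind of structure, here is a rough Python sketch. It assumes the dimer is the head-to-head kind shown in the toy example earlier in this thread, in which case the whole contig reads almost the same as its own reverse complement. The function name `mirror_identity` is made up for illustration, and the position-by-position comparison does not handle indels, so this is only a hint and no substitute for mapping the reads back to the contig.

```python
# Translation table for the four DNA bases (other symbols are left unchanged).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mirror_identity(contig):
    """Fraction of positions where a contig matches its own reverse complement.

    A value close to 1.0 suggests a mirrored (head-to-head) structure.
    """
    if not contig:
        return 0.0
    forward = contig.upper()
    reverse = reverse_complement(forward)
    matches = sum(a == b for a, b in zip(forward, reverse))
    return matches / len(forward)


# Toy sequences from earlier in this thread.
print(mirror_identity("ACGTAAAAAACCCCCCCGGGGGGGTTTTTTACGT"))  # mirrored contig
print(mirror_identity("ACGTAAAAAACCCCCCC"))                   # monomer only
```

The mirrored toy contig scores 1.0, while the monomer alone scores far lower. On real data a threshold would have to be chosen by hand.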

reinator avatar Dec 04 '18 14:12 reinator

Ok thanks for the info!

ndierckx avatar Dec 06 '18 01:12 ndierckx