NOVOPlasty
Contig result is mirrored
In recent versions, the contig result shows a mirroring problem, as in the image above:
The selected region is clearly mirrored against the other region. When we separate the two regions and assemble them, this is confirmed.
Please fix this. NOVOPlasty is a very good tool and I'm looking forward to citing it in my research.
Best regards.
Hi,
I don't really understand the problem. Which version are you using? And can you send me the log file?
I use version 2.7.1. I will run it on version 2.7.2 to see if the problem persists.
But just to explain, imagine that the result should be: ACGTAAAAAACCCCCCC
Instead, NovoPlasty is returning: ACGTAAAAAACCCCCCCGGGGGGGTTTTTTACGT
where the trailing part, GGGGGGGTTTTTTACGT (bold in the original post), is clearly mirrored (the reverse complement) with respect to the normal sequence.
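The toy example above can be checked with a few lines of Python. This is only a minimal sketch for illustration, not NOVOPlasty code; the helper name `reverse_complement` is made up here, and it assumes plain A/C/G/T sequences.

```python
# Toy check of the "mirroring" example from this thread (not NOVOPlasty code).
# Assumes sequences contain only A/C/G/T; other symbols are left untranslated.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


expected = "ACGTAAAAAACCCCCCC"
returned = "ACGTAAAAAACCCCCCCGGGGGGGTTTTTTACGT"

# The extra tail that the assembler appended after the expected sequence.
mirrored_tail = returned[len(expected):]

print(mirrored_tail)                                  # GGGGGGGTTTTTTACGT
print(mirrored_tail == reverse_complement(expected))  # True
```

The second line printing `True` is what "mirrored" means in this issue: the contig is the expected sequence followed by its own reverse complement.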
I have already seen situations like this in other runs with other samples, and since the problem was still present in version 2.7.1, I decided to create this issue.
Are you sure it is an assembly mistake? These genetic structures can occur in mitochondrial genomes.
Yes, I'm sure. I chose the option to save the assembled reads, and when I use them in another assembler, like SPAdes, this assembly mistake does not happen.
The example that I gave in the last post is a small one, but the case shown in the figures that I sent spans 10000 nucleotides.
I have never seen this problem before, so could you send me the extended log so I can check whether it is a mistake? You may have to run it again (with that option set to 1).
I ran again with the extended log enabled, which is attached. log_extended_mito_k39.txt
Sorry, I didn't have time to reply. I will look at it more closely tomorrow. I think there is indeed a mistake in the repetitive region, but it just repeats everything (not mirrored?)
It repeats and generates the reverse complement of the inversion, which is what I called mirroring. If you look at the images that I first sent, the idea of mirroring becomes clearer.
If you need further information regarding this issue, I could send you the contig, via email. Thanks in advance.
I can get your contig out of the extended log, so maybe you want to remove it. I found the problem, but it is not really an error in NOVOPlasty. It looks more like a sequencing artifact; I don't know the cause, but I have seen it once before. At a certain point almost all reads stop, as they would at the end of a linear sequence. Only a few reads can extend past that point, and those reads lead to the mirroring. Either the mitochondrial sequence is linear, or somehow all sequences were cut at that point (although the sample is usually sheared randomly). The few reads that do extend were probably produced by recombination during PCR.
Could you run again with this seed:
TGTTTATTATTCCTTAAGTATCTTAAGATAAGGCTGGGTAATTATGTAAGTAAAATCATCAATAGTTAAGGTCAAGGTAGGTTGTTTATTCAAGAGAAGATAAACTAACGACCTAAAATATACTTTCACCTTAAGTTTTAGTATTCTATAGTTTAAATTTCATAG
Hi, sorry for the issue. Turns out that the mirrored results were correct. The mitogenome that I was assembling was dimerized, hence the "mirroring".
Thanks for the attention.
How did you find out?
I used SPAdes and MEGAHIT with the reads returned by NOVOPlasty. In most cases, only the monomer part of the genome was assembled (which made me suspicious about the way NOVOPlasty was assembling the reads), but then in one contig assembled by MEGAHIT I also noticed the mirroring. When I mapped the reads to the contig, I saw that some reads mapped to the region where the mirroring originates (which I later learned was the result of dimerization).
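For anyone who wants to locate such a mirroring point directly in a contig before mapping reads, a rough sequence-only check is sketched below. This is not what was done in this thread (the reads were mapped to the contig), and `find_mirror_junction` is a hypothetical helper written for illustration. It assumes an exact, gap-free inverted repeat and uses a simple quadratic scan, so it is only meant for contigs of a few tens of kilobases.

```python
# Hypothetical helper: find the position in a contig where the sequence to the
# right is the reverse complement of the sequence to the left (a "mirror"
# junction). Exact matching only; mismatches or gaps end the arm.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def find_mirror_junction(contig: str) -> tuple[int, int]:
    """Return (junction, arm_length) for the longest mirrored arm.

    `junction` is the 0-based index of the first base right of the junction;
    `arm_length` is how many bases on each side pair up as reverse complements.
    """
    contig = contig.upper()
    best_pos, best_arm = 0, 0
    for j in range(1, len(contig)):
        arm = 0
        # Walk outwards from the junction while left and right bases pair up.
        while (j - arm - 1 >= 0 and j + arm < len(contig)
               and _COMPLEMENT.get(contig[j - arm - 1]) == contig[j + arm]):
            arm += 1
        if arm > best_arm:
            best_pos, best_arm = j, arm
    return best_pos, best_arm


# Toy contig from earlier in this thread.
print(find_mirror_junction("ACGTAAAAAACCCCCCCGGGGGGGTTTTTTACGT"))  # (17, 17)
```

A long arm (thousands of bases, as in the figures here) points to a mirrored structure in the contig; whether that is an assembly error or a real dimerized genome still has to be settled with read mapping, as described above.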
There is a paper where the author describes a dimerized mitogenome (https://link.springer.com/article/10.1007/s00239-007-9037-5).
Ok thanks for the info!