Seed sequence for plant mitochondrial genome

Open bioramg opened this issue 4 years ago • 7 comments

Hello, I would like to assemble plant mitochondrial genomes using NOVOPlasty. Your recently published article mentions that RuBP was used as the seed sequence for the cp genome. Which gene could be used as a seed sequence for plant mitochondrial genome assembly? Also, can I use multiple mitochondrial genes as seed sequences for this de novo assembly? I gave a 295 kb contig as the seed for the plant mitochondrial assembly and obtained a 430 kb contig in the output file, but the 430 kb contig is not similar to the input seed contig. Which one is correct? Thank you.

bioramg avatar Oct 15 '20 00:10 bioramg

If you have an assembly that you know is correct, you can use it as the seed, but then you need to set the option "extend seed directly" to yes. It is important that the ends are correct; if in doubt, I would clip 200 bp or so from the ends (there are often more mistakes at the ends of assembled contigs).
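As a rough illustration of the clipping step, here is a minimal Python sketch that trims a fixed number of bases from both ends of a contig before it is used as a seed. The helper names (`read_fasta`, `clip_ends`, `format_fasta`) are hypothetical and not part of NOVOPlasty, and the 200 bp default only mirrors the "200 bp or so" suggested above.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def clip_ends(seq, clip=200):
    """Remove `clip` bases from each end of `seq` (assumed default: 200)."""
    if len(seq) <= 2 * clip:
        raise ValueError("sequence is too short to clip both ends")
    return seq[clip:len(seq) - clip]


def format_fasta(header, seq, width=60):
    """Return one FASTA record with the sequence wrapped at `width` columns."""
    lines = [">" + header]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"
```

To use it, read the contig file into a string, pass it through `read_fasta`, apply `clip_ends` to each sequence, and write the result of `format_fasta` to a new seed file.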

If you use this seed without "extend seed directly", it will only use the first 200 bp or so, because it uses the seed to extract one seed from the dataset.

So it is best to use a short seed from a region that is not in the chloroplast genome.

Have you already assembled the chloroplast genome?

ndierckx avatar Oct 16 '20 18:10 ndierckx

Thank you for your response. Yes, I have assembled the chloroplast genome. I have 6 contigs assembled by the SPAdes assembler, and together these six contigs contain all the mitochondrial genes. So, should I extract just one mitochondrial gene from a contig and use it as the seed input?

bioramg avatar Oct 26 '20 00:10 bioramg

Yes, you could do that. And don't forget to add the FASTA file of the chloroplast sequence in the config file.
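Extracting a single gene from a contig could be sketched as below, assuming the gene's coordinates on the contig are already known (for example from an annotation or a BLAST hit). The function names and the coordinate convention (1-based, inclusive, with a strand flag) are assumptions for illustration, not anything NOVOPlasty provides.

```python
# Translation table for complementing DNA, including N and lowercase bases.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_region(contig, start, end, strand="+"):
    """Cut out contig[start..end] using 1-based inclusive coordinates.

    For strand "-", the region is reverse-complemented so the returned
    seed reads in the gene's own orientation.
    """
    if not (1 <= start <= end <= len(contig)):
        raise ValueError("coordinates fall outside the contig")
    region = contig[start - 1:end]
    return reverse_complement(region) if strand == "-" else region
```

The returned string can then be written out as a one-record FASTA file and given as the seed input.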

ndierckx avatar Oct 26 '20 15:10 ndierckx

Yes, I included the chloroplast genome sequence, and used the cox2 gene as the seed sequence for the mitochondrial genome assembly. But I could not obtain consistent results. Should I adjust or modify some other parameters?

bioramg avatar Oct 27 '20 00:10 bioramg

You can send the log file and I can check the parameters. But plant mitochondrial genomes are very hard to assemble; even with long reads it mostly fails.

ndierckx avatar Oct 27 '20 21:10 ndierckx

Thank you. Unfortunately, I deleted the data from the run with the cox2 gene seed. If you need it, I can re-run it and give you the log file.

I am enclosing the log file of the run with the 295 kb contig.

log_cmo_mt.txt

bioramg avatar Oct 28 '20 01:10 bioramg

You do get quite large contigs, which is better than most cases I have seen. You can try different assemblers, but I don't think you will succeed in getting a circular genome.

ndierckx avatar Nov 01 '20 22:11 ndierckx