NOVOPlasty
Seed sequence for plant mitochondrial genome
Hello, I would like to assemble plant mitochondrial genomes using NOVOPlasty. I read your recently published article, which mentions that RuBP was used as a seed sequence for the cp genome. Which gene could be used as a seed sequence for plant mitochondrial genome assembly? Also, can I use multiple mitochondrial genes as a seed for this de novo assembly? I gave a 295 kb contig as a seed for the plant mitochondrial assembly and obtained a 430 kb contig in the output file, but the 430 kb contig is not similar to the input seed contig. Which one is correct? Thank you.
If you have an assembly that you know is correct, you can use it as the seed, but then you need to set the option "extend seed directly" to yes. It is important that the ends are correct; if in doubt, I would clip 200 bp or so from the ends (there are often more mistakes at the ends of assembled contigs).
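The end clipping suggested above can be sketched in a few lines of Python. This is only a minimal illustration, not part of NOVOPlasty: the helper names `read_fasta` and `clip_ends` are hypothetical, and the 200 bp default simply mirrors the rough figure mentioned in the reply.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def clip_ends(seq, clip=200):
    """Return seq with `clip` bases removed from each end.

    The 200 bp default is an assumption taken from the "200 bp or so"
    suggestion; adjust it to how much you trust the contig ends.
    """
    if clip < 0:
        raise ValueError("clip must be non-negative")
    if len(seq) <= 2 * clip:
        raise ValueError("sequence is too short to clip")
    return seq[clip:len(seq) - clip]


if __name__ == "__main__":
    # Toy contig: 200 unreliable bases, a 50 bp core, 200 unreliable bases.
    toy = ">contig_1\n" + "A" * 200 + "\n" + "C" * 50 + "\n" + "G" * 200 + "\n"
    for name, seq in read_fasta(toy):
        clipped = clip_ends(seq)
        print(f">{name}_clipped\n{clipped}")
```

The clipped sequence would then be saved as the seed file and used together with "extend seed directly" set to yes.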
If you use this seed without "extend seed directly", only the first 200 bp or so will be used, because the seed file is then only used to extract one seed from the dataset.
So it is best to use a short seed from a region that is not present in the chloroplast genome.
Have you already assembled the chloroplast genome?
Thank you for your response. Yes, I have assembled the chloroplast genome. I have 6 contigs assembled with the SPAdes assembler, and these six contigs contain all the mitochondrial genes. So, should I extract just one mitochondrial gene from a contig and use it as the seed input?
Yes, you could do that. Don't forget to add the FASTA file of the chloroplast sequence to the config file.
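Extracting a single gene from a contig to use as a seed could look roughly like the sketch below. It assumes you already know the gene's coordinates and strand from an annotation; the function names and the example coordinates are hypothetical and chosen purely for illustration.

```python
# Translation table for complementing DNA bases (N is left as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(contig, start, end, strand="+"):
    """Cut a region out of a contig.

    start and end are 1-based, inclusive coordinates (as in GFF
    annotations). For strand "-", the reverse complement is returned
    so that the seed reads in the gene's own orientation.
    """
    if not (1 <= start <= end <= len(contig)):
        raise ValueError("coordinates fall outside the contig")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    region = contig[start - 1:end]
    return reverse_complement(region) if strand == "-" else region


if __name__ == "__main__":
    contig = "AACCGGTTAACC"  # toy contig, not real data
    seed = extract_region(contig, 3, 6)  # hypothetical gene coordinates
    print(f">seed_gene\n{seed}")
```

The resulting FASTA record would be written to the seed file, with the chloroplast FASTA supplied separately in the config as described above.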
Yes, I included the chloroplast genome sequence and used the cox2 gene as the seed sequence for the mitochondrial genome assembly, but I could not obtain consistent results. Should I modify any other parameters?
You can send the log file and I can check the parameters. But plant mitochondrial genomes are very hard to assemble; even with long reads it mostly fails.
Thank you. Unfortunately, I deleted the data from the cox2 seed run. If you need it, I can re-run it and send you the log file.
I am enclosing the log file from the run with the 295 kb contig.
You do get quite large contigs, which is better than most cases I have seen. You can try different assemblers, but I don't think you will succeed in getting a circular genome.