megahit
Fragmented contigs
Hello,
I am looking at the contig graph produced with contig2fastg, and I see sequential contigs with no branching between them.
Maybe I am missing something, but if there is no branching, why are those contigs not merged into a single one? Do I need to post-process the assembly and look for such cases?
Here is a Bandage close-up. Also, the coverage for both seems to be 1; is that relevant?
Best,
Seb
Interesting. Could you show me the two sequences and the k used?
Hello,
I ran grep on the .gfa to extract the sequences and links; the result is in the attached file. If you are interested, I have other examples. Also, what I said previously about the coverage being 1 was a mistake on my part: it is ~5.
k=141
Fragmented_contigs.txt
Best,
Seb
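For readers who want to reproduce the extraction step without grep, here is a minimal sketch in Python. It assumes a GFA 1 file in which segment lines start with `S` and link lines start with `L`, with tab-separated fields. The function name `read_gfa` and the sample contig names are hypothetical and are not part of megahit or Bandage.

```python
def read_gfa(lines):
    """Collect segments and links from GFA 1 text lines.

    Returns (segments, links) where segments maps name -> sequence and
    links is a list of (from, from_orient, to, to_orient, overlap).
    """
    segments = {}
    links = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            # S <name> <sequence> [optional tags...]
            segments[fields[1]] = fields[2]
        elif fields[0] == "L" and len(fields) >= 6:
            # L <from> <from_orient> <to> <to_orient> <overlap>
            links.append(tuple(fields[1:6]))
    return segments, links


# Hypothetical two-contig example; real names and sequences will differ.
example = [
    "H\tVN:Z:1.0\n",
    "S\tcontig_1\tGGATCCGT\n",
    "S\tcontig_2\tCTAAACGG\n",
    "L\tcontig_1\t+\tcontig_2\t-\t4M\n",
]
segments, links = read_gfa(example)
```

The orientation fields on each `L` line (`+` or `-`) are worth keeping, because they say whether a segment is used as stored or reverse complemented at that link.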
The two sequences do not share any common k-mers, so I wonder why they are connected. Did you use megahit_toolkit's fasta2fastg to generate the graph?
Hi,
They do share a k-mer of length 141, but one of them needs to be reverse complemented. I didn't realise this when I first submitted. It's because I'm using the .gfa format, which stores each sequence in only one orientation, so the files are lighter.
So yes, I usually use megahit_toolkit contig2fastg to generate the graph. I then use one of Bandage's functions to convert the fastg file to .gfa. After that I need a script for renaming, since the conversion with megahit_toolkit contig2fastg changes the contig names. It keeps things in the same order though, so it is fine.
Best,
Seb
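The check discussed above (whether two contigs share a k-mer once one of them is reverse complemented) can be sketched as follows. This is a minimal illustration, not megahit's own code: the function names are hypothetical, the toy sequences use k=4 rather than the k=141 from this thread, and only the bases A, C, G and T are handled.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(a, b, k):
    """Return (forward, reverse) sets of k-mers of a that also occur in b.

    forward: k-mers shared with b as stored.
    reverse: k-mers shared with the reverse complement of b.
    """
    a, b = a.upper(), b.upper()
    ka = kmers(a, k)
    forward = ka & kmers(b, k)
    reverse = ka & kmers(reverse_complement(b), k)
    return forward, reverse


# Toy example: the contigs overlap only when b is reverse complemented.
a = "GGATCCGT"
b = "CTAAACGG"
forward, reverse = shared_kmers(a, b, 4)
```

With real data, `a` and `b` would be the two contig sequences taken from the .gfa and `k` would be 141. An empty `forward` set together with a non-empty `reverse` set matches the situation described in this thread.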