MBG
Assembly changes from run to run
MBG bioconda 1.0.16
Hi,
I am testing MBG on an organellar genome and noticed that the results can change between runs on the same input. I couldn't find a seed parameter in the options to make the assembly reproducible. Any tips?
Below is an example of two consecutive runs of MBG on the same input with identical parameters. The resulting assemblies differ in length by 4 bp (34948 bp vs. 34944 bp).
(mbg) [guibo205@rackham2 mbg]$ MBG -i mapped_reads.fasta -o asm.gfa -k 2501 -w 2470 -a 10 -u 50 -t 8
MBG bioconda 1.0.16
Parameters: k=2501,w=2470,a=10,u=50,t=8,r=0,R=0,hpcvariantcov=0,errormasking=hpc,endkmers=no,blunt=no,keepgaps=no,guesswork=no,copycountfilter=no,onlylocal=no,filterwithinunitig=yes,cleaning=yes,cache=no
Collecting selected k-mers
Reading sequences from mapped_reads.fasta
5852 total selected k-mers in reads
1583 distinct selected k-mers in reads
Unitigifying
Filtering by unitig coverage
19 distinct selected k-mers in unitigs after filtering
Getting read paths
Reading sequences from mapped_reads.fasta
Building unitig sequences
Reading sequences from mapped_reads.fasta
Writing graph to asm.gfa
selecting k-mers and building graph topology took 0,617 s
unitigifying took 0,0 s
filtering unitigs took 0,0 s
getting read paths took 0,704 s
building unitig sequences took 0,809 s
forcing edge consistency took 0,0 s
writing the graph and calculating stats took 0,2 s
nodes: 1
edges: 1
assembly size 34948 bp, N50 34948
approximate number of k-mers ~ 32447
(mbg) [guibo205@rackham2 mbg]$ MBG -i mapped_reads.fasta -o asm2.gfa -k 2501 -w 2470 -a 10 -u 50 -t 8
MBG bioconda 1.0.16
Parameters: k=2501,w=2470,a=10,u=50,t=8,r=0,R=0,hpcvariantcov=0,errormasking=hpc,endkmers=no,blunt=no,keepgaps=no,guesswork=no,copycountfilter=no,onlylocal=no,filterwithinunitig=yes,cleaning=yes,cache=no
Collecting selected k-mers
Reading sequences from mapped_reads.fasta
5852 total selected k-mers in reads
1583 distinct selected k-mers in reads
Unitigifying
Filtering by unitig coverage
19 distinct selected k-mers in unitigs after filtering
Getting read paths
Reading sequences from mapped_reads.fasta
Building unitig sequences
Reading sequences from mapped_reads.fasta
Writing graph to asm2.gfa
selecting k-mers and building graph topology took 0,620 s
unitigifying took 0,0 s
filtering unitigs took 0,0 s
getting read paths took 0,699 s
building unitig sequences took 0,731 s
forcing edge consistency took 0,0 s
writing the graph and calculating stats took 0,2 s
nodes: 1
edges: 1
assembly size 34944 bp, N50 34944
approximate number of k-mers ~ 32443
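To narrow down where the two graphs differ, something like the following could help. It is only a minimal sketch, not part of MBG: the helper names `read_segments` and `locate_difference` are made up for illustration, and it assumes that each GFA holds its unitig sequence on a tab-separated `S` line and that both runs wrote the sequence in the same orientation and with the same start position. If the run-to-run change is a single insertion or deletion, the differing region it reports will be short. If the orientation or start differs, the region will span most of the sequence and a proper aligner would be needed instead.

```python
def read_segments(gfa_text):
    """Return {segment name: sequence} parsed from the S lines of GFA text."""
    segments = {}
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3 and fields[0] == "S":
            segments[fields[1]] = fields[2]
    return segments


def common_prefix_len(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def locate_difference(a, b):
    """Trim the shared prefix and suffix of a and b.

    Returns (offset, region_a, region_b), where offset is the length of the
    shared prefix and the regions are what remains of each sequence between
    the shared prefix and the shared suffix.
    """
    prefix = common_prefix_len(a, b)
    suffix = common_prefix_len(a[::-1], b[::-1])
    # Do not let the shared suffix overlap the shared prefix.
    suffix = min(suffix, min(len(a), len(b)) - prefix)
    return prefix, a[prefix:len(a) - suffix], b[prefix:len(b) - suffix]


# Hypothetical usage with the two output files from the runs above:
#   with open("asm.gfa") as f1, open("asm2.gfa") as f2:
#       seqs1 = read_segments(f1.read())
#       seqs2 = read_segments(f2.read())
#   for name in seqs1.keys() & seqs2.keys():
#       print(name, locate_difference(seqs1[name], seqs2[name]))
```

Comparing only segments with matching names is a simplification; it assumes the single node has the same name in both files.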
Hi, this is a bug. Could you share the input reads?