Flye
Flye assembly produces differing output on the same files.
Hi, Thanks for your great work on enabling de novo assembly on ONT data!
I have been working on a Snakemake pipeline for assembling and polishing bacterial data, and in this regard I stumbled upon a strange observation: the final assembly sizes sometimes vary between duplicate runs.
I did some detective work on the pipeline and traced the variation to the output of the Flye assembler.
Here are my findings from running wc on the assembly.fasta files:
| Sample | Newlines | Words | Characters | Setting | Preprocessing |
|---|---|---|---|---|---|
| Barcode01 | 92271 | 92271 | 5627747 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode01 | 92271 | 92271 | 5627741 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode01 | 93724 | 93724 | 5716532 | --nano-raw | Q12-porechop |
| Barcode01 | 93724 | 93724 | 5716562 | --nano-raw | Q12-porechop |
| Barcode01 | 92460 | 92460 | 5639810 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode01 | 92464 | 92464 | 5639982 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode01 | 92461 | 92461 | 5639812 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode01 | 92462 | 92462 | 5639890 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode02 | 100512 | 100512 | 6130631 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode02 | 100512 | 100512 | 6130622 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode02 | 100832 | 100832 | 6149873 | --nano-raw | Q12-porechop |
| Barcode02 | 100830 | 100830 | 6149840 | --nano-raw | Q12-porechop |
| Barcode02 | 100839 | 100839 | 6150686 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode02 | 100676 | 100676 | 6140754 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode02 | 100568 | 100568 | 6134214 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode02 | 100582 | 100582 | 6135040 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode03 | 99624 | 99624 | 6076394 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode03 | 99624 | 99624 | 6076483 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode03 | 100240 | 100240 | 6114074 | --nano-raw | Q12-porechop |
| Barcode03 | 100240 | 100240 | 6114088 | --nano-raw | Q12-porechop |
| Barcode03 | 100579 | 100579 | 6134668 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode03 | 100255 | 100255 | 6114959 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode03 | 101382 | 101382 | 6183694 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode03 | 101213 | 101213 | 6173408 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode04 | 94869 | 94869 | 5786459 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode04 | 94869 | 94869 | 5786459 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode04 | 95032 | 95032 | 5796730 | --nano-raw | Q12-porechop |
| Barcode04 | 95032 | 95032 | 5796730 | --nano-raw | Q12-porechop |
| Barcode04 | 94958 | 94958 | 5792259 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode04 | 94958 | 94958 | 5792259 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode04 | 95031 | 95031 | 5796696 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode04 | 95031 | 95031 | 5796706 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode05 | 98977 | 98977 | 6037416 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode05 | 98977 | 98977 | 6037416 | --nano-raw | Q12-porechop, necat_correction_4iterations |
| Barcode05 | 98980 | 98980 | 6037556 | --nano-raw | Q12-porechop |
| Barcode05 | 98980 | 98980 | 6037532 | --nano-raw | Q12-porechop |
| Barcode05 | 98775 | 98775 | 6025097 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode05 | 98775 | 98775 | 6025096 | --nano-corr | Q12-porechop, necat_correction_3iterations |
| Barcode05 | 98979 | 98979 | 6037543 | --nano-corr | Q12-porechop, necat_correction_4iterations |
| Barcode05 | 98979 | 98979 | 6037543 | --nano-corr | Q12-porechop, necat_correction_4iterations |
It seems that file sizes differ the most when using corrected reads (--nano-corr); however, we can also observe small differences when treating the input as raw nanopore reads (--nano-raw). (And yes, I have confirmed that the NECAT correction did not yield different numbers of lines, words, and characters!)
I'm using Flye 2.9 (installed by Snakemake through conda)
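For anyone who wants to reproduce the wc comparison above without leaving Python, here is a minimal sketch. The function name `wc_counts` is hypothetical, and it counts bytes rather than multi-byte characters, which is equivalent for plain ASCII FASTA files.

```python
from pathlib import Path


def wc_counts(path):
    """Return (newlines, words, bytes) for a file, similar to plain `wc`."""
    data = Path(path).read_bytes()
    newlines = data.count(b"\n")
    words = len(data.split())  # split on any run of ASCII whitespace
    return newlines, words, len(data)
```

Calling `wc_counts` on the assembly.fasta files of two duplicate runs and comparing the tuples should reproduce the kind of differences listed in the table.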
I'm just a random guy who happened to notice this, not a developer, but it is expected behavior that the assemblies are ever-so-slightly different from run to run on the same files, because at least one step is not deterministic. Reading the Flye paper and understanding that step will likely help you understand what you're seeing here.
@JohnUrban exactly, this is because of non-determinism that is difficult to get rid of.
Also see https://github.com/fenderglass/Flye/issues/298
@JohnUrban and @fenderglass Thank you both for the clarifications. I have been looking into the 2019 paper, and have also been reading through the referenced issues (also #244).
There is a multiprocessing stage (disjointig generation) where the results slightly depend on the thread scheduling by the OS (e.g. which thread finishes first).
Do I understand things correctly if the randomness is purely associated with multi threading?
If that is the case, then hypothetically, would the output then be identical when running only with a single core?
@KasperThystrup yes, it is due to multithreading. I think theoretically the output should be identical with a single core. But I have not tested this.
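The effect described here, where results depend on which thread finishes first, can be illustrated with a toy sketch. This is not Flye's code: `work`, `completion_order`, and `submission_order` are hypothetical names, and the random sleep only stands in for variable thread runtimes.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def work(task_id):
    # Simulate a variable runtime, as OS scheduling would produce.
    time.sleep(random.uniform(0, 0.01))
    return task_id


def completion_order(n_tasks, n_threads):
    """Collect results as threads finish; the order can change between runs."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(work, i) for i in range(n_tasks)]
        return [future.result() for future in as_completed(futures)]


def submission_order(n_tasks, n_threads):
    """Collect results in submission order, regardless of which thread finishes first."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(work, range(n_tasks)))
```

Any downstream step that consumes results in completion order inherits the scheduling dependence, while consuming them in submission order does not.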
@KasperThystrup Hey let us know what you find with a single core if you go that route.
@fenderglass & @JohnUrban I expect to set up a few experimental runs with single cores later today. I expect to report back sometime next week :-)
So... Rerunning with only a single core confirmed our suspicion. Yes, file sizes, line counts, and character counts are completely identical between two runs.
This raises the question: is it possible and feasible to set up the disjointig part for single-core processing, while keeping the rest of the assembly work in "multiprocessing mode"?
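Identical sizes and counts are a strong hint, but they do not strictly prove that two assemblies are byte-for-byte identical. A stricter check is to compare checksums, sketched below; `sha256_of` and `runs_identical` are hypothetical helper names, and the file paths are whatever your pipeline writes.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def runs_identical(path_a, path_b):
    """True only if both files have exactly the same content."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Two files with the same length but a single differing base would pass a size comparison and fail this one.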
@KasperThystrup yes, you can edit line 44 in flye/assembly/assemble.py
@fenderglass Thanks for the elaboration. Would you be interested in an implementation of an(other) optional argument to run this step on a single core only?
Otherwise I will close this issue.
@KasperThystrup sure, please do a pull request then!
Hi @fenderglass,
I've been doing some testing of Flye with the same input read sets and different thread counts. It goes without saying Flye is amazing and thanks for making & maintaining it!
After learning of and applying the --deterministic flag, I assumed the output would be deterministic, but it wasn't.
Based on what I can see, the first sign of non-determinism occurs at the Initial divergence estimate step (I've pasted the log files below).
I'm going to guess that rand() in line 764 [here](https://github.com/fenderglass/Flye/blob/e156df433cc5d66bfc3663db716d8c1abc1969b6/src/sequence/overlap.cpp#LL764C4-L764C4) is the source of the non-determinism, but I don't know C++.
Just putting this out there for anyone who has come across this :)
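Whether that rand() call is actually the cause is not confirmed in this thread. As a general illustration only, the sketch below shows how a sampling step behaves with and without a fixed seed; `sample_reads` is a hypothetical function and has nothing to do with Flye's internals.

```python
import random


def sample_reads(read_ids, k, seed=None):
    """Pick k read IDs; a fixed seed makes the choice repeatable."""
    rng = random.Random(seed)  # seed=None draws entropy from the OS
    return rng.sample(read_ids, k)
```

With the same seed the same reads are picked every time, so any estimate computed from them is reproducible; with `seed=None` the sample, and therefore the estimate, can change from run to run.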
Edit
There are very small differences in the Initial divergence estimate step even when I run the same command twice with 1 thread.
George
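To compare this value across runs without reading through the full logs, something like the following sketch could be used. It assumes the log line looks like the one in the pasted log ("Initial divergence estimate : 0.0723839"); `initial_divergence` is a hypothetical helper name.

```python
import re

# Matches e.g. "DEBUG: Initial divergence estimate : 0.0723839"
DIVERGENCE_RE = re.compile(r"Initial divergence estimate\s*:\s*([0-9.]+)")


def initial_divergence(log_text):
    """Return the first initial divergence estimate in a Flye log, or None."""
    match = DIVERGENCE_RE.search(log_text)
    return float(match.group(1)) if match else None
```

Reading the flye.log of each run and passing its text to `initial_divergence` gives one number per run that can be compared directly.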
1 Thread
[2023-05-16 01:58:53] root: INFO: Starting Flye 2.9.2-b1786
[2023-05-16 01:58:53] root: DEBUG: Cmd: /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/bin/flye --nano-raw ../real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir ../real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF --threads 1 --deterministic
[2023-05-16 01:58:53] root: DEBUG: Python version: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08)
[GCC 7.5.0]
[2023-05-16 01:58:53] root: INFO: >>>STAGE: configure
[2023-05-16 01:58:53] root: INFO: Configuring run
[2023-05-16 01:58:58] root: INFO: Total read length: 278879070
[2023-05-16 01:58:58] root: INFO: Reads N50/N90: 16438 / 4092
[2023-05-16 01:58:58] root: INFO: Minimum overlap set to 4000
[2023-05-16 01:58:58] root: INFO: >>>STAGE: assembly
[2023-05-16 01:58:58] root: INFO: Assembling disjointigs
[2023-05-16 01:58:58] root: DEBUG: -----Begin assembly log------
[2023-05-16 01:58:58] root: DEBUG: Running: flye-modules assemble --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-asm /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/00-assembly/draft_assembly.fasta --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/flye.log --threads 1 --min-ovlp 4000
[2023-05-16 01:58:58] DEBUG: Build date: May 13 2023 06:30:35
[2023-05-16 01:58:58] DEBUG: Total RAM: 62 Gb
[2023-05-16 01:58:58] DEBUG: Available RAM: 28 Gb
[2023-05-16 01:58:58] DEBUG: Total CPUs: 16
[2023-05-16 01:58:58] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 01:58:58] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 01:58:58] DEBUG: big_genome_threshold=29000000
[2023-05-16 01:58:58] DEBUG: meta_read_filter_kmer_freq=100
[2023-05-16 01:58:58] DEBUG: chain_large_gap_penalty=2
[2023-05-16 01:58:58] DEBUG: chain_small_gap_penalty=0.5
[2023-05-16 01:58:58] DEBUG: chain_gap_jump_threshold=100
[2023-05-16 01:58:58] DEBUG: max_coverage_drop_rate=5
[2023-05-16 01:58:58] DEBUG: max_extensions_drop_rate=5
[2023-05-16 01:58:58] DEBUG: chimera_window=100
[2023-05-16 01:58:58] DEBUG: chimera_overhang=1000
[2023-05-16 01:58:58] DEBUG: min_reads_in_disjointig=4
[2023-05-16 01:58:58] DEBUG: max_inner_reads=10
[2023-05-16 01:58:58] DEBUG: max_inner_fraction=0.25
[2023-05-16 01:58:58] DEBUG: max_separation=500
[2023-05-16 01:58:58] DEBUG: unique_edge_length=50000
[2023-05-16 01:58:58] DEBUG: min_repeat_res_support=0.51
[2023-05-16 01:58:58] DEBUG: out_paths_ratio=5
[2023-05-16 01:58:58] DEBUG: graph_cov_drop_rate=5
[2023-05-16 01:58:58] DEBUG: coverage_estimate_window=100
[2023-05-16 01:58:58] DEBUG: max_bubble_length=50000
[2023-05-16 01:58:58] DEBUG: loop_coverage_rate=1.5
[2023-05-16 01:58:58] DEBUG: repeat_edge_cov_mult=1.75
[2023-05-16 01:58:58] DEBUG: weak_detach_rate=5
[2023-05-16 01:58:58] DEBUG: tip_coverage_rate=2
[2023-05-16 01:58:58] DEBUG: tip_length_rate=2
[2023-05-16 01:58:58] DEBUG: output_gfa_before_rr=0
[2023-05-16 01:58:58] DEBUG: remove_alt_edges=0
[2023-05-16 01:58:58] DEBUG: low_cutoff_warning=1
[2023-05-16 01:58:58] DEBUG: kmer_size=17
[2023-05-16 01:58:58] DEBUG: use_minimizers=0
[2023-05-16 01:58:58] DEBUG: reads_base_alignment=0
[2023-05-16 01:58:58] DEBUG: meta_read_top_kmer_rate=0.40
[2023-05-16 01:58:58] DEBUG: maximum_jump=1500
[2023-05-16 01:58:58] DEBUG: maximum_overhang=1500
[2023-05-16 01:58:58] DEBUG: repeat_kmer_rate=100
[2023-05-16 01:58:58] DEBUG: assemble_ovlp_divergence=0.10
[2023-05-16 01:58:58] DEBUG: assemble_divergence_relative=1
[2023-05-16 01:58:58] DEBUG: repeat_graph_ovlp_divergence=0.08
[2023-05-16 01:58:58] DEBUG: read_align_ovlp_divergence=0.25
[2023-05-16 01:58:58] DEBUG: hpc_scoring_on=0
[2023-05-16 01:58:58] DEBUG: add_unassembled_reads=0
[2023-05-16 01:58:58] DEBUG: extend_contigs_with_repeats=0
[2023-05-16 01:58:58] DEBUG: min_read_cov_cutoff=3
[2023-05-16 01:58:58] DEBUG: short_tip_length=20000
[2023-05-16 01:58:58] DEBUG: long_tip_length=100000
[2023-05-16 01:58:58] DEBUG: Running with k-mer size: 17
[2023-05-16 01:58:58] DEBUG: Running with minimum overlap 4000
[2023-05-16 01:58:58] DEBUG: Metagenome mode: N
[2023-05-16 01:58:58] DEBUG: Short mode: N
[2023-05-16 01:58:58] INFO: Reading sequences
[2023-05-16 01:59:02] DEBUG: Building positional index
[2023-05-16 01:59:02] DEBUG: Total sequence: 251729388 bp
[2023-05-16 01:59:04] INFO: Counting k-mers:
[2023-05-16 02:00:10] DEBUG: Updating k-mer histogram
[2023-05-16 02:01:05] DEBUG: Hash size: 4622497
[2023-05-16 02:01:05] DEBUG: Total k-mers 88307420
[2023-05-16 02:01:05] INFO: Filling index table (1/2)
[2023-05-16 02:04:03] DEBUG: Mean k-mer frequency: 21.8363
[2023-05-16 02:04:03] DEBUG: Repetitive k-mer frequency: 2183
[2023-05-16 02:04:03] DEBUG: Filtered 18624 repetitive k-mers (0.000182569)
[2023-05-16 02:04:04] INFO: Filling index table (2/2)
[2023-05-16 02:08:04] DEBUG: Sorting k-mer index
[2023-05-16 02:08:05] DEBUG: Selected k-mers: 6031079
[2023-05-16 02:08:05] DEBUG: Index size: 103351604
[2023-05-16 02:08:05] DEBUG: Mean k-mer index frequency: 17.1365
[2023-05-16 02:08:05] DEBUG: Peak RAM usage: 8 Gb
[2023-05-16 02:08:05] DEBUG: Estimating k-mer identity bias
[2023-05-16 02:09:10] DEBUG: Initial divergence estimate : 0.0723839
[2023-05-16 02:09:10] DEBUG: Relative threshold: Y
[2023-05-16 02:09:10] DEBUG: Max divergence threshold set to 0.172384
[2023-05-16 02:09:10] INFO: Extending reads
[2023-05-16 02:09:10] DEBUG: Estimating overlap coverage
[2023-05-16 02:10:06] INFO: Overlap-based coverage: 48
[2023-05-16 02:10:06] INFO: Median overlap divergence: 0.0722228
[2023-05-16 02:10:06] DEBUG: Sequence divergence distribution:
[ASCII histogram of sequence divergence, x-axis 0% to 45%]
Q25 = 0.065, Q50 = 0.072, Q75 = 0.084
[2023-05-16 02:11:18] DEBUG: Assembled disjointig 1
With 362 reads
Start read: +d060018a-55a4-4b68-8293-21d683cfdd94
At position: 360
leftTip: 0 rightTip: 0
Suspicious: 2
Short ext: 2
Mean extensions: 39
Avg overlap len: 38689
Min overlap len: 1451
Inner reads: 0
Length: 4810091
[2023-05-16 02:11:18] DEBUG: Inner: 32900 covered: 32972 total: 35900
[2023-05-16 02:11:27] DEBUG: Assembled disjointig 2
With 16 reads
Start read: +7f476f63-9362-4273-aa0a-f6654120414b
At position: 5
leftTip: 1 rightTip: 0
Suspicious: 0
Short ext: 0
Mean extensions: 61
Avg overlap len: 56938
Min overlap len: 6090
Inner reads: 0
Length: 127874
[2023-05-16 02:11:27] DEBUG: Inner: 33808 covered: 33888 total: 35900
[2023-05-16 02:11:29] DEBUG: Assembled disjointig 3
With 19 reads
Start read: +aab73044-040d-495f-b66e-700e2731c431
At position: 6
leftTip: 0 rightTip: 0
Suspicious: 0
Short ext: 0
Mean extensions: 68
Avg overlap len: 7518
Min overlap len: 4400
Inner reads: 0
Length: 40601
[2023-05-16 02:11:29] DEBUG: Inner: 34308 covered: 34388 total: 35900
[2023-05-16 02:11:53] INFO: Assembled 3 disjointigs
[2023-05-16 02:11:54] INFO: Generating sequence
[2023-05-16 02:12:02] DEBUG: Building positional index
[2023-05-16 02:12:02] DEBUG: Total sequence: 4978787 bp
[2023-05-16 02:12:05] DEBUG: Mean k-mer frequency: 1.03604
[2023-05-16 02:12:05] DEBUG: Repetitive k-mer frequency: 103
[2023-05-16 02:12:05] DEBUG: Filtered 0 repetitive k-mers (0)
[2023-05-16 02:12:07] DEBUG: Sorting k-mer index
[2023-05-16 02:12:07] DEBUG: Selected k-mers: 4805532
[2023-05-16 02:12:07] DEBUG: K-mer index size: 4978736
[2023-05-16 02:12:07] DEBUG: Mean k-mer frequency: 1.03604
[2023-05-16 02:12:07] DEBUG: Minimizer rate: 1.00001
[2023-05-16 02:12:07] INFO: Filtering contained disjointigs
[2023-05-16 02:12:11] DEBUG: Computing transitive closure for overlaps
[2023-05-16 02:12:11] DEBUG: Found 12 overlaps
[2023-05-16 02:12:11] DEBUG: Left 12 overlaps after filtering
[2023-05-16 02:12:11] INFO: Contained seqs: 0
[2023-05-16 02:12:11] DEBUG: Writing FASTA
[2023-05-16 02:12:11] DEBUG: Peak RAM usage: 8 Gb
-----------End assembly log------------
[2023-05-16 02:12:12] root: DEBUG: Disjointigs length: 4978787, N50: 4809821
[2023-05-16 02:12:12] root: INFO: >>>STAGE: consensus
[2023-05-16 02:12:12] root: INFO: Running Minimap2
[2023-05-16 02:14:00] root: INFO: Computing consensus
[2023-05-16 02:17:19] root: INFO: Alignment error rate: 0.105022
[2023-05-16 02:17:19] root: INFO: >>>STAGE: repeat
[2023-05-16 02:17:19] root: INFO: Building and resolving repeat graph
[2023-05-16 02:17:19] root: DEBUG: -----Begin repeat analyser log------
[2023-05-16 02:17:19] root: DEBUG: Running: flye-modules repeat --disjointigs /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/10-consensus/consensus.fasta --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/20-repeat --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/flye.log --threads 1 --min-ovlp 4000
[2023-05-16 02:17:19] DEBUG: Build date: May 13 2023 06:30:59
[2023-05-16 02:17:19] DEBUG: Total RAM: 62 Gb
[2023-05-16 02:17:19] DEBUG: Available RAM: 56 Gb
[2023-05-16 02:17:19] DEBUG: Total CPUs: 16
[2023-05-16 02:17:19] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 02:17:19] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 02:17:19] DEBUG: big_genome_threshold=29000000
[2023-05-16 02:17:19] DEBUG: meta_read_filter_kmer_freq=100
[2023-05-16 02:17:19] DEBUG: chain_large_gap_penalty=2
[2023-05-16 02:17:19] DEBUG: chain_small_gap_penalty=0.5
[2023-05-16 02:17:19] DEBUG: chain_gap_jump_threshold=100
[2023-05-16 02:17:19] DEBUG: max_coverage_drop_rate=5
[2023-05-16 02:17:19] DEBUG: max_extensions_drop_rate=5
[2023-05-16 02:17:19] DEBUG: chimera_window=100
[2023-05-16 02:17:19] DEBUG: chimera_overhang=1000
[2023-05-16 02:17:19] DEBUG: min_reads_in_disjointig=4
[2023-05-16 02:17:19] DEBUG: max_inner_reads=10
[2023-05-16 02:17:19] DEBUG: max_inner_fraction=0.25
[2023-05-16 02:17:19] DEBUG: max_separation=500
[2023-05-16 02:17:19] DEBUG: unique_edge_length=50000
[2023-05-16 02:17:19] DEBUG: min_repeat_res_support=0.51
[2023-05-16 02:17:19] DEBUG: out_paths_ratio=5
[2023-05-16 02:17:19] DEBUG: graph_cov_drop_rate=5
[2023-05-16 02:17:19] DEBUG: coverage_estimate_window=100
[2023-05-16 02:17:19] DEBUG: max_bubble_length=50000
[2023-05-16 02:17:19] DEBUG: loop_coverage_rate=1.5
[2023-05-16 02:17:19] DEBUG: repeat_edge_cov_mult=1.75
[2023-05-16 02:17:19] DEBUG: weak_detach_rate=5
[2023-05-16 02:17:19] DEBUG: tip_coverage_rate=2
[2023-05-16 02:17:19] DEBUG: tip_length_rate=2
[2023-05-16 02:17:19] DEBUG: output_gfa_before_rr=0
[2023-05-16 02:17:19] DEBUG: remove_alt_edges=0
[2023-05-16 02:17:19] DEBUG: low_cutoff_warning=1
[2023-05-16 02:17:19] DEBUG: kmer_size=17
[2023-05-16 02:17:19] DEBUG: use_minimizers=0
[2023-05-16 02:17:19] DEBUG: reads_base_alignment=0
[2023-05-16 02:17:19] DEBUG: meta_read_top_kmer_rate=0.40
[2023-05-16 02:17:19] DEBUG: maximum_jump=1500
[2023-05-16 02:17:19] DEBUG: maximum_overhang=1500
[2023-05-16 02:17:19] DEBUG: repeat_kmer_rate=100
[2023-05-16 02:17:19] DEBUG: assemble_ovlp_divergence=0.10
[2023-05-16 02:17:19] DEBUG: assemble_divergence_relative=1
[2023-05-16 02:17:19] DEBUG: repeat_graph_ovlp_divergence=0.08
[2023-05-16 02:17:19] DEBUG: read_align_ovlp_divergence=0.25
[2023-05-16 02:17:19] DEBUG: hpc_scoring_on=0
[2023-05-16 02:17:19] DEBUG: add_unassembled_reads=0
[2023-05-16 02:17:19] DEBUG: extend_contigs_with_repeats=0
[2023-05-16 02:17:19] DEBUG: min_read_cov_cutoff=3
[2023-05-16 02:17:19] DEBUG: short_tip_length=20000
[2023-05-16 02:17:19] DEBUG: long_tip_length=100000
[2023-05-16 02:17:19] DEBUG: Running with k-mer size: 17
[2023-05-16 02:17:19] DEBUG: Selected minimum overlap 4000
[2023-05-16 02:17:19] DEBUG: Metagenome mode: N
[2023-05-16 02:17:19] INFO: Parsing disjointigs
[2023-05-16 02:17:19] DEBUG: Building positional index
[2023-05-16 02:17:19] DEBUG: Total sequence: 5027082 bp
[2023-05-16 02:17:19] INFO: Building repeat graph
[2023-05-16 02:17:22] DEBUG: Mean k-mer frequency: 1.05981
[2023-05-16 02:17:22] DEBUG: Repetitive k-mer frequency: 105
[2023-05-16 02:17:22] DEBUG: Filtered 271 repetitive k-mers (5.39086e-05)
[2023-05-16 02:17:24] DEBUG: Sorting k-mer index
[2023-05-16 02:17:24] DEBUG: Selected k-mers: 4743325
[2023-05-16 02:17:24] DEBUG: K-mer index size: 5026760
[2023-05-16 02:17:24] DEBUG: Mean k-mer frequency: 1.05975
[2023-05-16 02:17:24] DEBUG: Minimizer rate: 1.00006
[2023-05-16 02:17:28] DEBUG: Computing transitive closure for overlaps
[2023-05-16 02:17:28] DEBUG: Found 240 overlaps
[2023-05-16 02:17:28] DEBUG: Left 100 overlaps after filtering
[2023-05-16 02:17:28] INFO: Median overlap divergence: 0.0245107
[2023-05-16 02:17:28] DEBUG: Sequence divergence distribution:
[ASCII histogram of sequence divergence, x-axis 0% to 45%]
Q25 = 0.0098, Q50 = 0.025, Q75 = 0.052
[2023-05-16 02:17:28] DEBUG: Computing gluepoints
[2023-05-16 02:17:28] DEBUG: Added 0 gluepoint projections
[2023-05-16 02:17:28] DEBUG: Created 62 gluepoints
[2023-05-16 02:17:28] DEBUG: Artificial loops removed: 0 left, 0 right, 0 both
[2023-05-16 02:17:28] DEBUG: Initializing edges
[2023-05-16 02:17:28] DEBUG: Edges length checksum: 18446744071951779876
[2023-05-16 02:17:28] DEBUG: Filtered 0 singleton segments
[2023-05-16 02:17:28] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 02:17:28] DEBUG: Collapsed 2 edges
[2023-05-16 02:17:28] DEBUG: * 18 +disjointig_1 105063 513261 408198
[2023-05-16 02:17:28] DEBUG: 4 +disjointig_1 513261 518393 5132
[2023-05-16 02:17:28] DEBUG: * -8 +disjointig_1 518393 551279 32886
[2023-05-16 02:17:28] DEBUG: 4 +disjointig_1 551279 556589 5310
[2023-05-16 02:17:28] DEBUG: * -7 +disjointig_1 556589 681739 125150
[2023-05-16 02:17:28] DEBUG: 4 +disjointig_1 681739 687055 5316
[2023-05-16 02:17:28] DEBUG: * -6 +disjointig_1 687055 771757 84702
[2023-05-16 02:17:28] DEBUG: 4 +disjointig_1 771757 776907 5150
[2023-05-16 02:17:28] DEBUG: * 11 +disjointig_1 776907 1262839 485932
[2023-05-16 02:17:28] DEBUG: -4 +disjointig_1 1262839 1268131 5292
[2023-05-16 02:17:28] DEBUG: * 5 +disjointig_1 1268131 1942573 674442
[2023-05-16 02:17:28] DEBUG: -4 +disjointig_1 1942573 1947985 5412
[2023-05-16 02:17:28] DEBUG: * 9 +disjointig_1 1947985 2685042 737057
[2023-05-16 02:17:28] DEBUG: 12 +disjointig_1 2685042 2713831 28789
[2023-05-16 02:17:28] DEBUG: * -13 +disjointig_1 2713831 2991198 277367
[2023-05-16 02:17:28] DEBUG: 12 +disjointig_1 2991198 3019982 28784
[2023-05-16 02:17:28] DEBUG: * -10 +disjointig_1 3019982 4550446 1530464
[2023-05-16 02:17:28] DEBUG: 4 +disjointig_1 4550446 4555843 5397
[2023-05-16 02:17:28] DEBUG: * 17 +disjointig_1 4555843 4856779 300936
[2023-05-16 02:17:28] DEBUG: 14 +disjointig_2 0 129638 129638
[2023-05-16 02:17:28] DEBUG: 19 +disjointig_3 3683 40645 36962
[2023-05-16 02:17:28] DEBUG: Total edges: 19
[2023-05-16 02:17:28] INFO: Parsing reads
[2023-05-16 02:17:31] DEBUG: Building positional index
[2023-05-16 02:17:31] DEBUG: Total sequence: 278879070 bp
[2023-05-16 02:17:31] DEBUG: Building positional index
[2023-05-16 02:17:31] DEBUG: Total sequence: 4918316 bp
[2023-05-16 02:17:31] INFO: Aligning reads to the graph
[2023-05-16 02:17:34] DEBUG: Mean k-mer frequency: 1.03885
[2023-05-16 02:17:34] DEBUG: Repetitive k-mer frequency: 103
[2023-05-16 02:17:34] DEBUG: Filtered 262 repetitive k-mers (5.32745e-05)
[2023-05-16 02:17:36] DEBUG: Sorting k-mer index
[2023-05-16 02:17:36] DEBUG: Selected k-mers: 4733997
[2023-05-16 02:17:36] DEBUG: K-mer index size: 4917663
[2023-05-16 02:17:36] DEBUG: Mean k-mer frequency: 1.0388
[2023-05-16 02:17:36] DEBUG: Minimizer rate: 1.00013
[2023-05-16 02:20:31] DEBUG: Total reads : 17950
[2023-05-16 02:20:31] DEBUG: Read with aligned parts : 17458
[2023-05-16 02:20:31] DEBUG: Aligned in one piece : 17383
[2023-05-16 02:20:31] INFO: Aligned read sequence: 243817236 / 251729388 (0.968569)
[2023-05-16 02:20:31] INFO: Median overlap divergence: 0.0275226
[2023-05-16 02:20:31] DEBUG: Sequence divergence distribution:
[ASCII histogram of sequence divergence, x-axis 0% to 45%]
Q25 = 0.022, Q50 = 0.028, Q75 = 0.037
[2023-05-16 02:20:31] INFO: Mean edge coverage: 51
[2023-05-16 02:20:31] DEBUG: 4 len:5287 cov:398 mult:7.80392
[2023-05-16 02:20:31] DEBUG: -4 len:5287 cov:398 mult:7.80392
[2023-05-16 02:20:31] DEBUG: 5 len:674442 cov:50 mult:0.980392
[2023-05-16 02:20:31] DEBUG: -5 len:674442 cov:50 mult:0.980392
[2023-05-16 02:20:31] DEBUG: 6 len:84702 cov:51 mult:1
[2023-05-16 02:20:31] DEBUG: -6 len:84702 cov:51 mult:1
[2023-05-16 02:20:31] DEBUG: 7 len:125150 cov:55 mult:1.07843
[2023-05-16 02:20:31] DEBUG: -7 len:125150 cov:55 mult:1.07843
[2023-05-16 02:20:31] DEBUG: 8 len:32886 cov:60 mult:1.17647
[2023-05-16 02:20:31] DEBUG: -8 len:32886 cov:60 mult:1.17647
[2023-05-16 02:20:31] DEBUG: 9 len:737057 cov:48 mult:0.941176
[2023-05-16 02:20:31] DEBUG: -9 len:737057 cov:48 mult:0.941176
[2023-05-16 02:20:31] DEBUG: 10 len:1530464 cov:47 mult:0.921569
[2023-05-16 02:20:31] DEBUG: -10 len:1530464 cov:47 mult:0.921569
[2023-05-16 02:20:31] DEBUG: 11 len:485932 cov:56 mult:1.09804
[2023-05-16 02:20:31] DEBUG: -11 len:485932 cov:56 mult:1.09804
[2023-05-16 02:20:31] DEBUG: 12 len:28786 cov:101 mult:1.98039
[2023-05-16 02:20:31] DEBUG: -12 len:28786 cov:101 mult:1.98039
[2023-05-16 02:20:31] DEBUG: 13 len:277367 cov:46 mult:0.901961
[2023-05-16 02:20:31] DEBUG: -13 len:277367 cov:46 mult:0.901961
[2023-05-16 02:20:31] DEBUG: 14 len:64819 cov:113 mult:2.21569
[2023-05-16 02:20:31] DEBUG: -14 len:64819 cov:113 mult:2.21569
[2023-05-16 02:20:31] DEBUG: 17 len:300936 cov:49 mult:0.960784
[2023-05-16 02:20:31] DEBUG: -17 len:300936 cov:49 mult:0.960784
[2023-05-16 02:20:31] DEBUG: 18 len:408198 cov:49 mult:0.960784
[2023-05-16 02:20:31] DEBUG: -18 len:408198 cov:49 mult:0.960784
[2023-05-16 02:20:31] DEBUG: 19 len:18481 cov:110 mult:2.15686
[2023-05-16 02:20:31] DEBUG: -19 len:18481 cov:110 mult:2.15686
[2023-05-16 02:20:31] DEBUG: Unique coverage threshold 96
[2023-05-16 02:20:31] INFO: Simplifying the graph
[2023-05-16 02:20:31] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:31] DEBUG: [SIMPL] Removed 0 paths with low coverage
[2023-05-16 02:20:31] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 02:20:31] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 02:20:31] DEBUG: Finding repeats
[2023-05-16 02:20:31] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:31] DEBUG: High-cov: 4 5287 398
[2023-05-16 02:20:31] DEBUG: High-cov: 12 28786 101
[2023-05-16 02:20:31] DEBUG: High-cov: 14 64819 113
[2023-05-16 02:20:31] DEBUG: High-cov: 19 18481 110
[2023-05-16 02:20:31] DEBUG: Repeat detection iteration 1
[2023-05-16 02:20:31] DEBUG: Writing Dot
[2023-05-16 02:20:32] DEBUG: Writing FASTA
[2023-05-16 02:20:32] DEBUG: [SIMPL] == Iteration 1 ==
[2023-05-16 02:20:32] DEBUG: Splitting nodes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Split 0 nodes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Clipped 0 short and 0 long tips
[2023-05-16 02:20:32] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 02:20:32] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 02:20:32] DEBUG: Finding repeats
[2023-05-16 02:20:32] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:32] DEBUG: High-cov: 4 5287 398
[2023-05-16 02:20:32] DEBUG: High-cov: 12 28786 101
[2023-05-16 02:20:32] DEBUG: High-cov: 14 64819 113
[2023-05-16 02:20:32] DEBUG: High-cov: 19 18481 110
[2023-05-16 02:20:32] DEBUG: Repeat detection iteration 1
[2023-05-16 02:20:32] DEBUG: Total unique edges: 10
[2023-05-16 02:20:32] DEBUG: Connection -13 -10 7 1
[2023-05-16 02:20:32] DEBUG: Connection 13 -9 3 1
[2023-05-16 02:20:32] DEBUG: Connection -7 -6 33 1
[2023-05-16 02:20:32] DEBUG: Connection -10 17 29 1
[2023-05-16 02:20:32] DEBUG: Connection 5 9 25 1
[2023-05-16 02:20:32] DEBUG: Connection -6 11 23 1
[2023-05-16 02:20:32] DEBUG: Connection 8 -18 31 1
[2023-05-16 02:20:32] DEBUG: Connection -5 -11 29 1
[2023-05-16 02:20:32] DEBUG: Connection 7 8 37 1
[2023-05-16 02:20:32] DEBUG: [SIMPL] Resolved repeats: 9
[2023-05-16 02:20:32] DEBUG: RR links: 434
[2023-05-16 02:20:32] DEBUG: Unresolved: 0
[2023-05-16 02:20:32] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 02:20:32] DEBUG: [SIMPL] == Iteration 2 ==
[2023-05-16 02:20:32] DEBUG: Splitting nodes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Split 0 nodes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Clipped 0 short and 0 long tips
[2023-05-16 02:20:32] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 02:20:32] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 02:20:32] DEBUG: Finding repeats
[2023-05-16 02:20:32] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:32] DEBUG: High-cov: 14 64819 113
[2023-05-16 02:20:32] DEBUG: High-cov: 19 18481 110
[2023-05-16 02:20:32] DEBUG: Repeat detection iteration 1
[2023-05-16 02:20:32] DEBUG: Total unique edges: 19
[2023-05-16 02:20:32] DEBUG: [SIMPL] Resolved repeats: 0
[2023-05-16 02:20:32] DEBUG: RR links: 0
[2023-05-16 02:20:32] DEBUG: Unresolved: 0
[2023-05-16 02:20:32] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 02:20:32] DEBUG: [SIMPL] Collapsed 0 haplotypes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Resolved 0 simple repeats
[2023-05-16 02:20:32] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:32] DEBUG: [SIMPL] Removed 0 paths with low coverage
[2023-05-16 02:20:32] DEBUG: Finding repeats
[2023-05-16 02:20:32] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:32] DEBUG: High-cov: 14 64819 113
[2023-05-16 02:20:32] DEBUG: High-cov: 19 18481 110
[2023-05-16 02:20:32] DEBUG: Repeat detection iteration 1
[2023-05-16 02:20:32] DEBUG: Writing Dot
[2023-05-16 02:20:32] DEBUG: Writing FASTA
[2023-05-16 02:20:32] DEBUG: Peak RAM usage: 0 Gb
-----------End assembly log------------
[2023-05-16 02:20:32] root: INFO: >>>STAGE: contigger
[2023-05-16 02:20:32] root: INFO: Generating contigs
[2023-05-16 02:20:32] root: DEBUG: -----Begin contigger analyser log------
[2023-05-16 02:20:32] root: DEBUG: Running: flye-modules contigger --graph-edges /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/20-repeat/repeat_graph_edges.fasta --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/30-contigger --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --repeat-graph /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/20-repeat/repeat_graph_dump --graph-aln /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/20-repeat/read_alignment_dump --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/flye.log --threads 1 --min-ovlp 4000
[2023-05-16 02:20:32] DEBUG: Build date: May 13 2023 06:31:26
[2023-05-16 02:20:32] DEBUG: Total RAM: 62 Gb
[2023-05-16 02:20:32] DEBUG: Available RAM: 54 Gb
[2023-05-16 02:20:32] DEBUG: Total CPUs: 16
[2023-05-16 02:20:32] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 02:20:32] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 02:20:32] DEBUG: big_genome_threshold=29000000
[2023-05-16 02:20:32] DEBUG: meta_read_filter_kmer_freq=100
[2023-05-16 02:20:32] DEBUG: chain_large_gap_penalty=2
[2023-05-16 02:20:32] DEBUG: chain_small_gap_penalty=0.5
[2023-05-16 02:20:32] DEBUG: chain_gap_jump_threshold=100
[2023-05-16 02:20:32] DEBUG: max_coverage_drop_rate=5
[2023-05-16 02:20:32] DEBUG: max_extensions_drop_rate=5
[2023-05-16 02:20:32] DEBUG: chimera_window=100
[2023-05-16 02:20:32] DEBUG: chimera_overhang=1000
[2023-05-16 02:20:32] DEBUG: min_reads_in_disjointig=4
[2023-05-16 02:20:32] DEBUG: max_inner_reads=10
[2023-05-16 02:20:32] DEBUG: max_inner_fraction=0.25
[2023-05-16 02:20:32] DEBUG: max_separation=500
[2023-05-16 02:20:32] DEBUG: unique_edge_length=50000
[2023-05-16 02:20:32] DEBUG: min_repeat_res_support=0.51
[2023-05-16 02:20:32] DEBUG: out_paths_ratio=5
[2023-05-16 02:20:32] DEBUG: graph_cov_drop_rate=5
[2023-05-16 02:20:32] DEBUG: coverage_estimate_window=100
[2023-05-16 02:20:32] DEBUG: max_bubble_length=50000
[2023-05-16 02:20:32] DEBUG: loop_coverage_rate=1.5
[2023-05-16 02:20:32] DEBUG: repeat_edge_cov_mult=1.75
[2023-05-16 02:20:32] DEBUG: weak_detach_rate=5
[2023-05-16 02:20:32] DEBUG: tip_coverage_rate=2
[2023-05-16 02:20:32] DEBUG: tip_length_rate=2
[2023-05-16 02:20:32] DEBUG: output_gfa_before_rr=0
[2023-05-16 02:20:32] DEBUG: remove_alt_edges=0
[2023-05-16 02:20:32] DEBUG: low_cutoff_warning=1
[2023-05-16 02:20:32] DEBUG: kmer_size=17
[2023-05-16 02:20:32] DEBUG: use_minimizers=0
[2023-05-16 02:20:32] DEBUG: reads_base_alignment=0
[2023-05-16 02:20:32] DEBUG: meta_read_top_kmer_rate=0.40
[2023-05-16 02:20:32] DEBUG: maximum_jump=1500
[2023-05-16 02:20:32] DEBUG: maximum_overhang=1500
[2023-05-16 02:20:32] DEBUG: repeat_kmer_rate=100
[2023-05-16 02:20:32] DEBUG: assemble_ovlp_divergence=0.10
[2023-05-16 02:20:32] DEBUG: assemble_divergence_relative=1
[2023-05-16 02:20:32] DEBUG: repeat_graph_ovlp_divergence=0.08
[2023-05-16 02:20:32] DEBUG: read_align_ovlp_divergence=0.25
[2023-05-16 02:20:32] DEBUG: hpc_scoring_on=0
[2023-05-16 02:20:32] DEBUG: add_unassembled_reads=0
[2023-05-16 02:20:32] DEBUG: extend_contigs_with_repeats=0
[2023-05-16 02:20:32] DEBUG: min_read_cov_cutoff=3
[2023-05-16 02:20:32] DEBUG: short_tip_length=20000
[2023-05-16 02:20:32] DEBUG: long_tip_length=100000
[2023-05-16 02:20:32] DEBUG: Running with k-mer size: 17
[2023-05-16 02:20:32] DEBUG: Selected minimum overlap 4000
[2023-05-16 02:20:32] INFO: Reading sequences
[2023-05-16 02:20:37] DEBUG: Building positional index
[2023-05-16 02:20:37] DEBUG: Total sequence: 278879070 bp
[2023-05-16 02:20:37] DEBUG: Flipped 0
[2023-05-16 02:20:37] DEBUG: UPath 1: -24 -> 9 -> 21 -> -13 -> 20 -> -10 -> 23 -> 17 -> 18 -> -26 -> -8 -> -28 -> -7 -> -22 -> -6 -> 25 -> 11 -> -27 -> 5
[2023-05-16 02:20:37] DEBUG: UPath 2: 14
[2023-05-16 02:20:37] DEBUG: UPath 3: 19
[2023-05-16 02:20:37] DEBUG: Final graph contains 3 egdes
[2023-05-16 02:20:37] DEBUG: Extending contigs into repeats
[2023-05-16 02:20:37] DEBUG: Covered 0 repetitive contigs
[2023-05-16 02:20:37] INFO: Generated 3 contigs
[2023-05-16 02:20:37] DEBUG: Writing FASTA
[2023-05-16 02:20:37] DEBUG: Generating scaffold connections
[2023-05-16 02:20:37] INFO: Added 0 scaffold connections
[2023-05-16 02:20:37] DEBUG: Writing Dot
[2023-05-16 02:20:37] DEBUG: Writing FASTA
[2023-05-16 02:20:37] DEBUG: Writing Gfa
[2023-05-16 02:20:37] DEBUG: Peak RAM usage: 0 Gb
-----------End assembly log------------
[2023-05-16 02:20:37] root: INFO: >>>STAGE: polishing
[2023-05-16 02:20:37] root: INFO: Polishing genome (1/1)
[2023-05-16 02:20:37] root: INFO: Running minimap2
[2023-05-16 02:22:55] root: INFO: Separating alignment into bubbles
[2023-05-16 02:27:18] root: DEBUG: Generated 328532 bubbles
[2023-05-16 02:27:18] root: DEBUG: Split 5 long bubbles
[2023-05-16 02:27:18] root: DEBUG: Skipped 0 empty bubbles
[2023-05-16 02:27:18] root: DEBUG: Skipped 1 bubbles with long branches
[2023-05-16 02:27:18] root: INFO: Alignment error rate: 0.064269
[2023-05-16 02:27:18] root: INFO: Correcting bubbles
[2023-05-16 02:31:34] root: DEBUG: Mean contig coverage: 56, selected threshold: 11
[2023-05-16 02:31:34] root: DEBUG: Filtered 0 contigs of total length 0
[2023-05-16 02:31:34] root: DEBUG: Generating polished GFA
[2023-05-16 02:31:38] root: DEBUG: 0 sequences remained unpolished
[2023-05-16 02:31:38] root: INFO: >>>STAGE: finalize
[2023-05-16 02:31:38] root: DEBUG: ---Output dir contents:----
[2023-05-16 02:31:38] root: DEBUG: Citrobacter_koseri_MINF/
[2023-05-16 02:31:38] root: DEBUG: 4.0 M assembly.fasta
[2023-05-16 02:31:38] root: DEBUG: 489.0 B assembly_graph.gv
[2023-05-16 02:31:38] root: DEBUG: 92.0 B params.json
[2023-05-16 02:31:38] root: DEBUG: 4.0 M assembly_graph.gfa
[2023-05-16 02:31:38] root: DEBUG: 36.0 K flye.log
[2023-05-16 02:31:38] root: DEBUG: 279.0 M final_filtered_long_reads.fastq.gz
[2023-05-16 02:31:38] root: DEBUG: 279.0 M chopper_long_reads.fastq.gz
[2023-05-16 02:31:38] root: DEBUG: 1.0 K Citrobacter_koseri_MINF_05162023_015813.log
[2023-05-16 02:31:38] root: DEBUG: 40-polishing/
[2023-05-16 02:31:38] root: DEBUG: 4.0 M polished_edges.gfa
[2023-05-16 02:31:38] root: DEBUG: 84.0 B filtered_contigs.fasta.fai
[2023-05-16 02:31:38] root: DEBUG: 1.0 K minimap.stderr
[2023-05-16 02:31:38] root: DEBUG: 84.0 B contigs_stats.txt
[2023-05-16 02:31:38] root: DEBUG: 2.0 K edges_aln.bam.bai
[2023-05-16 02:31:38] root: DEBUG: 1.0 M base_coverage.bed.gz
[2023-05-16 02:31:38] root: DEBUG: 84.0 B filtered_stats.txt
[2023-05-16 02:31:38] root: DEBUG: 120.0 K minimap_1.bam.bai
[2023-05-16 02:31:38] root: DEBUG: 4.0 M filtered_contigs.fasta
[2023-05-16 02:31:38] root: DEBUG: 30-contigger/
[2023-05-16 02:31:38] root: DEBUG: 0.0 B scaffolds_links.txt
[2023-05-16 02:31:38] root: DEBUG: 489.0 B graph_final.gv
[2023-05-16 02:31:38] root: DEBUG: 180.0 B contigs_stats.txt
[2023-05-16 02:31:38] root: DEBUG: 4.0 M graph_final.fasta
[2023-05-16 02:31:38] root: DEBUG: 4.0 M graph_final.gfa
[2023-05-16 02:31:38] root: DEBUG: 84.0 B contigs.fasta.fai
[2023-05-16 02:31:38] root: DEBUG: 4.0 M contigs.fasta
[2023-05-16 02:31:38] root: DEBUG: 10-consensus/
[2023-05-16 02:31:38] root: DEBUG: 1.0 K minimap.stderr
[2023-05-16 02:31:38] root: DEBUG: 127.0 K minimap.bam.bai
[2023-05-16 02:31:38] root: DEBUG: 4.0 M consensus.fasta
[2023-05-16 02:31:38] root: DEBUG: 00-assembly/
[2023-05-16 02:31:38] root: DEBUG: 97.0 B draft_assembly.fasta.fai
[2023-05-16 02:31:38] root: DEBUG: 4.0 M draft_assembly.fasta
[2023-05-16 02:31:38] root: DEBUG: 20-repeat/
[2023-05-16 02:31:38] root: DEBUG: 4.0 K repeat_graph_dump
[2023-05-16 02:31:38] root: DEBUG: 4.0 M repeat_graph_edges.fasta
[2023-05-16 02:31:38] root: DEBUG: 1.0 K graph_before_rr.gv
[2023-05-16 02:31:38] root: DEBUG: 2.0 K graph_after_rr.gv
[2023-05-16 02:31:38] root: DEBUG: 4.0 M graph_before_rr.fasta
[2023-05-16 02:31:38] root: DEBUG: 5.0 M read_alignment_dump
[2023-05-16 02:31:38] root: DEBUG: --------------------------
[2023-05-16 02:31:38] root: INFO: Assembly statistics:
Total length: 4840901
Fragments: 3
Fragments N50: 4757416
Largest frg: 4757416
Scaffolds: 0
Mean coverage: 56
[2023-05-16 02:31:38] root: INFO: Final assembly: /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/assembly.fasta
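To compare runs like these without reading the whole log, the "Assembly statistics" block can be pulled out of each flye.log and diffed. Below is a minimal sketch, not part of Flye itself: the helper names `parse_assembly_stats` and `diff_stats` are made up for illustration, and the parser assumes the statistics lines look like the block above (a label, a colon, an integer). The sample text is copied from the 1-thread log.

```python
import re

# Labels as they appear in the "Assembly statistics" block of the log above.
STAT_LINE = re.compile(
    r"^\s*(Total length|Fragments N50|Fragments|Largest frg|Scaffolds|Mean coverage)"
    r":\s*(\d+)\s*$"
)


def parse_assembly_stats(log_text):
    """Return the integer statistics that follow an 'Assembly statistics:' line.

    If the log holds several runs, only the last block is kept.
    """
    stats = {}
    in_block = False
    for line in log_text.splitlines():
        if "Assembly statistics:" in line:
            in_block = True
            stats = {}
            continue
        if in_block:
            match = STAT_LINE.match(line)
            if match:
                stats[match.group(1)] = int(match.group(2))
            elif line.strip():
                # First non-empty line that is not a statistic ends the block.
                in_block = False
    return stats


def diff_stats(a, b):
    """Return {label: (value_a, value_b)} for every label whose values differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Statistics block copied from the 1-thread run above.
SAMPLE_LOG = """\
[2023-05-16 02:31:38] root: INFO: Assembly statistics:
Total length: 4840901
Fragments: 3
Fragments N50: 4757416
Largest frg: 4757416
Scaffolds: 0
Mean coverage: 56
[2023-05-16 02:31:38] root: INFO: Final assembly: assembly.fasta
"""

stats = parse_assembly_stats(SAMPLE_LOG)
print(stats)
print(diff_stats(stats, stats))  # identical runs give an empty dict
```

In practice one would read each run's flye.log from its output directory and pass the two parsed dictionaries to `diff_stats`; any non-empty result points at the statistic that changed between runs.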
8 Threads
[2023-05-16 03:04:52] root: INFO: Starting Flye 2.9.2-b1786
[2023-05-16 03:04:52] root: DEBUG: Cmd: /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/bin/flye --nano-raw ../real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir ../real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF --threads 8 --deterministic
[2023-05-16 03:04:52] root: DEBUG: Python version: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08)
[GCC 7.5.0]
[2023-05-16 03:04:52] root: INFO: >>>STAGE: configure
[2023-05-16 03:04:52] root: INFO: Configuring run
[2023-05-16 03:04:57] root: INFO: Total read length: 278879070
[2023-05-16 03:04:57] root: INFO: Reads N50/N90: 16438 / 4092
[2023-05-16 03:04:57] root: INFO: Minimum overlap set to 4000
[2023-05-16 03:04:57] root: INFO: >>>STAGE: assembly
[2023-05-16 03:04:57] root: INFO: Assembling disjointigs
[2023-05-16 03:04:57] root: DEBUG: -----Begin assembly log------
[2023-05-16 03:04:57] root: DEBUG: Running: flye-modules assemble --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-asm /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/00-assembly/draft_assembly.fasta --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/flye.log --threads 1 --min-ovlp 4000
[2023-05-16 03:04:57] DEBUG: Build date: May 13 2023 06:30:35
[2023-05-16 03:04:57] DEBUG: Total RAM: 62 Gb
[2023-05-16 03:04:57] DEBUG: Available RAM: 58 Gb
[2023-05-16 03:04:57] DEBUG: Total CPUs: 16
[2023-05-16 03:04:57] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 03:04:57] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 03:04:57] DEBUG: big_genome_threshold=29000000
[2023-05-16 03:04:57] DEBUG: meta_read_filter_kmer_freq=100
[2023-05-16 03:04:57] DEBUG: chain_large_gap_penalty=2
[2023-05-16 03:04:57] DEBUG: chain_small_gap_penalty=0.5
[2023-05-16 03:04:57] DEBUG: chain_gap_jump_threshold=100
[2023-05-16 03:04:57] DEBUG: max_coverage_drop_rate=5
[2023-05-16 03:04:57] DEBUG: max_extensions_drop_rate=5
[2023-05-16 03:04:57] DEBUG: chimera_window=100
[2023-05-16 03:04:57] DEBUG: chimera_overhang=1000
[2023-05-16 03:04:57] DEBUG: min_reads_in_disjointig=4
[2023-05-16 03:04:57] DEBUG: max_inner_reads=10
[2023-05-16 03:04:57] DEBUG: max_inner_fraction=0.25
[2023-05-16 03:04:57] DEBUG: max_separation=500
[2023-05-16 03:04:57] DEBUG: unique_edge_length=50000
[2023-05-16 03:04:57] DEBUG: min_repeat_res_support=0.51
[2023-05-16 03:04:57] DEBUG: out_paths_ratio=5
[2023-05-16 03:04:57] DEBUG: graph_cov_drop_rate=5
[2023-05-16 03:04:57] DEBUG: coverage_estimate_window=100
[2023-05-16 03:04:57] DEBUG: max_bubble_length=50000
[2023-05-16 03:04:57] DEBUG: loop_coverage_rate=1.5
[2023-05-16 03:04:57] DEBUG: repeat_edge_cov_mult=1.75
[2023-05-16 03:04:57] DEBUG: weak_detach_rate=5
[2023-05-16 03:04:57] DEBUG: tip_coverage_rate=2
[2023-05-16 03:04:57] DEBUG: tip_length_rate=2
[2023-05-16 03:04:57] DEBUG: output_gfa_before_rr=0
[2023-05-16 03:04:57] DEBUG: remove_alt_edges=0
[2023-05-16 03:04:57] DEBUG: low_cutoff_warning=1
[2023-05-16 03:04:57] DEBUG: kmer_size=17
[2023-05-16 03:04:57] DEBUG: use_minimizers=0
[2023-05-16 03:04:57] DEBUG: reads_base_alignment=0
[2023-05-16 03:04:57] DEBUG: meta_read_top_kmer_rate=0.40
[2023-05-16 03:04:57] DEBUG: maximum_jump=1500
[2023-05-16 03:04:57] DEBUG: maximum_overhang=1500
[2023-05-16 03:04:57] DEBUG: repeat_kmer_rate=100
[2023-05-16 03:04:57] DEBUG: assemble_ovlp_divergence=0.10
[2023-05-16 03:04:57] DEBUG: assemble_divergence_relative=1
[2023-05-16 03:04:57] DEBUG: repeat_graph_ovlp_divergence=0.08
[2023-05-16 03:04:57] DEBUG: read_align_ovlp_divergence=0.25
[2023-05-16 03:04:57] DEBUG: hpc_scoring_on=0
[2023-05-16 03:04:57] DEBUG: add_unassembled_reads=0
[2023-05-16 03:04:57] DEBUG: extend_contigs_with_repeats=0
[2023-05-16 03:04:57] DEBUG: min_read_cov_cutoff=3
[2023-05-16 03:04:57] DEBUG: short_tip_length=20000
[2023-05-16 03:04:57] DEBUG: long_tip_length=100000
[2023-05-16 03:04:57] DEBUG: Running with k-mer size: 17
[2023-05-16 03:04:57] DEBUG: Running with minimum overlap 4000
[2023-05-16 03:04:57] DEBUG: Metagenome mode: N
[2023-05-16 03:04:57] DEBUG: Short mode: N
[2023-05-16 03:04:57] INFO: Reading sequences
[2023-05-16 03:05:00] DEBUG: Building positional index
[2023-05-16 03:05:00] DEBUG: Total sequence: 251729388 bp
[2023-05-16 03:05:03] INFO: Counting k-mers:
[2023-05-16 03:05:52] DEBUG: Updating k-mer histogram
[2023-05-16 03:06:46] DEBUG: Hash size: 4622497
[2023-05-16 03:06:46] DEBUG: Total k-mers 88307420
[2023-05-16 03:06:46] INFO: Filling index table (1/2)
[2023-05-16 03:08:38] DEBUG: Mean k-mer frequency: 21.8363
[2023-05-16 03:08:38] DEBUG: Repetitive k-mer frequency: 2183
[2023-05-16 03:08:38] DEBUG: Filtered 18624 repetitive k-mers (0.000182569)
[2023-05-16 03:08:39] INFO: Filling index table (2/2)
[2023-05-16 03:10:36] DEBUG: Sorting k-mer index
[2023-05-16 03:10:36] DEBUG: Selected k-mers: 6031079
[2023-05-16 03:10:36] DEBUG: Index size: 103351604
[2023-05-16 03:10:36] DEBUG: Mean k-mer index frequency: 17.1365
[2023-05-16 03:10:36] DEBUG: Peak RAM usage: 9 Gb
[2023-05-16 03:10:36] DEBUG: Estimating k-mer identity bias
[2023-05-16 03:11:26] DEBUG: Initial divergence estimate : 0.0717829
[2023-05-16 03:11:26] DEBUG: Relative threshold: Y
[2023-05-16 03:11:26] DEBUG: Max divergence threshold set to 0.171783
[2023-05-16 03:11:26] INFO: Extending reads
[2023-05-16 03:11:26] DEBUG: Estimating overlap coverage
[2023-05-16 03:12:18] INFO: Overlap-based coverage: 49
[2023-05-16 03:12:18] INFO: Median overlap divergence: 0.0716492
[2023-05-16 03:12:18] DEBUG: Sequence divergence distribution:
[ASCII histogram of sequence divergence; x-axis spans 0% to 45%]
Q25 = 0.064, Q50 = 0.072, Q75 = 0.082
[2023-05-16 03:13:27] DEBUG: Assembled disjointig 1
With 373 reads
Start read: +d0f225cb-3a25-4d2c-b63a-e5a8af2de32c
At position: 369
leftTip: 0 rightTip: 0
Suspicious: 1
Short ext: 1
Mean extensions: 39
Avg overlap len: 38582
Min overlap len: 2009
Inner reads: 0
Length: 4797752
[2023-05-16 03:13:27] DEBUG: Inner: 32900 covered: 32974 total: 35900
[2023-05-16 03:13:35] DEBUG: Assembled disjointig 2
With 21 reads
Start read: +9f6116ca-576f-4d7f-b606-f3289ccd4a8f
At position: 8
leftTip: 0 rightTip: 0
Suspicious: 1
Short ext: 1
Mean extensions: 61
Avg overlap len: 56389
Min overlap len: 2220
Inner reads: 0
Length: 188876
[2023-05-16 03:13:35] DEBUG: Inner: 33808 covered: 33890 total: 35900
[2023-05-16 03:13:37] DEBUG: Assembled disjointig 3
With 21 reads
Start read: +a57a2ff9-5e67-4ab8-be97-2b06a7698c24
At position: 7
leftTip: 0 rightTip: 0
Suspicious: 0
Short ext: 0
Mean extensions: 66
Avg overlap len: 7517
Min overlap len: 6171
Inner reads: 0
Length: 42899
[2023-05-16 03:13:37] DEBUG: Inner: 34308 covered: 34392 total: 35900
[2023-05-16 03:13:57] INFO: Assembled 3 disjointigs
[2023-05-16 03:13:57] INFO: Generating sequence
[2023-05-16 03:14:03] DEBUG: Building positional index
[2023-05-16 03:14:03] DEBUG: Total sequence: 5030425 bp
[2023-05-16 03:14:05] DEBUG: Mean k-mer frequency: 1.03893
[2023-05-16 03:14:05] DEBUG: Repetitive k-mer frequency: 103
[2023-05-16 03:14:05] DEBUG: Filtered 0 repetitive k-mers (0)
[2023-05-16 03:14:06] DEBUG: Sorting k-mer index
[2023-05-16 03:14:07] DEBUG: Selected k-mers: 4841871
[2023-05-16 03:14:07] DEBUG: K-mer index size: 5030374
[2023-05-16 03:14:07] DEBUG: Mean k-mer frequency: 1.03893
[2023-05-16 03:14:07] DEBUG: Minimizer rate: 1.00001
[2023-05-16 03:14:07] INFO: Filtering contained disjointigs
[2023-05-16 03:14:09] DEBUG: Computing transitive closure for overlaps
[2023-05-16 03:14:09] DEBUG: Found 12 overlaps
[2023-05-16 03:14:09] DEBUG: Left 12 overlaps after filtering
[2023-05-16 03:14:09] INFO: Contained seqs: 0
[2023-05-16 03:14:09] DEBUG: Writing FASTA
[2023-05-16 03:14:09] DEBUG: Peak RAM usage: 9 Gb
-----------End assembly log------------
[2023-05-16 03:14:09] root: DEBUG: Disjointigs length: 5030425, N50: 4797882
[2023-05-16 03:14:09] root: INFO: >>>STAGE: consensus
[2023-05-16 03:14:09] root: INFO: Running Minimap2
[2023-05-16 03:14:35] root: INFO: Computing consensus
[2023-05-16 03:15:42] root: INFO: Alignment error rate: 0.104514
[2023-05-16 03:15:42] root: INFO: >>>STAGE: repeat
[2023-05-16 03:15:42] root: INFO: Building and resolving repeat graph
[2023-05-16 03:15:42] root: DEBUG: -----Begin repeat analyser log------
[2023-05-16 03:15:42] root: DEBUG: Running: flye-modules repeat --disjointigs /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/10-consensus/consensus.fasta --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/20-repeat --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/flye.log --threads 8 --min-ovlp 4000
[2023-05-16 03:15:42] DEBUG: Build date: May 13 2023 06:30:59
[2023-05-16 03:15:42] DEBUG: Total RAM: 62 Gb
[2023-05-16 03:15:42] DEBUG: Available RAM: 58 Gb
[2023-05-16 03:15:42] DEBUG: Total CPUs: 16
[2023-05-16 03:15:42] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 03:15:42] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 03:15:42] DEBUG: big_genome_threshold=29000000
[2023-05-16 03:15:42] DEBUG: meta_read_filter_kmer_freq=100
[2023-05-16 03:15:42] DEBUG: chain_large_gap_penalty=2
[2023-05-16 03:15:42] DEBUG: chain_small_gap_penalty=0.5
[2023-05-16 03:15:42] DEBUG: chain_gap_jump_threshold=100
[2023-05-16 03:15:42] DEBUG: max_coverage_drop_rate=5
[2023-05-16 03:15:42] DEBUG: max_extensions_drop_rate=5
[2023-05-16 03:15:42] DEBUG: chimera_window=100
[2023-05-16 03:15:42] DEBUG: chimera_overhang=1000
[2023-05-16 03:15:42] DEBUG: min_reads_in_disjointig=4
[2023-05-16 03:15:42] DEBUG: max_inner_reads=10
[2023-05-16 03:15:42] DEBUG: max_inner_fraction=0.25
[2023-05-16 03:15:42] DEBUG: max_separation=500
[2023-05-16 03:15:42] DEBUG: unique_edge_length=50000
[2023-05-16 03:15:42] DEBUG: min_repeat_res_support=0.51
[2023-05-16 03:15:42] DEBUG: out_paths_ratio=5
[2023-05-16 03:15:42] DEBUG: graph_cov_drop_rate=5
[2023-05-16 03:15:42] DEBUG: coverage_estimate_window=100
[2023-05-16 03:15:42] DEBUG: max_bubble_length=50000
[2023-05-16 03:15:42] DEBUG: loop_coverage_rate=1.5
[2023-05-16 03:15:42] DEBUG: repeat_edge_cov_mult=1.75
[2023-05-16 03:15:42] DEBUG: weak_detach_rate=5
[2023-05-16 03:15:42] DEBUG: tip_coverage_rate=2
[2023-05-16 03:15:42] DEBUG: tip_length_rate=2
[2023-05-16 03:15:42] DEBUG: output_gfa_before_rr=0
[2023-05-16 03:15:42] DEBUG: remove_alt_edges=0
[2023-05-16 03:15:42] DEBUG: low_cutoff_warning=1
[2023-05-16 03:15:42] DEBUG: kmer_size=17
[2023-05-16 03:15:42] DEBUG: use_minimizers=0
[2023-05-16 03:15:42] DEBUG: reads_base_alignment=0
[2023-05-16 03:15:42] DEBUG: meta_read_top_kmer_rate=0.40
[2023-05-16 03:15:42] DEBUG: maximum_jump=1500
[2023-05-16 03:15:42] DEBUG: maximum_overhang=1500
[2023-05-16 03:15:42] DEBUG: repeat_kmer_rate=100
[2023-05-16 03:15:42] DEBUG: assemble_ovlp_divergence=0.10
[2023-05-16 03:15:42] DEBUG: assemble_divergence_relative=1
[2023-05-16 03:15:42] DEBUG: repeat_graph_ovlp_divergence=0.08
[2023-05-16 03:15:42] DEBUG: read_align_ovlp_divergence=0.25
[2023-05-16 03:15:42] DEBUG: hpc_scoring_on=0
[2023-05-16 03:15:42] DEBUG: add_unassembled_reads=0
[2023-05-16 03:15:42] DEBUG: extend_contigs_with_repeats=0
[2023-05-16 03:15:42] DEBUG: min_read_cov_cutoff=3
[2023-05-16 03:15:42] DEBUG: short_tip_length=20000
[2023-05-16 03:15:42] DEBUG: long_tip_length=100000
[2023-05-16 03:15:42] DEBUG: Running with k-mer size: 17
[2023-05-16 03:15:42] DEBUG: Selected minimum overlap 4000
[2023-05-16 03:15:42] DEBUG: Metagenome mode: N
[2023-05-16 03:15:42] INFO: Parsing disjointigs
[2023-05-16 03:15:42] DEBUG: Building positional index
[2023-05-16 03:15:42] DEBUG: Total sequence: 5076617 bp
[2023-05-16 03:15:42] INFO: Building repeat graph
[2023-05-16 03:15:44] DEBUG: Mean k-mer frequency: 1.06798
[2023-05-16 03:15:44] DEBUG: Repetitive k-mer frequency: 106
[2023-05-16 03:15:44] DEBUG: Filtered 266 repetitive k-mers (5.23976e-05)
[2023-05-16 03:15:45] DEBUG: Sorting k-mer index
[2023-05-16 03:15:45] DEBUG: Selected k-mers: 4753412
[2023-05-16 03:15:45] DEBUG: K-mer index size: 5076300
[2023-05-16 03:15:45] DEBUG: Mean k-mer frequency: 1.06793
[2023-05-16 03:15:45] DEBUG: Minimizer rate: 1.00006
[2023-05-16 03:15:48] DEBUG: Computing transitive closure for overlaps
[2023-05-16 03:15:48] DEBUG: Found 252 overlaps
[2023-05-16 03:15:48] DEBUG: Left 100 overlaps after filtering
[2023-05-16 03:15:48] INFO: Median overlap divergence: 0.0385451
[2023-05-16 03:15:48] DEBUG: Sequence divergence distribution:
[ASCII histogram of sequence divergence; x-axis spans 0% to 45%]
Q25 = 0.015, Q50 = 0.039, Q75 = 0.052
[2023-05-16 03:15:48] DEBUG: Computing gluepoints
[2023-05-16 03:15:48] DEBUG: Added 0 gluepoint projections
[2023-05-16 03:15:48] DEBUG: Created 68 gluepoints
[2023-05-16 03:15:48] DEBUG: Artificial loops removed: 0 left, 0 right, 0 both
[2023-05-16 03:15:48] DEBUG: Initializing edges
[2023-05-16 03:15:48] DEBUG: Edges length checksum: 2273136170
[2023-05-16 03:15:48] DEBUG: Filtered 0 singleton segments
[2023-05-16 03:15:48] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 03:15:48] DEBUG: Collapsed 3 edges
[2023-05-16 03:15:48] DEBUG: * 19 +disjointig_1 91498 612411 520913
[2023-05-16 03:15:48] DEBUG: 4 +disjointig_1 612411 617825 5414
[2023-05-16 03:15:48] DEBUG: * 10 +disjointig_1 617825 1354882 737057
[2023-05-16 03:15:48] DEBUG: 12 +disjointig_1 1354882 1383671 28789
[2023-05-16 03:15:48] DEBUG: * -13 +disjointig_1 1383671 1661038 277367
[2023-05-16 03:15:48] DEBUG: 12 +disjointig_1 1661038 1689822 28784
[2023-05-16 03:15:48] DEBUG: * -11 +disjointig_1 1689822 3220286 1530464
[2023-05-16 03:15:48] DEBUG: -4 +disjointig_1 3220286 3225683 5397
[2023-05-16 03:15:48] DEBUG: * 6 +disjointig_1 3225683 3934891 709208
[2023-05-16 03:15:48] DEBUG: -4 +disjointig_1 3934891 3940023 5132
[2023-05-16 03:15:48] DEBUG: * 7 +disjointig_1 3940023 3972909 32886
[2023-05-16 03:15:48] DEBUG: -4 +disjointig_1 3972909 3978219 5310
[2023-05-16 03:15:48] DEBUG: * 8 +disjointig_1 3978219 4103369 125150
[2023-05-16 03:15:48] DEBUG: -4 +disjointig_1 4103369 4108685 5316
[2023-05-16 03:15:48] DEBUG: * 9 +disjointig_1 4108685 4193387 84702
[2023-05-16 03:15:48] DEBUG: -4 +disjointig_1 4193387 4198537 5150
[2023-05-16 03:15:48] DEBUG: * 5 +disjointig_1 4198537 4684469 485932
[2023-05-16 03:15:48] DEBUG: 4 +disjointig_1 4684469 4689761 5292
[2023-05-16 03:15:48] DEBUG: * 18 +disjointig_1 4689761 4843327 153566
[2023-05-16 03:15:48] DEBUG: 20 +disjointig_2 61224 190410 129186
[2023-05-16 03:15:48] DEBUG: 21 +disjointig_3 6269 42863 36594
[2023-05-16 03:15:48] DEBUG: Total edges: 21
[2023-05-16 03:15:48] INFO: Parsing reads
[2023-05-16 03:15:51] DEBUG: Building positional index
[2023-05-16 03:15:51] DEBUG: Total sequence: 278879070 bp
[2023-05-16 03:15:51] DEBUG: Building positional index
[2023-05-16 03:15:51] DEBUG: Total sequence: 4917609 bp
[2023-05-16 03:15:51] INFO: Aligning reads to the graph
[2023-05-16 03:15:52] DEBUG: Mean k-mer frequency: 1.03575
[2023-05-16 03:15:52] DEBUG: Repetitive k-mer frequency: 103
[2023-05-16 03:15:52] DEBUG: Filtered 262 repetitive k-mers (5.32822e-05)
[2023-05-16 03:15:53] DEBUG: Sorting k-mer index
[2023-05-16 03:15:53] DEBUG: Selected k-mers: 4747508
[2023-05-16 03:15:53] DEBUG: K-mer index size: 4916956
[2023-05-16 03:15:53] DEBUG: Mean k-mer frequency: 1.03569
[2023-05-16 03:15:53] DEBUG: Minimizer rate: 1.00013
[2023-05-16 03:16:14] DEBUG: Total reads : 17950
[2023-05-16 03:16:14] DEBUG: Read with aligned parts : 17458
[2023-05-16 03:16:14] DEBUG: Aligned in one piece : 17383
[2023-05-16 03:16:14] INFO: Aligned read sequence: 243823657 / 251729388 (0.968594)
[2023-05-16 03:16:14] INFO: Median overlap divergence: 0.0275118
[2023-05-16 03:16:14] DEBUG: Sequence divergence distribution:
[ASCII histogram of sequence divergence; x-axis spans 0% to 45%]
Q25 = 0.022, Q50 = 0.028, Q75 = 0.037
[2023-05-16 03:16:14] INFO: Mean edge coverage: 51
[2023-05-16 03:16:14] DEBUG: 4 len:5287 cov:398 mult:7.80392
[2023-05-16 03:16:14] DEBUG: -4 len:5287 cov:398 mult:7.80392
[2023-05-16 03:16:14] DEBUG: 5 len:485932 cov:56 mult:1.09804
[2023-05-16 03:16:14] DEBUG: -5 len:485932 cov:56 mult:1.09804
[2023-05-16 03:16:14] DEBUG: 6 len:709208 cov:49 mult:0.960784
[2023-05-16 03:16:14] DEBUG: -6 len:709208 cov:49 mult:0.960784
[2023-05-16 03:16:14] DEBUG: 7 len:32886 cov:60 mult:1.17647
[2023-05-16 03:16:14] DEBUG: -7 len:32886 cov:60 mult:1.17647
[2023-05-16 03:16:14] DEBUG: 8 len:125150 cov:55 mult:1.07843
[2023-05-16 03:16:14] DEBUG: -8 len:125150 cov:55 mult:1.07843
[2023-05-16 03:16:14] DEBUG: 9 len:84702 cov:51 mult:1
[2023-05-16 03:16:14] DEBUG: -9 len:84702 cov:51 mult:1
[2023-05-16 03:16:14] DEBUG: 10 len:737057 cov:48 mult:0.941176
[2023-05-16 03:16:14] DEBUG: -10 len:737057 cov:48 mult:0.941176
[2023-05-16 03:16:14] DEBUG: 11 len:1530464 cov:47 mult:0.921569
[2023-05-16 03:16:14] DEBUG: -11 len:1530464 cov:47 mult:0.921569
[2023-05-16 03:16:14] DEBUG: 12 len:28786 cov:101 mult:1.98039
[2023-05-16 03:16:14] DEBUG: -12 len:28786 cov:101 mult:1.98039
[2023-05-16 03:16:14] DEBUG: 13 len:277367 cov:46 mult:0.901961
[2023-05-16 03:16:14] DEBUG: -13 len:277367 cov:46 mult:0.901961
[2023-05-16 03:16:14] DEBUG: 18 len:153566 cov:51 mult:1
[2023-05-16 03:16:14] DEBUG: -18 len:153566 cov:51 mult:1
[2023-05-16 03:16:14] DEBUG: 19 len:520913 cov:50 mult:0.980392
[2023-05-16 03:16:14] DEBUG: -19 len:520913 cov:50 mult:0.980392
[2023-05-16 03:16:14] DEBUG: 20 len:64593 cov:112 mult:2.19608
[2023-05-16 03:16:14] DEBUG: -20 len:64593 cov:112 mult:2.19608
[2023-05-16 03:16:14] DEBUG: 21 len:18297 cov:109 mult:2.13725
[2023-05-16 03:16:14] DEBUG: -21 len:18297 cov:109 mult:2.13725
[2023-05-16 03:16:14] DEBUG: Unique coverage threshold 96
[2023-05-16 03:16:14] INFO: Simplifying the graph
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: [SIMPL] Removed 0 paths with low coverage
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 03:16:14] DEBUG: Finding repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: High-cov: 4 5287 398
[2023-05-16 03:16:14] DEBUG: High-cov: 12 28786 101
[2023-05-16 03:16:14] DEBUG: High-cov: 20 64593 112
[2023-05-16 03:16:14] DEBUG: High-cov: 21 18297 109
[2023-05-16 03:16:14] DEBUG: Repeat detection iteration 1
[2023-05-16 03:16:14] DEBUG: Writing Dot
[2023-05-16 03:16:14] DEBUG: Writing FASTA
[2023-05-16 03:16:14] DEBUG: [SIMPL] == Iteration 1 ==
[2023-05-16 03:16:14] DEBUG: Splitting nodes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Split 0 nodes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Clipped 0 short and 0 long tips
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 03:16:14] DEBUG: Finding repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: High-cov: 4 5287 398
[2023-05-16 03:16:14] DEBUG: High-cov: 12 28786 101
[2023-05-16 03:16:14] DEBUG: High-cov: 20 64593 112
[2023-05-16 03:16:14] DEBUG: High-cov: 21 18297 109
[2023-05-16 03:16:14] DEBUG: Repeat detection iteration 1
[2023-05-16 03:16:14] DEBUG: Total unique edges: 10
[2023-05-16 03:16:14] DEBUG: Connection -13 -11 7 1
[2023-05-16 03:16:14] DEBUG: Connection 13 -10 3 1
[2023-05-16 03:16:14] DEBUG: Connection 8 9 33 1
[2023-05-16 03:16:14] DEBUG: Connection -11 6 29 1
[2023-05-16 03:16:14] DEBUG: Connection 19 10 25 1
[2023-05-16 03:16:14] DEBUG: Connection 9 5 23 1
[2023-05-16 03:16:14] DEBUG: Connection -7 -6 31 1
[2023-05-16 03:16:14] DEBUG: Connection -18 -5 29 1
[2023-05-16 03:16:14] DEBUG: Connection -8 -7 37 1
[2023-05-16 03:16:14] DEBUG: [SIMPL] Resolved repeats: 9
[2023-05-16 03:16:14] DEBUG: RR links: 434
[2023-05-16 03:16:14] DEBUG: Unresolved: 0
[2023-05-16 03:16:14] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 03:16:14] DEBUG: [SIMPL] == Iteration 2 ==
[2023-05-16 03:16:14] DEBUG: Splitting nodes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Split 0 nodes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Clipped 0 short and 0 long tips
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 03:16:14] DEBUG: Finding repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: High-cov: 20 64593 112
[2023-05-16 03:16:14] DEBUG: High-cov: 21 18297 109
[2023-05-16 03:16:14] DEBUG: Repeat detection iteration 1
[2023-05-16 03:16:14] DEBUG: Total unique edges: 19
[2023-05-16 03:16:14] DEBUG: [SIMPL] Resolved repeats: 0
[2023-05-16 03:16:14] DEBUG: RR links: 0
[2023-05-16 03:16:14] DEBUG: Unresolved: 0
[2023-05-16 03:16:14] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 03:16:14] DEBUG: [SIMPL] Collapsed 0 haplotypes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Resolved 0 simple repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: [SIMPL] Removed 0 paths with low coverage
[2023-05-16 03:16:14] DEBUG: Finding repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: High-cov: 20 64593 112
[2023-05-16 03:16:14] DEBUG: High-cov: 21 18297 109
[2023-05-16 03:16:14] DEBUG: Repeat detection iteration 1
[2023-05-16 03:16:14] DEBUG: Writing Dot
[2023-05-16 03:16:14] DEBUG: Writing FASTA
[2023-05-16 03:16:14] DEBUG: Peak RAM usage: 0 Gb
-----------End assembly log------------
[2023-05-16 03:16:14] root: INFO: >>>STAGE: contigger
[2023-05-16 03:16:14] root: INFO: Generating contigs
[2023-05-16 03:16:14] root: DEBUG: -----Begin contigger analyser log------
[2023-05-16 03:16:14] root: DEBUG: Running: flye-modules contigger --graph-edges /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/20-repeat/repeat_graph_edges.fasta --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/30-contigger --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --repeat-graph /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/20-repeat/repeat_graph_dump --graph-aln /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/20-repeat/read_alignment_dump --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/flye.log --threads 8 --min-ovlp 4000
[2023-05-16 03:16:14] DEBUG: Build date: May 13 2023 06:31:26
[2023-05-16 03:16:14] DEBUG: Total RAM: 62 Gb
[2023-05-16 03:16:14] DEBUG: Available RAM: 58 Gb
[2023-05-16 03:16:14] DEBUG: Total CPUs: 16
[2023-05-16 03:16:14] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 03:16:14] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 03:16:14] DEBUG: big_genome_threshold=29000000
[2023-05-16 03:16:14] DEBUG: meta_read_filter_kmer_freq=100
[2023-05-16 03:16:14] DEBUG: chain_large_gap_penalty=2
[2023-05-16 03:16:14] DEBUG: chain_small_gap_penalty=0.5
[2023-05-16 03:16:14] DEBUG: chain_gap_jump_threshold=100
[2023-05-16 03:16:14] DEBUG: max_coverage_drop_rate=5
[2023-05-16 03:16:14] DEBUG: max_extensions_drop_rate=5
[2023-05-16 03:16:14] DEBUG: chimera_window=100
[2023-05-16 03:16:14] DEBUG: chimera_overhang=1000
[2023-05-16 03:16:14] DEBUG: min_reads_in_disjointig=4
[2023-05-16 03:16:14] DEBUG: max_inner_reads=10
[2023-05-16 03:16:14] DEBUG: max_inner_fraction=0.25
[2023-05-16 03:16:14] DEBUG: max_separation=500
[2023-05-16 03:16:14] DEBUG: unique_edge_length=50000
[2023-05-16 03:16:14] DEBUG: min_repeat_res_support=0.51
[2023-05-16 03:16:14] DEBUG: out_paths_ratio=5
[2023-05-16 03:16:14] DEBUG: graph_cov_drop_rate=5
[2023-05-16 03:16:14] DEBUG: coverage_estimate_window=100
[2023-05-16 03:16:14] DEBUG: max_bubble_length=50000
[2023-05-16 03:16:14] DEBUG: loop_coverage_rate=1.5
[2023-05-16 03:16:14] DEBUG: repeat_edge_cov_mult=1.75
[2023-05-16 03:16:14] DEBUG: weak_detach_rate=5
[2023-05-16 03:16:14] DEBUG: tip_coverage_rate=2
[2023-05-16 03:16:14] DEBUG: tip_length_rate=2
[2023-05-16 03:16:14] DEBUG: output_gfa_before_rr=0
[2023-05-16 03:16:14] DEBUG: remove_alt_edges=0
[2023-05-16 03:16:14] DEBUG: low_cutoff_warning=1
[2023-05-16 03:16:14] DEBUG: kmer_size=17
[2023-05-16 03:16:14] DEBUG: use_minimizers=0
[2023-05-16 03:16:14] DEBUG: reads_base_alignment=0
[2023-05-16 03:16:14] DEBUG: meta_read_top_kmer_rate=0.40
[2023-05-16 03:16:14] DEBUG: maximum_jump=1500
[2023-05-16 03:16:14] DEBUG: maximum_overhang=1500
[2023-05-16 03:16:14] DEBUG: repeat_kmer_rate=100
[2023-05-16 03:16:14] DEBUG: assemble_ovlp_divergence=0.10
[2023-05-16 03:16:14] DEBUG: assemble_divergence_relative=1
[2023-05-16 03:16:14] DEBUG: repeat_graph_ovlp_divergence=0.08
[2023-05-16 03:16:14] DEBUG: read_align_ovlp_divergence=0.25
[2023-05-16 03:16:14] DEBUG: hpc_scoring_on=0
[2023-05-16 03:16:14] DEBUG: add_unassembled_reads=0
[2023-05-16 03:16:14] DEBUG: extend_contigs_with_repeats=0
[2023-05-16 03:16:14] DEBUG: min_read_cov_cutoff=3
[2023-05-16 03:16:14] DEBUG: short_tip_length=20000
[2023-05-16 03:16:14] DEBUG: long_tip_length=100000
[2023-05-16 03:16:14] DEBUG: Running with k-mer size: 17
[2023-05-16 03:16:14] DEBUG: Selected minimum overlap 4000
[2023-05-16 03:16:14] INFO: Reading sequences
[2023-05-16 03:16:17] DEBUG: Building positional index
[2023-05-16 03:16:17] DEBUG: Total sequence: 278879070 bp
[2023-05-16 03:16:17] DEBUG: Flipped 0
[2023-05-16 03:16:17] DEBUG: UPath 1: -29 -> 18 -> 19 -> -26 -> 10 -> 23 -> -13 -> 22 -> -11 -> 25 -> 6 -> -28 -> 7 -> 30 -> 8 -> -24 -> 9 -> 27 -> 5
[2023-05-16 03:16:17] DEBUG: UPath 2: 20
[2023-05-16 03:16:17] DEBUG: UPath 3: 21
[2023-05-16 03:16:17] DEBUG: Final graph contains 3 egdes
[2023-05-16 03:16:17] DEBUG: Extending contigs into repeats
[2023-05-16 03:16:17] DEBUG: Covered 0 repetitive contigs
[2023-05-16 03:16:17] INFO: Generated 3 contigs
[2023-05-16 03:16:17] DEBUG: Writing FASTA
[2023-05-16 03:16:17] DEBUG: Generating scaffold connections
[2023-05-16 03:16:17] INFO: Added 0 scaffold connections
[2023-05-16 03:16:17] DEBUG: Writing Dot
[2023-05-16 03:16:17] DEBUG: Writing FASTA
[2023-05-16 03:16:17] DEBUG: Writing Gfa
[2023-05-16 03:16:17] DEBUG: Peak RAM usage: 0 Gb
-----------End assembly log------------
[2023-05-16 03:16:17] root: INFO: >>>STAGE: polishing
[2023-05-16 03:16:17] root: INFO: Polishing genome (1/1)
[2023-05-16 03:16:17] root: INFO: Running minimap2
[2023-05-16 03:16:37] root: INFO: Separating alignment into bubbles
[2023-05-16 03:17:58] root: DEBUG: Generated 328466 bubbles
[2023-05-16 03:17:58] root: DEBUG: Split 5 long bubbles
[2023-05-16 03:17:58] root: DEBUG: Skipped 1 empty bubbles
[2023-05-16 03:17:58] root: DEBUG: Skipped 1 bubbles with long branches
[2023-05-16 03:17:58] root: INFO: Alignment error rate: 0.064162
[2023-05-16 03:17:58] root: INFO: Correcting bubbles
[2023-05-16 03:18:35] root: DEBUG: Mean contig coverage: 56, selected threshold: 11
[2023-05-16 03:18:35] root: DEBUG: Filtered 0 contigs of total length 0
[2023-05-16 03:18:35] root: DEBUG: Generating polished GFA
[2023-05-16 03:18:38] root: DEBUG: 0 sequences remained unpolished
[2023-05-16 03:18:38] root: INFO: >>>STAGE: finalize
[2023-05-16 03:18:38] root: DEBUG: ---Output dir contents:----
[2023-05-16 03:18:38] root: DEBUG: Citrobacter_koseri_MINF/
[2023-05-16 03:18:38] root: DEBUG: 1.0 K Citrobacter_koseri_MINF_05162023_030414.log
[2023-05-16 03:18:38] root: DEBUG: 4.0 M assembly.fasta
[2023-05-16 03:18:38] root: DEBUG: 487.0 B assembly_graph.gv
[2023-05-16 03:18:38] root: DEBUG: 92.0 B params.json
[2023-05-16 03:18:38] root: DEBUG: 4.0 M assembly_graph.gfa
[2023-05-16 03:18:38] root: DEBUG: 36.0 K flye.log
[2023-05-16 03:18:38] root: DEBUG: 279.0 M final_filtered_long_reads.fastq.gz
[2023-05-16 03:18:38] root: DEBUG: 279.0 M chopper_long_reads.fastq.gz
[2023-05-16 03:18:38] root: DEBUG: 40-polishing/
[2023-05-16 03:18:38] root: DEBUG: 4.0 M polished_edges.gfa
[2023-05-16 03:18:38] root: DEBUG: 84.0 B filtered_contigs.fasta.fai
[2023-05-16 03:18:38] root: DEBUG: 1.0 K minimap.stderr
[2023-05-16 03:18:38] root: DEBUG: 84.0 B contigs_stats.txt
[2023-05-16 03:18:38] root: DEBUG: 2.0 K edges_aln.bam.bai
[2023-05-16 03:18:38] root: DEBUG: 1.0 M base_coverage.bed.gz
[2023-05-16 03:18:38] root: DEBUG: 84.0 B filtered_stats.txt
[2023-05-16 03:18:38] root: DEBUG: 120.0 K minimap_1.bam.bai
[2023-05-16 03:18:38] root: DEBUG: 4.0 M filtered_contigs.fasta
[2023-05-16 03:18:38] root: DEBUG: 30-contigger/
[2023-05-16 03:18:38] root: DEBUG: 0.0 B scaffolds_links.txt
[2023-05-16 03:18:38] root: DEBUG: 487.0 B graph_final.gv
[2023-05-16 03:18:38] root: DEBUG: 180.0 B contigs_stats.txt
[2023-05-16 03:18:38] root: DEBUG: 4.0 M graph_final.fasta
[2023-05-16 03:18:38] root: DEBUG: 4.0 M graph_final.gfa
[2023-05-16 03:18:38] root: DEBUG: 84.0 B contigs.fasta.fai
[2023-05-16 03:18:38] root: DEBUG: 4.0 M contigs.fasta
[2023-05-16 03:18:38] root: DEBUG: 10-consensus/
[2023-05-16 03:18:38] root: DEBUG: 1.0 K minimap.stderr
[2023-05-16 03:18:38] root: DEBUG: 128.0 K minimap.bam.bai
[2023-05-16 03:18:38] root: DEBUG: 4.0 M consensus.fasta
[2023-05-16 03:18:38] root: DEBUG: 00-assembly/
[2023-05-16 03:18:38] root: DEBUG: 97.0 B draft_assembly.fasta.fai
[2023-05-16 03:18:38] root: DEBUG: 4.0 M draft_assembly.fasta
[2023-05-16 03:18:38] root: DEBUG: 20-repeat/
[2023-05-16 03:18:38] root: DEBUG: 4.0 K repeat_graph_dump
[2023-05-16 03:18:38] root: DEBUG: 4.0 M repeat_graph_edges.fasta
[2023-05-16 03:18:38] root: DEBUG: 1.0 K graph_before_rr.gv
[2023-05-16 03:18:38] root: DEBUG: 2.0 K graph_after_rr.gv
[2023-05-16 03:18:38] root: DEBUG: 4.0 M graph_before_rr.fasta
[2023-05-16 03:18:38] root: DEBUG: 5.0 M read_alignment_dump
[2023-05-16 03:18:38] root: DEBUG: --------------------------
[2023-05-16 03:18:38] root: INFO: Assembly statistics:
Total length: 4840225
Fragments: 3
Fragments N50: 4757419
Largest frg: 4757419
Scaffolds: 0
Mean coverage: 56
[2023-05-16 03:18:38] root: INFO: Final assembly: /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/assembly.fasta
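Since this thread is about duplicate runs giving slightly different assemblies, here is a minimal sketch for pulling the "Assembly statistics" block out of two flye.log files and listing the values that differ. It is not part of Flye: the helper names `parse_assembly_stats` and `diff_stats` are made up for illustration, and the parser assumes the statistics are printed as `Name: integer` lines as in the log above.

```python
import re

# Matches lines such as "Total length: 4840225" or "\tFragments N50:\t4757419".
STAT_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9 ]*?):\s+(\d+)\s*$")


def parse_assembly_stats(log_text):
    """Return the integer statistics that follow 'Assembly statistics:' in a Flye log.

    Hypothetical helper: assumes each statistic is on its own 'Name: integer' line.
    """
    stats = {}
    in_block = False
    for line in log_text.splitlines():
        if "Assembly statistics:" in line:
            in_block = True
            stats = {}  # keep only the last block if the log holds several runs
            continue
        if in_block:
            match = STAT_LINE.match(line)
            if match:
                stats[match.group(1).strip()] = int(match.group(2))
            elif line.strip():
                in_block = False  # first non-empty, non-statistic line ends the block
    return stats


def diff_stats(a, b):
    """Return {name: (value_a, value_b)} for every statistic that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }
```

To use it, read each run's flye.log into a string, call `parse_assembly_stats` on both, and pass the two results to `diff_stats`. An empty result means the summary statistics agree. It does not show that the sequences are identical, so comparing checksums of assembly.fasta is still the stricter test.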