Flye assembly produces differing output on the same files.

Open KasperThystrup opened this issue 2 years ago • 11 comments

Hi, thanks for your great work on enabling de novo assembly of ONT data!

I have been working on a Snakemake pipeline for assembling and polishing bacterial data, and in this regard I have stumbled upon a strange observation:

The final assembly sizes would sometimes differ between duplicate runs.

I did some detective work on the pipeline and traced the differences to the output of the Flye assembler.

Here are my findings from running wc on the assembly.fasta files:


Sample	Newlines	Words	Characters	Setting	Preprocessing
Barcode01	92271	92271	5627747	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode01	92271	92271	5627741	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode01	93724	93724	5716532	--nano-raw	Q12-porechop
Barcode01	93724	93724	5716562	--nano-raw	Q12-porechop
Barcode01	92460	92460	5639810	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode01	92464	92464	5639982	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode01	92461	92461	5639812	--nano-corr	Q12-porechop, necat_correction_4iterations
Barcode01	92462	92462	5639890	--nano-corr	Q12-porechop, necat_correction_4iterations

Barcode02	100512	100512	6130631	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode02	100512	100512	6130622	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode02	100832	100832	6149873	--nano-raw	Q12-porechop
Barcode02	100830	100830	6149840	--nano-raw	Q12-porechop
Barcode02	100839	100839	6150686	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode02	100676	100676	6140754	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode02	100568	100568	6134214	--nano-corr	Q12-porechop, necat_correction_4iterations
Barcode02	100582	100582	6135040	--nano-corr	Q12-porechop, necat_correction_4iterations

Barcode03	99624	99624	6076394	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode03	99624	99624	6076483	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode03	100240	100240	6114074	--nano-raw	Q12-porechop
Barcode03	100240	100240	6114088	--nano-raw	Q12-porechop
Barcode03	100579	100579	6134668	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode03	100255	100255	6114959	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode03	101382	101382	6183694	--nano-corr	Q12-porechop, necat_correction_4iterations
Barcode03	101213	101213	6173408	--nano-corr	Q12-porechop, necat_correction_4iterations

Barcode04	94869	94869	5786459	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode04	94869	94869	5786459	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode04	95032	95032	5796730	--nano-raw	Q12-porechop
Barcode04	95032	95032	5796730	--nano-raw	Q12-porechop
Barcode04	94958	94958	5792259	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode04	94958	94958	5792259	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode04	95031	95031	5796696	--nano-corr	Q12-porechop, necat_correction_4iterations
Barcode04	95031	95031	5796706	--nano-corr	Q12-porechop, necat_correction_4iterations

Barcode05	98977	98977	6037416	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode05	98977	98977	6037416	--nano-raw	Q12-porechop, necat_correction_4iterations
Barcode05	98980	98980	6037556	--nano-raw	Q12-porechop
Barcode05	98980	98980	6037532	--nano-raw	Q12-porechop
Barcode05	98775	98775	6025097	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode05	98775	98775	6025096	--nano-corr	Q12-porechop, necat_correction_3iterations
Barcode05	98979	98979	6037543	--nano-corr	Q12-porechop, necat_correction_4iterations
Barcode05	98979	98979	6037543	--nano-corr	Q12-porechop, necat_correction_4iterations

It seems that file sizes differ the most when using corrected reads (--nano-corr); however, we can also observe small differences when assuming raw nanopore reads (--nano-raw). (And yes, I have confirmed that the NECAT correction itself did not yield different numbers of lines, words, or characters!)
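
A minimal sketch of how such a comparison could be scripted, assuming the two assembly.fasta files from duplicate runs are available locally. The helper names (`wc_counts`, `sha256_of`, `compare_runs`) are made up for this illustration. The checksum is included because equal line, word, and byte counts do not by themselves prove that two files are identical.

```python
import hashlib
from pathlib import Path


def wc_counts(data: bytes) -> tuple:
    """Return (newlines, words, bytes), mirroring the default columns of `wc`."""
    return data.count(b"\n"), len(data.split()), len(data)


def sha256_of(data: bytes) -> str:
    """Checksum of the full file content, including headers and line wrapping."""
    return hashlib.sha256(data).hexdigest()


def compare_runs(path_a: str, path_b: str) -> dict:
    """Compare two assembly.fasta files produced by duplicate runs."""
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    return {
        "wc_a": wc_counts(a),
        "wc_b": wc_counts(b),
        "same_counts": wc_counts(a) == wc_counts(b),
        "same_content": sha256_of(a) == sha256_of(b),
    }
```

Calling `compare_runs("run1/assembly.fasta", "run2/assembly.fasta")` (hypothetical paths) would then report both the wc-style counts and whether the files are byte-identical.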

I'm using Flye 2.9 (installed by Snakemake through conda)

KasperThystrup · Jun 08 '22 12:06

I'm just a random guy who happened to notice this, not a developer, but it is the expected behavior that the assemblies are ever-so-slightly different from run to run on the same files b/c there is at least one step that is not deterministic. Reading the Flye paper and understanding that step will likely help you understand what you're seeing here.

JohnUrban · Jun 08 '22 13:06

@JohnUrban exactly, this is because of indeterminism that is difficult to get rid of.

Also see https://github.com/fenderglass/Flye/issues/298

mikolmogorov · Jun 08 '22 14:06

@JohnUrban and @fenderglass Thank you both for the clarifications. I have been looking into the 2019 paper and reading through the referenced issues (also #244), where I found this:

"There is a multiprocessing stage (disjointig generation) where the results slightly depend on the thread scheduling by the OS (e.g. which thread finishes first)."

Do I understand correctly that the randomness is purely associated with multithreading?

If that is the case, then hypothetically, would the output then be identical when running only with a single core?
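
To illustrate the general effect being discussed (this is a toy example, not Flye's code), the sketch below runs the same tasks on a thread pool and records the order in which they finish. The function `run` and the sleep-based "work" are stand-ins invented for the illustration. With several workers the finishing order depends on timing and scheduling, whereas with a single worker tasks are processed one at a time in submission order.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run(tasks, workers):
    """Run tasks on a thread pool and record the order in which they finish."""
    finished = []
    lock = threading.Lock()

    def work(task):
        task_id, delay = task
        time.sleep(delay)  # stand-in for a variable amount of work
        with lock:
            finished.append(task_id)  # shared state updated in completion order
        return task_id

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in submission order, whatever finishes first.
        in_submission_order = list(pool.map(work, tasks))
    return finished, in_submission_order


tasks = [(0, 0.2), (1, 0.1), (2, 0.0)]
multi_finished, multi_ordered = run(tasks, workers=3)
single_finished, single_ordered = run(tasks, workers=1)

print("3 workers, completion order:", multi_finished)
print("1 worker,  completion order:", single_finished)
```

Any result that is built up in completion order (like `finished` here) can change from run to run with multiple workers, which is the kind of dependence on "which thread finishes first" that the quoted explanation describes.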

KasperThystrup · Jun 13 '22 07:06

@KasperThystrup yes, it is due to multithreading. I think theoretically the output should be identical with a single core. But I have not tested this.

mikolmogorov · Jun 14 '22 14:06

@KasperThystrup Hey let us know what you find with a single core if you go that route.

JohnUrban · Jun 16 '22 17:06

@fenderglass & @JohnUrban I expect to set up a few experimental runs with single cores later today. I expect to report back sometime next week :-)

KasperThystrup · Jun 17 '22 05:06

So... rerunning with only a single core confirmed our suspicion: yes, file sizes, line counts, and character counts are completely identical between two runs.
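
A rough sketch of how such a single-core check could be automated, assuming `flye` is on the PATH and using only the options that appear in this thread (`--nano-raw`, `--out-dir`, `--threads`). The function names and the run1/run2 directory layout are made up for the illustration.

```python
import filecmp
import subprocess
from pathlib import Path


def flye_command(reads, out_dir, threads=1, read_type="--nano-raw"):
    """Build a Flye command line from options mentioned in this thread."""
    return ["flye", read_type, str(reads),
            "--out-dir", str(out_dir),
            "--threads", str(threads)]


def assemble(reads, out_dir, threads=1):
    """Run Flye and return the path of the resulting assembly.fasta."""
    subprocess.run(flye_command(reads, out_dir, threads), check=True)
    return Path(out_dir) / "assembly.fasta"


def is_reproducible(reads, work_dir, threads=1):
    """Assemble the same reads twice and compare the outputs byte by byte."""
    first = assemble(reads, Path(work_dir) / "run1", threads)
    second = assemble(reads, Path(work_dir) / "run2", threads)
    return filecmp.cmp(first, second, shallow=False)
```

A byte-by-byte comparison is stricter than comparing wc counts, so it is a reasonable final check once the counts already agree.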

KasperThystrup · Jun 20 '22 12:06

This raises the question: is it possible and feasible to set up the disjointig step for single-core processing while keeping the rest of the assembly work in "multiprocessing mode"?

KasperThystrup · Jun 20 '22 12:06

@KasperThystrup yes, you can edit line 44 in flye/assembly/assemble.py
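
As a rough illustration of what such a change amounts to (this is not the actual content of flye/assembly/assemble.py, which is not shown in this thread), the command for the disjointig stage could be built with its thread count forced to 1 while other stages keep the requested count. The function name `assemble_command` and its `deterministic` argument are hypothetical; the command-line options mirror the `flye-modules assemble` call that Flye prints in its debug log.

```python
def assemble_command(reads, out_asm, config, log, threads, min_ovlp,
                     deterministic=False):
    """Build a `flye-modules assemble` call, optionally single-threaded.

    `deterministic` is a hypothetical switch for this sketch: when set,
    only this stage runs with one thread.
    """
    stage_threads = 1 if deterministic else threads
    return [
        "flye-modules", "assemble",
        "--reads", reads,
        "--out-asm", out_asm,
        "--config", config,
        "--log", log,
        "--threads", str(stage_threads),
        "--min-ovlp", str(min_ovlp),
    ]
```

The design choice here is to override the thread count only where the command for this one stage is assembled, so the rest of the pipeline is unaffected.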

mikolmogorov · Jun 23 '22 18:06

@fenderglass Thanks for the elaboration. Would you be interested in an implementation of an(other) optional argument to run this step on a single core only?

Otherwise I will close this issue.

KasperThystrup · Jul 01 '22 09:07

@KasperThystrup sure, please do a pull request then!

mikolmogorov · Jul 08 '22 19:07

Hi @fenderglass,

I've been doing some testing of Flye with the same input read sets and different thread counts. It goes without saying Flye is amazing and thanks for making & maintaining it!

After learning of the --deterministic flag and applying it, I assumed the output would be deterministic, but it wasn't.

Based on what I can see, the first sign of non-determinism occurs at the Initial divergence estimate step (I've pasted the log files below).

I'm going to guess that rand() in line 764 [here](https://github.com/fenderglass/Flye/blob/e156df433cc5d66bfc3663db716d8c1abc1969b6/src/sequence/overlap.cpp#LL764C4-L764C4) is the source of the non-determinism, but I don't know C++.

Just putting this out there for anyone who has come across this :)

Edit

There are very small differences in the Initial divergence estimate step even when I run the same command twice with 1 thread as well.

George
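
One way to locate the first point where two runs diverge is to compare their flye.log files after removing the timestamps. The sketch below is a minimal version of that idea: `normalize` and `first_divergence` are names invented here, and the list of ignored substrings (command lines, config paths, RAM and build-date lines) is an assumption about which lines differ for uninteresting reasons.

```python
import re

# Matches the leading "[YYYY-MM-DD HH:MM:SS] " timestamp of a log line.
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\] ")

# Substrings of lines expected to differ between runs for trivial reasons.
IGNORED = ("Cmd:", "Running:", "Loading ", "RAM", "Build date")


def normalize(line):
    """Drop the timestamp; return None for lines that should be skipped."""
    text = TIMESTAMP.sub("", line.rstrip("\n"))
    if any(token in text for token in IGNORED):
        return None
    return text


def first_divergence(log_a, log_b):
    """Return (index, line_a, line_b) for the first differing line, or None."""
    lines_a = [t for t in map(normalize, log_a) if t is not None]
    lines_b = [t for t in map(normalize, log_b) if t is not None]
    for index, (a, b) in enumerate(zip(lines_a, lines_b)):
        if a != b:
            return index, a, b
    return None
```

Both arguments are iterables of log lines, for example two open file handles. Lines that are skipped do not count towards the returned index.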

1 Thread

[2023-05-16 01:58:53] root: INFO: Starting Flye 2.9.2-b1786
[2023-05-16 01:58:53] root: DEBUG: Cmd: /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/bin/flye --nano-raw ../real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir ../real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF --threads 1 --deterministic
[2023-05-16 01:58:53] root: DEBUG: Python version: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) 
[GCC 7.5.0]
[2023-05-16 01:58:53] root: INFO: >>>STAGE: configure
[2023-05-16 01:58:53] root: INFO: Configuring run
[2023-05-16 01:58:58] root: INFO: Total read length: 278879070
[2023-05-16 01:58:58] root: INFO: Reads N50/N90: 16438 / 4092
[2023-05-16 01:58:58] root: INFO: Minimum overlap set to 4000
[2023-05-16 01:58:58] root: INFO: >>>STAGE: assembly
[2023-05-16 01:58:58] root: INFO: Assembling disjointigs
[2023-05-16 01:58:58] root: DEBUG: -----Begin assembly log------
[2023-05-16 01:58:58] root: DEBUG: Running: flye-modules assemble --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-asm /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/00-assembly/draft_assembly.fasta --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/flye.log --threads 1 --min-ovlp 4000
[2023-05-16 01:58:58] DEBUG: Build date: May 13 2023 06:30:35
[2023-05-16 01:58:58] DEBUG: Total RAM: 62 Gb
[2023-05-16 01:58:58] DEBUG: Available RAM: 28 Gb
[2023-05-16 01:58:58] DEBUG: Total CPUs: 16
[2023-05-16 01:58:58] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 01:58:58] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 01:58:58] DEBUG: 	big_genome_threshold=29000000
[2023-05-16 01:58:58] DEBUG: 	meta_read_filter_kmer_freq=100
[2023-05-16 01:58:58] DEBUG: 	chain_large_gap_penalty=2
[2023-05-16 01:58:58] DEBUG: 	chain_small_gap_penalty=0.5
[2023-05-16 01:58:58] DEBUG: 	chain_gap_jump_threshold=100
[2023-05-16 01:58:58] DEBUG: 	max_coverage_drop_rate=5
[2023-05-16 01:58:58] DEBUG: 	max_extensions_drop_rate=5
[2023-05-16 01:58:58] DEBUG: 	chimera_window=100
[2023-05-16 01:58:58] DEBUG: 	chimera_overhang=1000
[2023-05-16 01:58:58] DEBUG: 	min_reads_in_disjointig=4
[2023-05-16 01:58:58] DEBUG: 	max_inner_reads=10
[2023-05-16 01:58:58] DEBUG: 	max_inner_fraction=0.25
[2023-05-16 01:58:58] DEBUG: 	max_separation=500
[2023-05-16 01:58:58] DEBUG: 	unique_edge_length=50000
[2023-05-16 01:58:58] DEBUG: 	min_repeat_res_support=0.51
[2023-05-16 01:58:58] DEBUG: 	out_paths_ratio=5
[2023-05-16 01:58:58] DEBUG: 	graph_cov_drop_rate=5
[2023-05-16 01:58:58] DEBUG: 	coverage_estimate_window=100
[2023-05-16 01:58:58] DEBUG: 	max_bubble_length=50000
[2023-05-16 01:58:58] DEBUG: 	loop_coverage_rate=1.5
[2023-05-16 01:58:58] DEBUG: 	repeat_edge_cov_mult=1.75
[2023-05-16 01:58:58] DEBUG: 	weak_detach_rate=5
[2023-05-16 01:58:58] DEBUG: 	tip_coverage_rate=2
[2023-05-16 01:58:58] DEBUG: 	tip_length_rate=2
[2023-05-16 01:58:58] DEBUG: 	output_gfa_before_rr=0
[2023-05-16 01:58:58] DEBUG: 	remove_alt_edges=0
[2023-05-16 01:58:58] DEBUG: 	low_cutoff_warning=1
[2023-05-16 01:58:58] DEBUG: 	kmer_size=17
[2023-05-16 01:58:58] DEBUG: 	use_minimizers=0
[2023-05-16 01:58:58] DEBUG: 	reads_base_alignment=0
[2023-05-16 01:58:58] DEBUG: 	meta_read_top_kmer_rate=0.40
[2023-05-16 01:58:58] DEBUG: 	maximum_jump=1500
[2023-05-16 01:58:58] DEBUG: 	maximum_overhang=1500
[2023-05-16 01:58:58] DEBUG: 	repeat_kmer_rate=100
[2023-05-16 01:58:58] DEBUG: 	assemble_ovlp_divergence=0.10
[2023-05-16 01:58:58] DEBUG: 	assemble_divergence_relative=1
[2023-05-16 01:58:58] DEBUG: 	repeat_graph_ovlp_divergence=0.08
[2023-05-16 01:58:58] DEBUG: 	read_align_ovlp_divergence=0.25
[2023-05-16 01:58:58] DEBUG: 	hpc_scoring_on=0
[2023-05-16 01:58:58] DEBUG: 	add_unassembled_reads=0
[2023-05-16 01:58:58] DEBUG: 	extend_contigs_with_repeats=0
[2023-05-16 01:58:58] DEBUG: 	min_read_cov_cutoff=3
[2023-05-16 01:58:58] DEBUG: 	short_tip_length=20000
[2023-05-16 01:58:58] DEBUG: 	long_tip_length=100000
[2023-05-16 01:58:58] DEBUG: Running with k-mer size: 17
[2023-05-16 01:58:58] DEBUG: Running with minimum overlap 4000
[2023-05-16 01:58:58] DEBUG: Metagenome mode: N
[2023-05-16 01:58:58] DEBUG: Short mode: N
[2023-05-16 01:58:58] INFO: Reading sequences
[2023-05-16 01:59:02] DEBUG: Building positional index
[2023-05-16 01:59:02] DEBUG: Total sequence: 251729388 bp
[2023-05-16 01:59:04] INFO: Counting k-mers:
[2023-05-16 02:00:10] DEBUG: Updating k-mer histogram
[2023-05-16 02:01:05] DEBUG: Hash size: 4622497
[2023-05-16 02:01:05] DEBUG: Total k-mers 88307420
[2023-05-16 02:01:05] INFO: Filling index table (1/2)
[2023-05-16 02:04:03] DEBUG: Mean k-mer frequency: 21.8363
[2023-05-16 02:04:03] DEBUG: Repetitive k-mer frequency: 2183
[2023-05-16 02:04:03] DEBUG: Filtered 18624 repetitive k-mers (0.000182569)
[2023-05-16 02:04:04] INFO: Filling index table (2/2)
[2023-05-16 02:08:04] DEBUG: Sorting k-mer index
[2023-05-16 02:08:05] DEBUG: Selected k-mers: 6031079
[2023-05-16 02:08:05] DEBUG: Index size: 103351604
[2023-05-16 02:08:05] DEBUG: Mean k-mer index frequency: 17.1365
[2023-05-16 02:08:05] DEBUG: Peak RAM usage: 8 Gb
[2023-05-16 02:08:05] DEBUG: Estimating k-mer identity bias
[2023-05-16 02:09:10] DEBUG: Initial divergence estimate : 0.0723839
[2023-05-16 02:09:10] DEBUG: Relative threshold: Y
[2023-05-16 02:09:10] DEBUG: Max divergence threshold set to 0.172384
[2023-05-16 02:09:10] INFO: Extending reads
[2023-05-16 02:09:10] DEBUG: Estimating overlap coverage
[2023-05-16 02:10:06] INFO: Overlap-based coverage: 48
[2023-05-16 02:10:06] INFO: Median overlap divergence: 0.0722228
[2023-05-16 02:10:06] DEBUG: Sequence divergence distribution: 

    |             *                    |                                                                 
    |             *                    |                                                                 
    |             *                    |                                                                 
    |             **                   |                                                                 
    |             **                   |                                                                 
    |             **                   |                                                                 
    |            ***                   |                                                                 
    |            ****                  |                                                                 
    |            ****                  |                                                                 
    |            *****                 |                                                                 
    |            *****                 |                                                                 
    |            *****                 |                                                                 
    |           *******                |                                                                 
    |           *******                |                                                                 
    |          ********                |                                                                 
    |          *********               |                                                                 
    |          **********              |                                                                 
    |         *************            |                                                                 
    |        ***************           |                                                                 
    |      *************************** ****  ********** * *** * **                                       
    ----------------------------------------------------------------------------------------------------
    0%        5%        10%       15%       20%       25%       30%       35%       40%       45%       

    Q25 = 0.065, Q50 = 0.072, Q75 = 0.084

[2023-05-16 02:11:18] DEBUG: Assembled disjointig 1
	With 362 reads
	Start read: +d060018a-55a4-4b68-8293-21d683cfdd94
	At position: 360
	leftTip: 0 rightTip: 0
	Suspicious: 2
	Short ext: 2
	Mean extensions: 39
	Avg overlap len: 38689
	Min overlap len: 1451
	Inner reads: 0
	Length: 4810091
[2023-05-16 02:11:18] DEBUG: Inner: 32900 covered: 32972 total: 35900
[2023-05-16 02:11:27] DEBUG: Assembled disjointig 2
	With 16 reads
	Start read: +7f476f63-9362-4273-aa0a-f6654120414b
	At position: 5
	leftTip: 1 rightTip: 0
	Suspicious: 0
	Short ext: 0
	Mean extensions: 61
	Avg overlap len: 56938
	Min overlap len: 6090
	Inner reads: 0
	Length: 127874
[2023-05-16 02:11:27] DEBUG: Inner: 33808 covered: 33888 total: 35900
[2023-05-16 02:11:29] DEBUG: Assembled disjointig 3
	With 19 reads
	Start read: +aab73044-040d-495f-b66e-700e2731c431
	At position: 6
	leftTip: 0 rightTip: 0
	Suspicious: 0
	Short ext: 0
	Mean extensions: 68
	Avg overlap len: 7518
	Min overlap len: 4400
	Inner reads: 0
	Length: 40601
[2023-05-16 02:11:29] DEBUG: Inner: 34308 covered: 34388 total: 35900
[2023-05-16 02:11:53] INFO: Assembled 3 disjointigs
[2023-05-16 02:11:54] INFO: Generating sequence
[2023-05-16 02:12:02] DEBUG: Building positional index
[2023-05-16 02:12:02] DEBUG: Total sequence: 4978787 bp
[2023-05-16 02:12:05] DEBUG: Mean k-mer frequency: 1.03604
[2023-05-16 02:12:05] DEBUG: Repetitive k-mer frequency: 103
[2023-05-16 02:12:05] DEBUG: Filtered 0 repetitive k-mers (0)
[2023-05-16 02:12:07] DEBUG: Sorting k-mer index
[2023-05-16 02:12:07] DEBUG: Selected k-mers: 4805532
[2023-05-16 02:12:07] DEBUG: K-mer index size: 4978736
[2023-05-16 02:12:07] DEBUG: Mean k-mer frequency: 1.03604
[2023-05-16 02:12:07] DEBUG: Minimizer rate: 1.00001
[2023-05-16 02:12:07] INFO: Filtering contained disjointigs
[2023-05-16 02:12:11] DEBUG: Computing transitive closure for overlaps
[2023-05-16 02:12:11] DEBUG: Found 12 overlaps
[2023-05-16 02:12:11] DEBUG: Left 12 overlaps after filtering
[2023-05-16 02:12:11] INFO: Contained seqs: 0
[2023-05-16 02:12:11] DEBUG: Writing FASTA
[2023-05-16 02:12:11] DEBUG: Peak RAM usage: 8 Gb
-----------End assembly log------------
[2023-05-16 02:12:12] root: DEBUG: Disjointigs length: 4978787, N50: 4809821
[2023-05-16 02:12:12] root: INFO: >>>STAGE: consensus
[2023-05-16 02:12:12] root: INFO: Running Minimap2
[2023-05-16 02:14:00] root: INFO: Computing consensus
[2023-05-16 02:17:19] root: INFO: Alignment error rate: 0.105022
[2023-05-16 02:17:19] root: INFO: >>>STAGE: repeat
[2023-05-16 02:17:19] root: INFO: Building and resolving repeat graph
[2023-05-16 02:17:19] root: DEBUG: -----Begin repeat analyser log------
[2023-05-16 02:17:19] root: DEBUG: Running: flye-modules repeat --disjointigs /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/10-consensus/consensus.fasta --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/20-repeat --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/flye.log --threads 1 --min-ovlp 4000
[2023-05-16 02:17:19] DEBUG: Build date: May 13 2023 06:30:59
[2023-05-16 02:17:19] DEBUG: Total RAM: 62 Gb
[2023-05-16 02:17:19] DEBUG: Available RAM: 56 Gb
[2023-05-16 02:17:19] DEBUG: Total CPUs: 16
[2023-05-16 02:17:19] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 02:17:19] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 02:17:19] DEBUG: 	big_genome_threshold=29000000
[2023-05-16 02:17:19] DEBUG: 	meta_read_filter_kmer_freq=100
[2023-05-16 02:17:19] DEBUG: 	chain_large_gap_penalty=2
[2023-05-16 02:17:19] DEBUG: 	chain_small_gap_penalty=0.5
[2023-05-16 02:17:19] DEBUG: 	chain_gap_jump_threshold=100
[2023-05-16 02:17:19] DEBUG: 	max_coverage_drop_rate=5
[2023-05-16 02:17:19] DEBUG: 	max_extensions_drop_rate=5
[2023-05-16 02:17:19] DEBUG: 	chimera_window=100
[2023-05-16 02:17:19] DEBUG: 	chimera_overhang=1000
[2023-05-16 02:17:19] DEBUG: 	min_reads_in_disjointig=4
[2023-05-16 02:17:19] DEBUG: 	max_inner_reads=10
[2023-05-16 02:17:19] DEBUG: 	max_inner_fraction=0.25
[2023-05-16 02:17:19] DEBUG: 	max_separation=500
[2023-05-16 02:17:19] DEBUG: 	unique_edge_length=50000
[2023-05-16 02:17:19] DEBUG: 	min_repeat_res_support=0.51
[2023-05-16 02:17:19] DEBUG: 	out_paths_ratio=5
[2023-05-16 02:17:19] DEBUG: 	graph_cov_drop_rate=5
[2023-05-16 02:17:19] DEBUG: 	coverage_estimate_window=100
[2023-05-16 02:17:19] DEBUG: 	max_bubble_length=50000
[2023-05-16 02:17:19] DEBUG: 	loop_coverage_rate=1.5
[2023-05-16 02:17:19] DEBUG: 	repeat_edge_cov_mult=1.75
[2023-05-16 02:17:19] DEBUG: 	weak_detach_rate=5
[2023-05-16 02:17:19] DEBUG: 	tip_coverage_rate=2
[2023-05-16 02:17:19] DEBUG: 	tip_length_rate=2
[2023-05-16 02:17:19] DEBUG: 	output_gfa_before_rr=0
[2023-05-16 02:17:19] DEBUG: 	remove_alt_edges=0
[2023-05-16 02:17:19] DEBUG: 	low_cutoff_warning=1
[2023-05-16 02:17:19] DEBUG: 	kmer_size=17
[2023-05-16 02:17:19] DEBUG: 	use_minimizers=0
[2023-05-16 02:17:19] DEBUG: 	reads_base_alignment=0
[2023-05-16 02:17:19] DEBUG: 	meta_read_top_kmer_rate=0.40
[2023-05-16 02:17:19] DEBUG: 	maximum_jump=1500
[2023-05-16 02:17:19] DEBUG: 	maximum_overhang=1500
[2023-05-16 02:17:19] DEBUG: 	repeat_kmer_rate=100
[2023-05-16 02:17:19] DEBUG: 	assemble_ovlp_divergence=0.10
[2023-05-16 02:17:19] DEBUG: 	assemble_divergence_relative=1
[2023-05-16 02:17:19] DEBUG: 	repeat_graph_ovlp_divergence=0.08
[2023-05-16 02:17:19] DEBUG: 	read_align_ovlp_divergence=0.25
[2023-05-16 02:17:19] DEBUG: 	hpc_scoring_on=0
[2023-05-16 02:17:19] DEBUG: 	add_unassembled_reads=0
[2023-05-16 02:17:19] DEBUG: 	extend_contigs_with_repeats=0
[2023-05-16 02:17:19] DEBUG: 	min_read_cov_cutoff=3
[2023-05-16 02:17:19] DEBUG: 	short_tip_length=20000
[2023-05-16 02:17:19] DEBUG: 	long_tip_length=100000
[2023-05-16 02:17:19] DEBUG: Running with k-mer size: 17
[2023-05-16 02:17:19] DEBUG: Selected minimum overlap 4000
[2023-05-16 02:17:19] DEBUG: Metagenome mode: N
[2023-05-16 02:17:19] INFO: Parsing disjointigs
[2023-05-16 02:17:19] DEBUG: Building positional index
[2023-05-16 02:17:19] DEBUG: Total sequence: 5027082 bp
[2023-05-16 02:17:19] INFO: Building repeat graph
[2023-05-16 02:17:22] DEBUG: Mean k-mer frequency: 1.05981
[2023-05-16 02:17:22] DEBUG: Repetitive k-mer frequency: 105
[2023-05-16 02:17:22] DEBUG: Filtered 271 repetitive k-mers (5.39086e-05)
[2023-05-16 02:17:24] DEBUG: Sorting k-mer index
[2023-05-16 02:17:24] DEBUG: Selected k-mers: 4743325
[2023-05-16 02:17:24] DEBUG: K-mer index size: 5026760
[2023-05-16 02:17:24] DEBUG: Mean k-mer frequency: 1.05975
[2023-05-16 02:17:24] DEBUG: Minimizer rate: 1.00006
[2023-05-16 02:17:28] DEBUG: Computing transitive closure for overlaps
[2023-05-16 02:17:28] DEBUG: Found 240 overlaps
[2023-05-16 02:17:28] DEBUG: Left 100 overlaps after filtering
[2023-05-16 02:17:28] INFO: Median overlap divergence: 0.0245107
[2023-05-16 02:17:28] DEBUG: Sequence divergence distribution: 

    |*               |                                                                                   
    |*               |                                                                                   
    |*               |                                                                                   
    |*               |                                                                                   
    |*               |                                                                                   
    |*         *     |                                                                                   
    |*         *     |                                                                                   
    |*         *     |                                                                                   
    |*         *     |                                                                                   
    |*         *     |                                                                                   
    |**  *  *  *     |                             *                                                     
    |**  *  *  *     |                             *                                                     
    |**  *  *  *     |                             *                                                     
    |**  *  *  *     |                             *                                                     
    |**  *  *  *     |                             *                                                     
    |** **  * **     |                             *                                                     
    |** **  * **     |                             *                                                     
    |** **  * **     |                             *                                                     
    |** **  * **     |                             *                                                     
    |** **  * **     |                             *                                                     
    ----------------------------------------------------------------------------------------------------
    0%        5%        10%       15%       20%       25%       30%       35%       40%       45%       

    Q25 = 0.0098, Q50 = 0.025, Q75 = 0.052

[2023-05-16 02:17:28] DEBUG: Computing gluepoints
[2023-05-16 02:17:28] DEBUG: Added 0 gluepoint projections
[2023-05-16 02:17:28] DEBUG: Created 62 gluepoints
[2023-05-16 02:17:28] DEBUG: Artificial loops removed: 0 left, 0 right, 0 both
[2023-05-16 02:17:28] DEBUG: Initializing edges
[2023-05-16 02:17:28] DEBUG: Edges length checksum: 18446744071951779876
[2023-05-16 02:17:28] DEBUG: Filtered 0 singleton segments
[2023-05-16 02:17:28] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 02:17:28] DEBUG: Collapsed 2 edges
[2023-05-16 02:17:28] DEBUG: *	18	+disjointig_1	105063	513261	408198	
[2023-05-16 02:17:28] DEBUG:  	4	+disjointig_1	513261	518393	5132	
[2023-05-16 02:17:28] DEBUG: *	-8	+disjointig_1	518393	551279	32886	
[2023-05-16 02:17:28] DEBUG:  	4	+disjointig_1	551279	556589	5310	
[2023-05-16 02:17:28] DEBUG: *	-7	+disjointig_1	556589	681739	125150	
[2023-05-16 02:17:28] DEBUG:  	4	+disjointig_1	681739	687055	5316	
[2023-05-16 02:17:28] DEBUG: *	-6	+disjointig_1	687055	771757	84702	
[2023-05-16 02:17:28] DEBUG:  	4	+disjointig_1	771757	776907	5150	
[2023-05-16 02:17:28] DEBUG: *	11	+disjointig_1	776907	1262839	485932	
[2023-05-16 02:17:28] DEBUG:  	-4	+disjointig_1	1262839	1268131	5292	
[2023-05-16 02:17:28] DEBUG: *	5	+disjointig_1	1268131	1942573	674442	
[2023-05-16 02:17:28] DEBUG:  	-4	+disjointig_1	1942573	1947985	5412	
[2023-05-16 02:17:28] DEBUG: *	9	+disjointig_1	1947985	2685042	737057	
[2023-05-16 02:17:28] DEBUG:  	12	+disjointig_1	2685042	2713831	28789	
[2023-05-16 02:17:28] DEBUG: *	-13	+disjointig_1	2713831	2991198	277367	
[2023-05-16 02:17:28] DEBUG:  	12	+disjointig_1	2991198	3019982	28784	
[2023-05-16 02:17:28] DEBUG: *	-10	+disjointig_1	3019982	4550446	1530464	
[2023-05-16 02:17:28] DEBUG:  	4	+disjointig_1	4550446	4555843	5397	
[2023-05-16 02:17:28] DEBUG: *	17	+disjointig_1	4555843	4856779	300936	
[2023-05-16 02:17:28] DEBUG:  	14	+disjointig_2	0	129638	129638	
[2023-05-16 02:17:28] DEBUG:  	19	+disjointig_3	3683	40645	36962	
[2023-05-16 02:17:28] DEBUG: Total edges: 19
[2023-05-16 02:17:28] INFO: Parsing reads
[2023-05-16 02:17:31] DEBUG: Building positional index
[2023-05-16 02:17:31] DEBUG: Total sequence: 278879070 bp
[2023-05-16 02:17:31] DEBUG: Building positional index
[2023-05-16 02:17:31] DEBUG: Total sequence: 4918316 bp
[2023-05-16 02:17:31] INFO: Aligning reads to the graph
[2023-05-16 02:17:34] DEBUG: Mean k-mer frequency: 1.03885
[2023-05-16 02:17:34] DEBUG: Repetitive k-mer frequency: 103
[2023-05-16 02:17:34] DEBUG: Filtered 262 repetitive k-mers (5.32745e-05)
[2023-05-16 02:17:36] DEBUG: Sorting k-mer index
[2023-05-16 02:17:36] DEBUG: Selected k-mers: 4733997
[2023-05-16 02:17:36] DEBUG: K-mer index size: 4917663
[2023-05-16 02:17:36] DEBUG: Mean k-mer frequency: 1.0388
[2023-05-16 02:17:36] DEBUG: Minimizer rate: 1.00013
[2023-05-16 02:20:31] DEBUG: Total reads : 17950
[2023-05-16 02:20:31] DEBUG: Read with aligned parts : 17458
[2023-05-16 02:20:31] DEBUG: Aligned in one piece : 17383
[2023-05-16 02:20:31] INFO: Aligned read sequence: 243817236 / 251729388 (0.968569)
[2023-05-16 02:20:31] INFO: Median overlap divergence: 0.0275226
[2023-05-16 02:20:31] DEBUG: Sequence divergence distribution: 

    |    *                                             |                                                 
    |    *                                             |                                                 
    |    *                                             |                                                 
    |    *                                             |                                                 
    |    **                                            |                                                 
    |    **                                            |                                                 
    |    **                                            |                                                 
    |    **                                            |                                                 
    |   ***                                            |                                                 
    |   ****                                           |                                                 
    |   ****                                           |                                                 
    |   ****                                           |                                                 
    |   *****                                          |                                                 
    |   *****                                          |                                                 
    |   *****                                          |                                                 
    |   ******                                         |                                                 
    |   *******                                        |                                                 
    |   ********                                       |                                                 
    |   **********                                     |                                                 
    |  ******************** *********************************  *    *                                    
    ----------------------------------------------------------------------------------------------------
    0%        5%        10%       15%       20%       25%       30%       35%       40%       45%       

    Q25 = 0.022, Q50 = 0.028, Q75 = 0.037

[2023-05-16 02:20:31] INFO: Mean edge coverage: 51
[2023-05-16 02:20:31] DEBUG: 4	len:5287	cov:398	mult:7.80392
[2023-05-16 02:20:31] DEBUG: -4	len:5287	cov:398	mult:7.80392
[2023-05-16 02:20:31] DEBUG: 5	len:674442	cov:50	mult:0.980392
[2023-05-16 02:20:31] DEBUG: -5	len:674442	cov:50	mult:0.980392
[2023-05-16 02:20:31] DEBUG: 6	len:84702	cov:51	mult:1
[2023-05-16 02:20:31] DEBUG: -6	len:84702	cov:51	mult:1
[2023-05-16 02:20:31] DEBUG: 7	len:125150	cov:55	mult:1.07843
[2023-05-16 02:20:31] DEBUG: -7	len:125150	cov:55	mult:1.07843
[2023-05-16 02:20:31] DEBUG: 8	len:32886	cov:60	mult:1.17647
[2023-05-16 02:20:31] DEBUG: -8	len:32886	cov:60	mult:1.17647
[2023-05-16 02:20:31] DEBUG: 9	len:737057	cov:48	mult:0.941176
[2023-05-16 02:20:31] DEBUG: -9	len:737057	cov:48	mult:0.941176
[2023-05-16 02:20:31] DEBUG: 10	len:1530464	cov:47	mult:0.921569
[2023-05-16 02:20:31] DEBUG: -10	len:1530464	cov:47	mult:0.921569
[2023-05-16 02:20:31] DEBUG: 11	len:485932	cov:56	mult:1.09804
[2023-05-16 02:20:31] DEBUG: -11	len:485932	cov:56	mult:1.09804
[2023-05-16 02:20:31] DEBUG: 12	len:28786	cov:101	mult:1.98039
[2023-05-16 02:20:31] DEBUG: -12	len:28786	cov:101	mult:1.98039
[2023-05-16 02:20:31] DEBUG: 13	len:277367	cov:46	mult:0.901961
[2023-05-16 02:20:31] DEBUG: -13	len:277367	cov:46	mult:0.901961
[2023-05-16 02:20:31] DEBUG: 14	len:64819	cov:113	mult:2.21569
[2023-05-16 02:20:31] DEBUG: -14	len:64819	cov:113	mult:2.21569
[2023-05-16 02:20:31] DEBUG: 17	len:300936	cov:49	mult:0.960784
[2023-05-16 02:20:31] DEBUG: -17	len:300936	cov:49	mult:0.960784
[2023-05-16 02:20:31] DEBUG: 18	len:408198	cov:49	mult:0.960784
[2023-05-16 02:20:31] DEBUG: -18	len:408198	cov:49	mult:0.960784
[2023-05-16 02:20:31] DEBUG: 19	len:18481	cov:110	mult:2.15686
[2023-05-16 02:20:31] DEBUG: -19	len:18481	cov:110	mult:2.15686
[2023-05-16 02:20:31] DEBUG: Unique coverage threshold 96
[2023-05-16 02:20:31] INFO: Simplifying the graph
[2023-05-16 02:20:31] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:31] DEBUG: [SIMPL] Removed 0 paths with low coverage
[2023-05-16 02:20:31] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 02:20:31] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 02:20:31] DEBUG: Finding repeats
[2023-05-16 02:20:31] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:31] DEBUG: High-cov: 4	5287	398
[2023-05-16 02:20:31] DEBUG: High-cov: 12	28786	101
[2023-05-16 02:20:31] DEBUG: High-cov: 14	64819	113
[2023-05-16 02:20:31] DEBUG: High-cov: 19	18481	110
[2023-05-16 02:20:31] DEBUG: Repeat detection iteration 1
[2023-05-16 02:20:31] DEBUG: Writing Dot
[2023-05-16 02:20:32] DEBUG: Writing FASTA
[2023-05-16 02:20:32] DEBUG: [SIMPL] == Iteration 1 ==
[2023-05-16 02:20:32] DEBUG: Splitting nodes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Split 0 nodes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Clipped 0 short and 0 long tips
[2023-05-16 02:20:32] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 02:20:32] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 02:20:32] DEBUG: Finding repeats
[2023-05-16 02:20:32] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:32] DEBUG: High-cov: 4	5287	398
[2023-05-16 02:20:32] DEBUG: High-cov: 12	28786	101
[2023-05-16 02:20:32] DEBUG: High-cov: 14	64819	113
[2023-05-16 02:20:32] DEBUG: High-cov: 19	18481	110
[2023-05-16 02:20:32] DEBUG: Repeat detection iteration 1
[2023-05-16 02:20:32] DEBUG: Total unique edges: 10
[2023-05-16 02:20:32] DEBUG: 	Connection -13	-10	7	1
[2023-05-16 02:20:32] DEBUG: 	Connection 13	-9	3	1
[2023-05-16 02:20:32] DEBUG: 	Connection -7	-6	33	1
[2023-05-16 02:20:32] DEBUG: 	Connection -10	17	29	1
[2023-05-16 02:20:32] DEBUG: 	Connection 5	9	25	1
[2023-05-16 02:20:32] DEBUG: 	Connection -6	11	23	1
[2023-05-16 02:20:32] DEBUG: 	Connection 8	-18	31	1
[2023-05-16 02:20:32] DEBUG: 	Connection -5	-11	29	1
[2023-05-16 02:20:32] DEBUG: 	Connection 7	8	37	1
[2023-05-16 02:20:32] DEBUG: [SIMPL] Resolved repeats: 9
[2023-05-16 02:20:32] DEBUG: RR links: 434
[2023-05-16 02:20:32] DEBUG: Unresolved: 0
[2023-05-16 02:20:32] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 02:20:32] DEBUG: [SIMPL] == Iteration 2 ==
[2023-05-16 02:20:32] DEBUG: Splitting nodes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Split 0 nodes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Clipped 0 short and 0 long tips
[2023-05-16 02:20:32] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 02:20:32] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 02:20:32] DEBUG: Finding repeats
[2023-05-16 02:20:32] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:32] DEBUG: High-cov: 14	64819	113
[2023-05-16 02:20:32] DEBUG: High-cov: 19	18481	110
[2023-05-16 02:20:32] DEBUG: Repeat detection iteration 1
[2023-05-16 02:20:32] DEBUG: Total unique edges: 19
[2023-05-16 02:20:32] DEBUG: [SIMPL] Resolved repeats: 0
[2023-05-16 02:20:32] DEBUG: RR links: 0
[2023-05-16 02:20:32] DEBUG: Unresolved: 0
[2023-05-16 02:20:32] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 02:20:32] DEBUG: [SIMPL] Collapsed 0 haplotypes
[2023-05-16 02:20:32] DEBUG: [SIMPL] Resolved 0 simple repeats
[2023-05-16 02:20:32] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:32] DEBUG: [SIMPL] Removed 0 paths with low coverage
[2023-05-16 02:20:32] DEBUG: Finding repeats
[2023-05-16 02:20:32] DEBUG: Read coverage cutoff: 10
[2023-05-16 02:20:32] DEBUG: High-cov: 14	64819	113
[2023-05-16 02:20:32] DEBUG: High-cov: 19	18481	110
[2023-05-16 02:20:32] DEBUG: Repeat detection iteration 1
[2023-05-16 02:20:32] DEBUG: Writing Dot
[2023-05-16 02:20:32] DEBUG: Writing FASTA
[2023-05-16 02:20:32] DEBUG: Peak RAM usage: 0 Gb
-----------End assembly log------------
[2023-05-16 02:20:32] root: INFO: >>>STAGE: contigger
[2023-05-16 02:20:32] root: INFO: Generating contigs
[2023-05-16 02:20:32] root: DEBUG: -----Begin contigger analyser log------
[2023-05-16 02:20:32] root: DEBUG: Running: flye-modules contigger --graph-edges /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/20-repeat/repeat_graph_edges.fasta --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/30-contigger --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --repeat-graph /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/20-repeat/repeat_graph_dump --graph-aln /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/20-repeat/read_alignment_dump --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/flye.log --threads 1 --min-ovlp 4000
[2023-05-16 02:20:32] DEBUG: Build date: May 13 2023 06:31:26
[2023-05-16 02:20:32] DEBUG: Total RAM: 62 Gb
[2023-05-16 02:20:32] DEBUG: Available RAM: 54 Gb
[2023-05-16 02:20:32] DEBUG: Total CPUs: 16
[2023-05-16 02:20:32] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 02:20:32] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 02:20:32] DEBUG: 	big_genome_threshold=29000000
[2023-05-16 02:20:32] DEBUG: 	meta_read_filter_kmer_freq=100
[2023-05-16 02:20:32] DEBUG: 	chain_large_gap_penalty=2
[2023-05-16 02:20:32] DEBUG: 	chain_small_gap_penalty=0.5
[2023-05-16 02:20:32] DEBUG: 	chain_gap_jump_threshold=100
[2023-05-16 02:20:32] DEBUG: 	max_coverage_drop_rate=5
[2023-05-16 02:20:32] DEBUG: 	max_extensions_drop_rate=5
[2023-05-16 02:20:32] DEBUG: 	chimera_window=100
[2023-05-16 02:20:32] DEBUG: 	chimera_overhang=1000
[2023-05-16 02:20:32] DEBUG: 	min_reads_in_disjointig=4
[2023-05-16 02:20:32] DEBUG: 	max_inner_reads=10
[2023-05-16 02:20:32] DEBUG: 	max_inner_fraction=0.25
[2023-05-16 02:20:32] DEBUG: 	max_separation=500
[2023-05-16 02:20:32] DEBUG: 	unique_edge_length=50000
[2023-05-16 02:20:32] DEBUG: 	min_repeat_res_support=0.51
[2023-05-16 02:20:32] DEBUG: 	out_paths_ratio=5
[2023-05-16 02:20:32] DEBUG: 	graph_cov_drop_rate=5
[2023-05-16 02:20:32] DEBUG: 	coverage_estimate_window=100
[2023-05-16 02:20:32] DEBUG: 	max_bubble_length=50000
[2023-05-16 02:20:32] DEBUG: 	loop_coverage_rate=1.5
[2023-05-16 02:20:32] DEBUG: 	repeat_edge_cov_mult=1.75
[2023-05-16 02:20:32] DEBUG: 	weak_detach_rate=5
[2023-05-16 02:20:32] DEBUG: 	tip_coverage_rate=2
[2023-05-16 02:20:32] DEBUG: 	tip_length_rate=2
[2023-05-16 02:20:32] DEBUG: 	output_gfa_before_rr=0
[2023-05-16 02:20:32] DEBUG: 	remove_alt_edges=0
[2023-05-16 02:20:32] DEBUG: 	low_cutoff_warning=1
[2023-05-16 02:20:32] DEBUG: 	kmer_size=17
[2023-05-16 02:20:32] DEBUG: 	use_minimizers=0
[2023-05-16 02:20:32] DEBUG: 	reads_base_alignment=0
[2023-05-16 02:20:32] DEBUG: 	meta_read_top_kmer_rate=0.40
[2023-05-16 02:20:32] DEBUG: 	maximum_jump=1500
[2023-05-16 02:20:32] DEBUG: 	maximum_overhang=1500
[2023-05-16 02:20:32] DEBUG: 	repeat_kmer_rate=100
[2023-05-16 02:20:32] DEBUG: 	assemble_ovlp_divergence=0.10
[2023-05-16 02:20:32] DEBUG: 	assemble_divergence_relative=1
[2023-05-16 02:20:32] DEBUG: 	repeat_graph_ovlp_divergence=0.08
[2023-05-16 02:20:32] DEBUG: 	read_align_ovlp_divergence=0.25
[2023-05-16 02:20:32] DEBUG: 	hpc_scoring_on=0
[2023-05-16 02:20:32] DEBUG: 	add_unassembled_reads=0
[2023-05-16 02:20:32] DEBUG: 	extend_contigs_with_repeats=0
[2023-05-16 02:20:32] DEBUG: 	min_read_cov_cutoff=3
[2023-05-16 02:20:32] DEBUG: 	short_tip_length=20000
[2023-05-16 02:20:32] DEBUG: 	long_tip_length=100000
[2023-05-16 02:20:32] DEBUG: Running with k-mer size: 17
[2023-05-16 02:20:32] DEBUG: Selected minimum overlap 4000
[2023-05-16 02:20:32] INFO: Reading sequences
[2023-05-16 02:20:37] DEBUG: Building positional index
[2023-05-16 02:20:37] DEBUG: Total sequence: 278879070 bp
[2023-05-16 02:20:37] DEBUG: Flipped 0
[2023-05-16 02:20:37] DEBUG: UPath 1: -24 -> 9 -> 21 -> -13 -> 20 -> -10 -> 23 -> 17 -> 18 -> -26 -> -8 -> -28 -> -7 -> -22 -> -6 -> 25 -> 11 -> -27 -> 5
[2023-05-16 02:20:37] DEBUG: UPath 2: 14
[2023-05-16 02:20:37] DEBUG: UPath 3: 19
[2023-05-16 02:20:37] DEBUG: Final graph contains 3 egdes
[2023-05-16 02:20:37] DEBUG: Extending contigs into repeats
[2023-05-16 02:20:37] DEBUG: Covered 0 repetitive contigs
[2023-05-16 02:20:37] INFO: Generated 3 contigs
[2023-05-16 02:20:37] DEBUG: Writing FASTA
[2023-05-16 02:20:37] DEBUG: Generating scaffold connections
[2023-05-16 02:20:37] INFO: Added 0 scaffold connections
[2023-05-16 02:20:37] DEBUG: Writing Dot
[2023-05-16 02:20:37] DEBUG: Writing FASTA
[2023-05-16 02:20:37] DEBUG: Writing Gfa
[2023-05-16 02:20:37] DEBUG: Peak RAM usage: 0 Gb
-----------End assembly log------------
[2023-05-16 02:20:37] root: INFO: >>>STAGE: polishing
[2023-05-16 02:20:37] root: INFO: Polishing genome (1/1)
[2023-05-16 02:20:37] root: INFO: Running minimap2
[2023-05-16 02:22:55] root: INFO: Separating alignment into bubbles
[2023-05-16 02:27:18] root: DEBUG: Generated 328532 bubbles
[2023-05-16 02:27:18] root: DEBUG: Split 5 long bubbles
[2023-05-16 02:27:18] root: DEBUG: Skipped 0 empty bubbles
[2023-05-16 02:27:18] root: DEBUG: Skipped 1 bubbles with long branches
[2023-05-16 02:27:18] root: INFO: Alignment error rate: 0.064269
[2023-05-16 02:27:18] root: INFO: Correcting bubbles
[2023-05-16 02:31:34] root: DEBUG: Mean contig coverage: 56, selected threshold: 11
[2023-05-16 02:31:34] root: DEBUG: Filtered 0 contigs of total length 0
[2023-05-16 02:31:34] root: DEBUG: Generating polished GFA
[2023-05-16 02:31:38] root: DEBUG: 0 sequences remained unpolished
[2023-05-16 02:31:38] root: INFO: >>>STAGE: finalize
[2023-05-16 02:31:38] root: DEBUG: ---Output dir contents:----
[2023-05-16 02:31:38] root: DEBUG: Citrobacter_koseri_MINF/
[2023-05-16 02:31:38] root: DEBUG:     4.0 M       assembly.fasta
[2023-05-16 02:31:38] root: DEBUG:     489.0 B     assembly_graph.gv
[2023-05-16 02:31:38] root: DEBUG:     92.0 B      params.json
[2023-05-16 02:31:38] root: DEBUG:     4.0 M       assembly_graph.gfa
[2023-05-16 02:31:38] root: DEBUG:     36.0 K      flye.log
[2023-05-16 02:31:38] root: DEBUG:     279.0 M     final_filtered_long_reads.fastq.gz
[2023-05-16 02:31:38] root: DEBUG:     279.0 M     chopper_long_reads.fastq.gz
[2023-05-16 02:31:38] root: DEBUG:     1.0 K       Citrobacter_koseri_MINF_05162023_015813.log
[2023-05-16 02:31:38] root: DEBUG:     40-polishing/
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       polished_edges.gfa
[2023-05-16 02:31:38] root: DEBUG:         84.0 B      filtered_contigs.fasta.fai
[2023-05-16 02:31:38] root: DEBUG:         1.0 K       minimap.stderr
[2023-05-16 02:31:38] root: DEBUG:         84.0 B      contigs_stats.txt
[2023-05-16 02:31:38] root: DEBUG:         2.0 K       edges_aln.bam.bai
[2023-05-16 02:31:38] root: DEBUG:         1.0 M       base_coverage.bed.gz
[2023-05-16 02:31:38] root: DEBUG:         84.0 B      filtered_stats.txt
[2023-05-16 02:31:38] root: DEBUG:         120.0 K     minimap_1.bam.bai
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       filtered_contigs.fasta
[2023-05-16 02:31:38] root: DEBUG:     30-contigger/
[2023-05-16 02:31:38] root: DEBUG:         0.0 B       scaffolds_links.txt
[2023-05-16 02:31:38] root: DEBUG:         489.0 B     graph_final.gv
[2023-05-16 02:31:38] root: DEBUG:         180.0 B     contigs_stats.txt
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       graph_final.fasta
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       graph_final.gfa
[2023-05-16 02:31:38] root: DEBUG:         84.0 B      contigs.fasta.fai
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       contigs.fasta
[2023-05-16 02:31:38] root: DEBUG:     10-consensus/
[2023-05-16 02:31:38] root: DEBUG:         1.0 K       minimap.stderr
[2023-05-16 02:31:38] root: DEBUG:         127.0 K     minimap.bam.bai
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       consensus.fasta
[2023-05-16 02:31:38] root: DEBUG:     00-assembly/
[2023-05-16 02:31:38] root: DEBUG:         97.0 B      draft_assembly.fasta.fai
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       draft_assembly.fasta
[2023-05-16 02:31:38] root: DEBUG:     20-repeat/
[2023-05-16 02:31:38] root: DEBUG:         4.0 K       repeat_graph_dump
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       repeat_graph_edges.fasta
[2023-05-16 02:31:38] root: DEBUG:         1.0 K       graph_before_rr.gv
[2023-05-16 02:31:38] root: DEBUG:         2.0 K       graph_after_rr.gv
[2023-05-16 02:31:38] root: DEBUG:         4.0 M       graph_before_rr.fasta
[2023-05-16 02:31:38] root: DEBUG:         5.0 M       read_alignment_dump
[2023-05-16 02:31:38] root: DEBUG: --------------------------
[2023-05-16 02:31:38] root: INFO: Assembly statistics:

	Total length:	4840901
	Fragments:	3
	Fragments N50:	4757416
	Largest frg:	4757416
	Scaffolds:	0
	Mean coverage:	56

[2023-05-16 02:31:38] root: INFO: Final assembly: /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_1_THREADS/Citrobacter_koseri_MINF/assembly.fasta
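
For anyone who wants to compare runs without diffing the logs by eye, here is a minimal sketch that pulls the "Assembly statistics" block out of a `flye.log` and reports which fields differ between two runs. The function names `parse_assembly_stats` and `diff_stats` are my own and not part of Flye, and the parser only assumes the block layout shown in the log above.

```python
import re

def parse_assembly_stats(log_text):
    """Return the 'Assembly statistics' fields of a Flye log as a dict of ints.

    Hypothetical helper (not part of Flye); assumes the tab-indented
    'Key:<tab>value' layout seen in the pasted log.
    """
    stats = {}
    in_block = False
    for line in log_text.splitlines():
        if "Assembly statistics:" in line:
            in_block = True
            continue
        if not in_block:
            continue
        match = re.match(r"^\s*([^:]+):\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1).strip()] = int(match.group(2))
        elif line.strip() and stats:
            # First non-empty line that is not 'Key: number' ends the block.
            break
    return stats

def diff_stats(run_a, run_b):
    """Map each field that differs to a (run_a value, run_b value) pair."""
    keys = sorted(set(run_a) | set(run_b))
    return {k: (run_a.get(k), run_b.get(k)) for k in keys if run_a.get(k) != run_b.get(k)}

# Statistics block copied from the 1-thread log above.
SAMPLE = """[2023-05-16 02:31:38] root: INFO: Assembly statistics:

\tTotal length:\t4840901
\tFragments:\t3
\tFragments N50:\t4757416
\tLargest frg:\t4757416
\tScaffolds:\t0
\tMean coverage:\t56

[2023-05-16 02:31:38] root: INFO: Final assembly: assembly.fasta
"""

if __name__ == "__main__":
    print(parse_assembly_stats(SAMPLE))
```

To compare two real runs, read each `flye.log` with `open(path).read()`, parse both, and pass the two dicts to `diff_stats`; an empty result means the summary statistics match, although the sequences themselves could still differ.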

8 Threads

[2023-05-16 03:04:52] root: INFO: Starting Flye 2.9.2-b1786
[2023-05-16 03:04:52] root: DEBUG: Cmd: /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/bin/flye --nano-raw ../real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir ../real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF --threads 8 --deterministic
[2023-05-16 03:04:52] root: DEBUG: Python version: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) 
[GCC 7.5.0]
[2023-05-16 03:04:52] root: INFO: >>>STAGE: configure
[2023-05-16 03:04:52] root: INFO: Configuring run
[2023-05-16 03:04:57] root: INFO: Total read length: 278879070
[2023-05-16 03:04:57] root: INFO: Reads N50/N90: 16438 / 4092
[2023-05-16 03:04:57] root: INFO: Minimum overlap set to 4000
[2023-05-16 03:04:57] root: INFO: >>>STAGE: assembly
[2023-05-16 03:04:57] root: INFO: Assembling disjointigs
[2023-05-16 03:04:57] root: DEBUG: -----Begin assembly log------
[2023-05-16 03:04:57] root: DEBUG: Running: flye-modules assemble --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-asm /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/00-assembly/draft_assembly.fasta --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/flye.log --threads 1 --min-ovlp 4000
[2023-05-16 03:04:57] DEBUG: Build date: May 13 2023 06:30:35
[2023-05-16 03:04:57] DEBUG: Total RAM: 62 Gb
[2023-05-16 03:04:57] DEBUG: Available RAM: 58 Gb
[2023-05-16 03:04:57] DEBUG: Total CPUs: 16
[2023-05-16 03:04:57] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 03:04:57] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 03:04:57] DEBUG: 	big_genome_threshold=29000000
[2023-05-16 03:04:57] DEBUG: 	meta_read_filter_kmer_freq=100
[2023-05-16 03:04:57] DEBUG: 	chain_large_gap_penalty=2
[2023-05-16 03:04:57] DEBUG: 	chain_small_gap_penalty=0.5
[2023-05-16 03:04:57] DEBUG: 	chain_gap_jump_threshold=100
[2023-05-16 03:04:57] DEBUG: 	max_coverage_drop_rate=5
[2023-05-16 03:04:57] DEBUG: 	max_extensions_drop_rate=5
[2023-05-16 03:04:57] DEBUG: 	chimera_window=100
[2023-05-16 03:04:57] DEBUG: 	chimera_overhang=1000
[2023-05-16 03:04:57] DEBUG: 	min_reads_in_disjointig=4
[2023-05-16 03:04:57] DEBUG: 	max_inner_reads=10
[2023-05-16 03:04:57] DEBUG: 	max_inner_fraction=0.25
[2023-05-16 03:04:57] DEBUG: 	max_separation=500
[2023-05-16 03:04:57] DEBUG: 	unique_edge_length=50000
[2023-05-16 03:04:57] DEBUG: 	min_repeat_res_support=0.51
[2023-05-16 03:04:57] DEBUG: 	out_paths_ratio=5
[2023-05-16 03:04:57] DEBUG: 	graph_cov_drop_rate=5
[2023-05-16 03:04:57] DEBUG: 	coverage_estimate_window=100
[2023-05-16 03:04:57] DEBUG: 	max_bubble_length=50000
[2023-05-16 03:04:57] DEBUG: 	loop_coverage_rate=1.5
[2023-05-16 03:04:57] DEBUG: 	repeat_edge_cov_mult=1.75
[2023-05-16 03:04:57] DEBUG: 	weak_detach_rate=5
[2023-05-16 03:04:57] DEBUG: 	tip_coverage_rate=2
[2023-05-16 03:04:57] DEBUG: 	tip_length_rate=2
[2023-05-16 03:04:57] DEBUG: 	output_gfa_before_rr=0
[2023-05-16 03:04:57] DEBUG: 	remove_alt_edges=0
[2023-05-16 03:04:57] DEBUG: 	low_cutoff_warning=1
[2023-05-16 03:04:57] DEBUG: 	kmer_size=17
[2023-05-16 03:04:57] DEBUG: 	use_minimizers=0
[2023-05-16 03:04:57] DEBUG: 	reads_base_alignment=0
[2023-05-16 03:04:57] DEBUG: 	meta_read_top_kmer_rate=0.40
[2023-05-16 03:04:57] DEBUG: 	maximum_jump=1500
[2023-05-16 03:04:57] DEBUG: 	maximum_overhang=1500
[2023-05-16 03:04:57] DEBUG: 	repeat_kmer_rate=100
[2023-05-16 03:04:57] DEBUG: 	assemble_ovlp_divergence=0.10
[2023-05-16 03:04:57] DEBUG: 	assemble_divergence_relative=1
[2023-05-16 03:04:57] DEBUG: 	repeat_graph_ovlp_divergence=0.08
[2023-05-16 03:04:57] DEBUG: 	read_align_ovlp_divergence=0.25
[2023-05-16 03:04:57] DEBUG: 	hpc_scoring_on=0
[2023-05-16 03:04:57] DEBUG: 	add_unassembled_reads=0
[2023-05-16 03:04:57] DEBUG: 	extend_contigs_with_repeats=0
[2023-05-16 03:04:57] DEBUG: 	min_read_cov_cutoff=3
[2023-05-16 03:04:57] DEBUG: 	short_tip_length=20000
[2023-05-16 03:04:57] DEBUG: 	long_tip_length=100000
[2023-05-16 03:04:57] DEBUG: Running with k-mer size: 17
[2023-05-16 03:04:57] DEBUG: Running with minimum overlap 4000
[2023-05-16 03:04:57] DEBUG: Metagenome mode: N
[2023-05-16 03:04:57] DEBUG: Short mode: N
[2023-05-16 03:04:57] INFO: Reading sequences
[2023-05-16 03:05:00] DEBUG: Building positional index
[2023-05-16 03:05:00] DEBUG: Total sequence: 251729388 bp
[2023-05-16 03:05:03] INFO: Counting k-mers:
[2023-05-16 03:05:52] DEBUG: Updating k-mer histogram
[2023-05-16 03:06:46] DEBUG: Hash size: 4622497
[2023-05-16 03:06:46] DEBUG: Total k-mers 88307420
[2023-05-16 03:06:46] INFO: Filling index table (1/2)
[2023-05-16 03:08:38] DEBUG: Mean k-mer frequency: 21.8363
[2023-05-16 03:08:38] DEBUG: Repetitive k-mer frequency: 2183
[2023-05-16 03:08:38] DEBUG: Filtered 18624 repetitive k-mers (0.000182569)
[2023-05-16 03:08:39] INFO: Filling index table (2/2)
[2023-05-16 03:10:36] DEBUG: Sorting k-mer index
[2023-05-16 03:10:36] DEBUG: Selected k-mers: 6031079
[2023-05-16 03:10:36] DEBUG: Index size: 103351604
[2023-05-16 03:10:36] DEBUG: Mean k-mer index frequency: 17.1365
[2023-05-16 03:10:36] DEBUG: Peak RAM usage: 9 Gb
[2023-05-16 03:10:36] DEBUG: Estimating k-mer identity bias
[2023-05-16 03:11:26] DEBUG: Initial divergence estimate : 0.0717829
[2023-05-16 03:11:26] DEBUG: Relative threshold: Y
[2023-05-16 03:11:26] DEBUG: Max divergence threshold set to 0.171783
[2023-05-16 03:11:26] INFO: Extending reads
[2023-05-16 03:11:26] DEBUG: Estimating overlap coverage
[2023-05-16 03:12:18] INFO: Overlap-based coverage: 49
[2023-05-16 03:12:18] INFO: Median overlap divergence: 0.0716492
[2023-05-16 03:12:18] DEBUG: Sequence divergence distribution: 

    |             *                    |                                                                 
    |             **                   |                                                                 
    |             **                   |                                                                 
    |             **                   |                                                                 
    |             **                   |                                                                 
    |            ***                   |                                                                 
    |            ***                   |                                                                 
    |            ****                  |                                                                 
    |            *****                 |                                                                 
    |            *****                 |                                                                 
    |            *****                 |                                                                 
    |            *****                 |                                                                 
    |           ******                 |                                                                 
    |           ******                 |                                                                 
    |           *******                |                                                                 
    |          **********              |                                                                 
    |         ***********              |                                                                 
    |         ************             |                                                                 
    |         **************           |                                                                 
    |     * **********************     *  ** * ** *       * *                                            
    ----------------------------------------------------------------------------------------------------
    0%        5%        10%       15%       20%       25%       30%       35%       40%       45%       

    Q25 = 0.064, Q50 = 0.072, Q75 = 0.082

[2023-05-16 03:13:27] DEBUG: Assembled disjointig 1
	With 373 reads
	Start read: +d0f225cb-3a25-4d2c-b63a-e5a8af2de32c
	At position: 369
	leftTip: 0 rightTip: 0
	Suspicious: 1
	Short ext: 1
	Mean extensions: 39
	Avg overlap len: 38582
	Min overlap len: 2009
	Inner reads: 0
	Length: 4797752
[2023-05-16 03:13:27] DEBUG: Inner: 32900 covered: 32974 total: 35900
[2023-05-16 03:13:35] DEBUG: Assembled disjointig 2
	With 21 reads
	Start read: +9f6116ca-576f-4d7f-b606-f3289ccd4a8f
	At position: 8
	leftTip: 0 rightTip: 0
	Suspicious: 1
	Short ext: 1
	Mean extensions: 61
	Avg overlap len: 56389
	Min overlap len: 2220
	Inner reads: 0
	Length: 188876
[2023-05-16 03:13:35] DEBUG: Inner: 33808 covered: 33890 total: 35900
[2023-05-16 03:13:37] DEBUG: Assembled disjointig 3
	With 21 reads
	Start read: +a57a2ff9-5e67-4ab8-be97-2b06a7698c24
	At position: 7
	leftTip: 0 rightTip: 0
	Suspicious: 0
	Short ext: 0
	Mean extensions: 66
	Avg overlap len: 7517
	Min overlap len: 6171
	Inner reads: 0
	Length: 42899
[2023-05-16 03:13:37] DEBUG: Inner: 34308 covered: 34392 total: 35900
[2023-05-16 03:13:57] INFO: Assembled 3 disjointigs
[2023-05-16 03:13:57] INFO: Generating sequence
[2023-05-16 03:14:03] DEBUG: Building positional index
[2023-05-16 03:14:03] DEBUG: Total sequence: 5030425 bp
[2023-05-16 03:14:05] DEBUG: Mean k-mer frequency: 1.03893
[2023-05-16 03:14:05] DEBUG: Repetitive k-mer frequency: 103
[2023-05-16 03:14:05] DEBUG: Filtered 0 repetitive k-mers (0)
[2023-05-16 03:14:06] DEBUG: Sorting k-mer index
[2023-05-16 03:14:07] DEBUG: Selected k-mers: 4841871
[2023-05-16 03:14:07] DEBUG: K-mer index size: 5030374
[2023-05-16 03:14:07] DEBUG: Mean k-mer frequency: 1.03893
[2023-05-16 03:14:07] DEBUG: Minimizer rate: 1.00001
[2023-05-16 03:14:07] INFO: Filtering contained disjointigs
[2023-05-16 03:14:09] DEBUG: Computing transitive closure for overlaps
[2023-05-16 03:14:09] DEBUG: Found 12 overlaps
[2023-05-16 03:14:09] DEBUG: Left 12 overlaps after filtering
[2023-05-16 03:14:09] INFO: Contained seqs: 0
[2023-05-16 03:14:09] DEBUG: Writing FASTA
[2023-05-16 03:14:09] DEBUG: Peak RAM usage: 9 Gb
-----------End assembly log------------
[2023-05-16 03:14:09] root: DEBUG: Disjointigs length: 5030425, N50: 4797882
[2023-05-16 03:14:09] root: INFO: >>>STAGE: consensus
[2023-05-16 03:14:09] root: INFO: Running Minimap2
[2023-05-16 03:14:35] root: INFO: Computing consensus
[2023-05-16 03:15:42] root: INFO: Alignment error rate: 0.104514
[2023-05-16 03:15:42] root: INFO: >>>STAGE: repeat
[2023-05-16 03:15:42] root: INFO: Building and resolving repeat graph
[2023-05-16 03:15:42] root: DEBUG: -----Begin repeat analyser log------
[2023-05-16 03:15:42] root: DEBUG: Running: flye-modules repeat --disjointigs /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/10-consensus/consensus.fasta --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/20-repeat --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/flye.log --threads 8 --min-ovlp 4000
[2023-05-16 03:15:42] DEBUG: Build date: May 13 2023 06:30:59
[2023-05-16 03:15:42] DEBUG: Total RAM: 62 Gb
[2023-05-16 03:15:42] DEBUG: Available RAM: 58 Gb
[2023-05-16 03:15:42] DEBUG: Total CPUs: 16
[2023-05-16 03:15:42] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 03:15:42] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 03:15:42] DEBUG: 	big_genome_threshold=29000000
[2023-05-16 03:15:42] DEBUG: 	meta_read_filter_kmer_freq=100
[2023-05-16 03:15:42] DEBUG: 	chain_large_gap_penalty=2
[2023-05-16 03:15:42] DEBUG: 	chain_small_gap_penalty=0.5
[2023-05-16 03:15:42] DEBUG: 	chain_gap_jump_threshold=100
[2023-05-16 03:15:42] DEBUG: 	max_coverage_drop_rate=5
[2023-05-16 03:15:42] DEBUG: 	max_extensions_drop_rate=5
[2023-05-16 03:15:42] DEBUG: 	chimera_window=100
[2023-05-16 03:15:42] DEBUG: 	chimera_overhang=1000
[2023-05-16 03:15:42] DEBUG: 	min_reads_in_disjointig=4
[2023-05-16 03:15:42] DEBUG: 	max_inner_reads=10
[2023-05-16 03:15:42] DEBUG: 	max_inner_fraction=0.25
[2023-05-16 03:15:42] DEBUG: 	max_separation=500
[2023-05-16 03:15:42] DEBUG: 	unique_edge_length=50000
[2023-05-16 03:15:42] DEBUG: 	min_repeat_res_support=0.51
[2023-05-16 03:15:42] DEBUG: 	out_paths_ratio=5
[2023-05-16 03:15:42] DEBUG: 	graph_cov_drop_rate=5
[2023-05-16 03:15:42] DEBUG: 	coverage_estimate_window=100
[2023-05-16 03:15:42] DEBUG: 	max_bubble_length=50000
[2023-05-16 03:15:42] DEBUG: 	loop_coverage_rate=1.5
[2023-05-16 03:15:42] DEBUG: 	repeat_edge_cov_mult=1.75
[2023-05-16 03:15:42] DEBUG: 	weak_detach_rate=5
[2023-05-16 03:15:42] DEBUG: 	tip_coverage_rate=2
[2023-05-16 03:15:42] DEBUG: 	tip_length_rate=2
[2023-05-16 03:15:42] DEBUG: 	output_gfa_before_rr=0
[2023-05-16 03:15:42] DEBUG: 	remove_alt_edges=0
[2023-05-16 03:15:42] DEBUG: 	low_cutoff_warning=1
[2023-05-16 03:15:42] DEBUG: 	kmer_size=17
[2023-05-16 03:15:42] DEBUG: 	use_minimizers=0
[2023-05-16 03:15:42] DEBUG: 	reads_base_alignment=0
[2023-05-16 03:15:42] DEBUG: 	meta_read_top_kmer_rate=0.40
[2023-05-16 03:15:42] DEBUG: 	maximum_jump=1500
[2023-05-16 03:15:42] DEBUG: 	maximum_overhang=1500
[2023-05-16 03:15:42] DEBUG: 	repeat_kmer_rate=100
[2023-05-16 03:15:42] DEBUG: 	assemble_ovlp_divergence=0.10
[2023-05-16 03:15:42] DEBUG: 	assemble_divergence_relative=1
[2023-05-16 03:15:42] DEBUG: 	repeat_graph_ovlp_divergence=0.08
[2023-05-16 03:15:42] DEBUG: 	read_align_ovlp_divergence=0.25
[2023-05-16 03:15:42] DEBUG: 	hpc_scoring_on=0
[2023-05-16 03:15:42] DEBUG: 	add_unassembled_reads=0
[2023-05-16 03:15:42] DEBUG: 	extend_contigs_with_repeats=0
[2023-05-16 03:15:42] DEBUG: 	min_read_cov_cutoff=3
[2023-05-16 03:15:42] DEBUG: 	short_tip_length=20000
[2023-05-16 03:15:42] DEBUG: 	long_tip_length=100000
[2023-05-16 03:15:42] DEBUG: Running with k-mer size: 17
[2023-05-16 03:15:42] DEBUG: Selected minimum overlap 4000
[2023-05-16 03:15:42] DEBUG: Metagenome mode: N
[2023-05-16 03:15:42] INFO: Parsing disjointigs
[2023-05-16 03:15:42] DEBUG: Building positional index
[2023-05-16 03:15:42] DEBUG: Total sequence: 5076617 bp
[2023-05-16 03:15:42] INFO: Building repeat graph
[2023-05-16 03:15:44] DEBUG: Mean k-mer frequency: 1.06798
[2023-05-16 03:15:44] DEBUG: Repetitive k-mer frequency: 106
[2023-05-16 03:15:44] DEBUG: Filtered 266 repetitive k-mers (5.23976e-05)
[2023-05-16 03:15:45] DEBUG: Sorting k-mer index
[2023-05-16 03:15:45] DEBUG: Selected k-mers: 4753412
[2023-05-16 03:15:45] DEBUG: K-mer index size: 5076300
[2023-05-16 03:15:45] DEBUG: Mean k-mer frequency: 1.06793
[2023-05-16 03:15:45] DEBUG: Minimizer rate: 1.00006
[2023-05-16 03:15:48] DEBUG: Computing transitive closure for overlaps
[2023-05-16 03:15:48] DEBUG: Found 252 overlaps
[2023-05-16 03:15:48] DEBUG: Left 100 overlaps after filtering
[2023-05-16 03:15:48] INFO: Median overlap divergence: 0.0385451
[2023-05-16 03:15:48] DEBUG: Sequence divergence distribution: 

    |*               |                                                                                   
    |*               |                                                                                   
    |*               |                                                                                   
    |*               |                                                                                   
    |*               |                                                                                   
    |*         *     |                                                                                   
    |*         *     |                                                                                   
    |*         *     |                                                                                   
    |*         *     |                                                                                   
    |*         *     |                                                                                   
    |*   *  *  *  *  |                             *                                                     
    |*   *  *  *  *  |                             *                                                     
    |*   *  *  *  *  |                             *                                                     
    |*   *  *  *  *  |                             *                                                     
    |*   *  *  *  *  |                             *                                                     
    |* ***  * **  *  |                             *                                                     
    |* ***  * **  *  |                             *                                                     
    |* ***  * **  *  |                             *                                                     
    |* ***  * **  *  |                             *                                                     
    |* ***  * **  *  |                             *                                                     
    ----------------------------------------------------------------------------------------------------
    0%        5%        10%       15%       20%       25%       30%       35%       40%       45%       

    Q25 = 0.015, Q50 = 0.039, Q75 = 0.052

[2023-05-16 03:15:48] DEBUG: Computing gluepoints
[2023-05-16 03:15:48] DEBUG: Added 0 gluepoint projections
[2023-05-16 03:15:48] DEBUG: Created 68 gluepoints
[2023-05-16 03:15:48] DEBUG: Artificial loops removed: 0 left, 0 right, 0 both
[2023-05-16 03:15:48] DEBUG: Initializing edges
[2023-05-16 03:15:48] DEBUG: Edges length checksum: 2273136170
[2023-05-16 03:15:48] DEBUG: Filtered 0 singleton segments
[2023-05-16 03:15:48] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 03:15:48] DEBUG: Collapsed 3 edges
[2023-05-16 03:15:48] DEBUG: *	19	+disjointig_1	91498	612411	520913	
[2023-05-16 03:15:48] DEBUG:  	4	+disjointig_1	612411	617825	5414	
[2023-05-16 03:15:48] DEBUG: *	10	+disjointig_1	617825	1354882	737057	
[2023-05-16 03:15:48] DEBUG:  	12	+disjointig_1	1354882	1383671	28789	
[2023-05-16 03:15:48] DEBUG: *	-13	+disjointig_1	1383671	1661038	277367	
[2023-05-16 03:15:48] DEBUG:  	12	+disjointig_1	1661038	1689822	28784	
[2023-05-16 03:15:48] DEBUG: *	-11	+disjointig_1	1689822	3220286	1530464	
[2023-05-16 03:15:48] DEBUG:  	-4	+disjointig_1	3220286	3225683	5397	
[2023-05-16 03:15:48] DEBUG: *	6	+disjointig_1	3225683	3934891	709208	
[2023-05-16 03:15:48] DEBUG:  	-4	+disjointig_1	3934891	3940023	5132	
[2023-05-16 03:15:48] DEBUG: *	7	+disjointig_1	3940023	3972909	32886	
[2023-05-16 03:15:48] DEBUG:  	-4	+disjointig_1	3972909	3978219	5310	
[2023-05-16 03:15:48] DEBUG: *	8	+disjointig_1	3978219	4103369	125150	
[2023-05-16 03:15:48] DEBUG:  	-4	+disjointig_1	4103369	4108685	5316	
[2023-05-16 03:15:48] DEBUG: *	9	+disjointig_1	4108685	4193387	84702	
[2023-05-16 03:15:48] DEBUG:  	-4	+disjointig_1	4193387	4198537	5150	
[2023-05-16 03:15:48] DEBUG: *	5	+disjointig_1	4198537	4684469	485932	
[2023-05-16 03:15:48] DEBUG:  	4	+disjointig_1	4684469	4689761	5292	
[2023-05-16 03:15:48] DEBUG: *	18	+disjointig_1	4689761	4843327	153566	
[2023-05-16 03:15:48] DEBUG:  	20	+disjointig_2	61224	190410	129186	
[2023-05-16 03:15:48] DEBUG:  	21	+disjointig_3	6269	42863	36594	
[2023-05-16 03:15:48] DEBUG: Total edges: 21
[2023-05-16 03:15:48] INFO: Parsing reads
[2023-05-16 03:15:51] DEBUG: Building positional index
[2023-05-16 03:15:51] DEBUG: Total sequence: 278879070 bp
[2023-05-16 03:15:51] DEBUG: Building positional index
[2023-05-16 03:15:51] DEBUG: Total sequence: 4917609 bp
[2023-05-16 03:15:51] INFO: Aligning reads to the graph
[2023-05-16 03:15:52] DEBUG: Mean k-mer frequency: 1.03575
[2023-05-16 03:15:52] DEBUG: Repetitive k-mer frequency: 103
[2023-05-16 03:15:52] DEBUG: Filtered 262 repetitive k-mers (5.32822e-05)
[2023-05-16 03:15:53] DEBUG: Sorting k-mer index
[2023-05-16 03:15:53] DEBUG: Selected k-mers: 4747508
[2023-05-16 03:15:53] DEBUG: K-mer index size: 4916956
[2023-05-16 03:15:53] DEBUG: Mean k-mer frequency: 1.03569
[2023-05-16 03:15:53] DEBUG: Minimizer rate: 1.00013
[2023-05-16 03:16:14] DEBUG: Total reads : 17950
[2023-05-16 03:16:14] DEBUG: Read with aligned parts : 17458
[2023-05-16 03:16:14] DEBUG: Aligned in one piece : 17383
[2023-05-16 03:16:14] INFO: Aligned read sequence: 243823657 / 251729388 (0.968594)
[2023-05-16 03:16:14] INFO: Median overlap divergence: 0.0275118
[2023-05-16 03:16:14] DEBUG: Sequence divergence distribution: 

    |    *                                             |                                                 
    |    *                                             |                                                 
    |    *                                             |                                                 
    |    *                                             |                                                 
    |    **                                            |                                                 
    |    **                                            |                                                 
    |    **                                            |                                                 
    |    **                                            |                                                 
    |   ***                                            |                                                 
    |   ****                                           |                                                 
    |   ****                                           |                                                 
    |   ****                                           |                                                 
    |   *****                                          |                                                 
    |   *****                                          |                                                 
    |   *****                                          |                                                 
    |   ******                                         |                                                 
    |   *******                                        |                                                 
    |   ********                                       |                                                 
    |   **********                                     |                                                 
    |  ******************** *********************************  *    *                                    
    ----------------------------------------------------------------------------------------------------
    0%        5%        10%       15%       20%       25%       30%       35%       40%       45%       

    Q25 = 0.022, Q50 = 0.028, Q75 = 0.037

[2023-05-16 03:16:14] INFO: Mean edge coverage: 51
[2023-05-16 03:16:14] DEBUG: 4	len:5287	cov:398	mult:7.80392
[2023-05-16 03:16:14] DEBUG: -4	len:5287	cov:398	mult:7.80392
[2023-05-16 03:16:14] DEBUG: 5	len:485932	cov:56	mult:1.09804
[2023-05-16 03:16:14] DEBUG: -5	len:485932	cov:56	mult:1.09804
[2023-05-16 03:16:14] DEBUG: 6	len:709208	cov:49	mult:0.960784
[2023-05-16 03:16:14] DEBUG: -6	len:709208	cov:49	mult:0.960784
[2023-05-16 03:16:14] DEBUG: 7	len:32886	cov:60	mult:1.17647
[2023-05-16 03:16:14] DEBUG: -7	len:32886	cov:60	mult:1.17647
[2023-05-16 03:16:14] DEBUG: 8	len:125150	cov:55	mult:1.07843
[2023-05-16 03:16:14] DEBUG: -8	len:125150	cov:55	mult:1.07843
[2023-05-16 03:16:14] DEBUG: 9	len:84702	cov:51	mult:1
[2023-05-16 03:16:14] DEBUG: -9	len:84702	cov:51	mult:1
[2023-05-16 03:16:14] DEBUG: 10	len:737057	cov:48	mult:0.941176
[2023-05-16 03:16:14] DEBUG: -10	len:737057	cov:48	mult:0.941176
[2023-05-16 03:16:14] DEBUG: 11	len:1530464	cov:47	mult:0.921569
[2023-05-16 03:16:14] DEBUG: -11	len:1530464	cov:47	mult:0.921569
[2023-05-16 03:16:14] DEBUG: 12	len:28786	cov:101	mult:1.98039
[2023-05-16 03:16:14] DEBUG: -12	len:28786	cov:101	mult:1.98039
[2023-05-16 03:16:14] DEBUG: 13	len:277367	cov:46	mult:0.901961
[2023-05-16 03:16:14] DEBUG: -13	len:277367	cov:46	mult:0.901961
[2023-05-16 03:16:14] DEBUG: 18	len:153566	cov:51	mult:1
[2023-05-16 03:16:14] DEBUG: -18	len:153566	cov:51	mult:1
[2023-05-16 03:16:14] DEBUG: 19	len:520913	cov:50	mult:0.980392
[2023-05-16 03:16:14] DEBUG: -19	len:520913	cov:50	mult:0.980392
[2023-05-16 03:16:14] DEBUG: 20	len:64593	cov:112	mult:2.19608
[2023-05-16 03:16:14] DEBUG: -20	len:64593	cov:112	mult:2.19608
[2023-05-16 03:16:14] DEBUG: 21	len:18297	cov:109	mult:2.13725
[2023-05-16 03:16:14] DEBUG: -21	len:18297	cov:109	mult:2.13725
[2023-05-16 03:16:14] DEBUG: Unique coverage threshold 96
[2023-05-16 03:16:14] INFO: Simplifying the graph
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: [SIMPL] Removed 0 paths with low coverage
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 03:16:14] DEBUG: Finding repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: High-cov: 4	5287	398
[2023-05-16 03:16:14] DEBUG: High-cov: 12	28786	101
[2023-05-16 03:16:14] DEBUG: High-cov: 20	64593	112
[2023-05-16 03:16:14] DEBUG: High-cov: 21	18297	109
[2023-05-16 03:16:14] DEBUG: Repeat detection iteration 1
[2023-05-16 03:16:14] DEBUG: Writing Dot
[2023-05-16 03:16:14] DEBUG: Writing FASTA
[2023-05-16 03:16:14] DEBUG: [SIMPL] == Iteration 1 ==
[2023-05-16 03:16:14] DEBUG: Splitting nodes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Split 0 nodes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Clipped 0 short and 0 long tips
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 03:16:14] DEBUG: Finding repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: High-cov: 4	5287	398
[2023-05-16 03:16:14] DEBUG: High-cov: 12	28786	101
[2023-05-16 03:16:14] DEBUG: High-cov: 20	64593	112
[2023-05-16 03:16:14] DEBUG: High-cov: 21	18297	109
[2023-05-16 03:16:14] DEBUG: Repeat detection iteration 1
[2023-05-16 03:16:14] DEBUG: Total unique edges: 10
[2023-05-16 03:16:14] DEBUG: 	Connection -13	-11	7	1
[2023-05-16 03:16:14] DEBUG: 	Connection 13	-10	3	1
[2023-05-16 03:16:14] DEBUG: 	Connection 8	9	33	1
[2023-05-16 03:16:14] DEBUG: 	Connection -11	6	29	1
[2023-05-16 03:16:14] DEBUG: 	Connection 19	10	25	1
[2023-05-16 03:16:14] DEBUG: 	Connection 9	5	23	1
[2023-05-16 03:16:14] DEBUG: 	Connection -7	-6	31	1
[2023-05-16 03:16:14] DEBUG: 	Connection -18	-5	29	1
[2023-05-16 03:16:14] DEBUG: 	Connection -8	-7	37	1
[2023-05-16 03:16:14] DEBUG: [SIMPL] Resolved repeats: 9
[2023-05-16 03:16:14] DEBUG: RR links: 434
[2023-05-16 03:16:14] DEBUG: Unresolved: 0
[2023-05-16 03:16:14] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 03:16:14] DEBUG: [SIMPL] == Iteration 2 ==
[2023-05-16 03:16:14] DEBUG: Splitting nodes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Split 0 nodes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Clipped 0 short and 0 long tips
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 heterozygous loops
[2023-05-16 03:16:14] DEBUG: [SIMPL] Masked 0 simple bubbles
[2023-05-16 03:16:14] DEBUG: Finding repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: High-cov: 20	64593	112
[2023-05-16 03:16:14] DEBUG: High-cov: 21	18297	109
[2023-05-16 03:16:14] DEBUG: Repeat detection iteration 1
[2023-05-16 03:16:14] DEBUG: Total unique edges: 19
[2023-05-16 03:16:14] DEBUG: [SIMPL] Resolved repeats: 0
[2023-05-16 03:16:14] DEBUG: RR links: 0
[2023-05-16 03:16:14] DEBUG: Unresolved: 0
[2023-05-16 03:16:14] DEBUG: Removed 0 simple and 0 double chimeric junctions
[2023-05-16 03:16:14] DEBUG: [SIMPL] Collapsed 0 haplotypes
[2023-05-16 03:16:14] DEBUG: [SIMPL] Resolved 0 simple repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: [SIMPL] Removed 0 paths with low coverage
[2023-05-16 03:16:14] DEBUG: Finding repeats
[2023-05-16 03:16:14] DEBUG: Read coverage cutoff: 10
[2023-05-16 03:16:14] DEBUG: High-cov: 20	64593	112
[2023-05-16 03:16:14] DEBUG: High-cov: 21	18297	109
[2023-05-16 03:16:14] DEBUG: Repeat detection iteration 1
[2023-05-16 03:16:14] DEBUG: Writing Dot
[2023-05-16 03:16:14] DEBUG: Writing FASTA
[2023-05-16 03:16:14] DEBUG: Peak RAM usage: 0 Gb
-----------End assembly log------------
[2023-05-16 03:16:14] root: INFO: >>>STAGE: contigger
[2023-05-16 03:16:14] root: INFO: Generating contigs
[2023-05-16 03:16:14] root: DEBUG: -----Begin contigger analyser log------
[2023-05-16 03:16:14] root: DEBUG: Running: flye-modules contigger --graph-edges /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/20-repeat/repeat_graph_edges.fasta --reads /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/final_filtered_long_reads.fastq.gz --out-dir /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/30-contigger --config /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --repeat-graph /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/20-repeat/repeat_graph_dump --graph-aln /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/20-repeat/read_alignment_dump --log /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/flye.log --threads 8 --min-ovlp 4000
[2023-05-16 03:16:14] DEBUG: Build date: May 13 2023 06:31:26
[2023-05-16 03:16:14] DEBUG: Total RAM: 62 Gb
[2023-05-16 03:16:14] DEBUG: Available RAM: 58 Gb
[2023-05-16 03:16:14] DEBUG: Total CPUs: 16
[2023-05-16 03:16:14] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-16 03:16:14] DEBUG: Loading /data/plassembler_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/plassembler_simulation_benchmarking/workflow/conda/54abddd56e3fba19b3e8f87a90e31246_/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-16 03:16:14] DEBUG: 	big_genome_threshold=29000000
[2023-05-16 03:16:14] DEBUG: 	meta_read_filter_kmer_freq=100
[2023-05-16 03:16:14] DEBUG: 	chain_large_gap_penalty=2
[2023-05-16 03:16:14] DEBUG: 	chain_small_gap_penalty=0.5
[2023-05-16 03:16:14] DEBUG: 	chain_gap_jump_threshold=100
[2023-05-16 03:16:14] DEBUG: 	max_coverage_drop_rate=5
[2023-05-16 03:16:14] DEBUG: 	max_extensions_drop_rate=5
[2023-05-16 03:16:14] DEBUG: 	chimera_window=100
[2023-05-16 03:16:14] DEBUG: 	chimera_overhang=1000
[2023-05-16 03:16:14] DEBUG: 	min_reads_in_disjointig=4
[2023-05-16 03:16:14] DEBUG: 	max_inner_reads=10
[2023-05-16 03:16:14] DEBUG: 	max_inner_fraction=0.25
[2023-05-16 03:16:14] DEBUG: 	max_separation=500
[2023-05-16 03:16:14] DEBUG: 	unique_edge_length=50000
[2023-05-16 03:16:14] DEBUG: 	min_repeat_res_support=0.51
[2023-05-16 03:16:14] DEBUG: 	out_paths_ratio=5
[2023-05-16 03:16:14] DEBUG: 	graph_cov_drop_rate=5
[2023-05-16 03:16:14] DEBUG: 	coverage_estimate_window=100
[2023-05-16 03:16:14] DEBUG: 	max_bubble_length=50000
[2023-05-16 03:16:14] DEBUG: 	loop_coverage_rate=1.5
[2023-05-16 03:16:14] DEBUG: 	repeat_edge_cov_mult=1.75
[2023-05-16 03:16:14] DEBUG: 	weak_detach_rate=5
[2023-05-16 03:16:14] DEBUG: 	tip_coverage_rate=2
[2023-05-16 03:16:14] DEBUG: 	tip_length_rate=2
[2023-05-16 03:16:14] DEBUG: 	output_gfa_before_rr=0
[2023-05-16 03:16:14] DEBUG: 	remove_alt_edges=0
[2023-05-16 03:16:14] DEBUG: 	low_cutoff_warning=1
[2023-05-16 03:16:14] DEBUG: 	kmer_size=17
[2023-05-16 03:16:14] DEBUG: 	use_minimizers=0
[2023-05-16 03:16:14] DEBUG: 	reads_base_alignment=0
[2023-05-16 03:16:14] DEBUG: 	meta_read_top_kmer_rate=0.40
[2023-05-16 03:16:14] DEBUG: 	maximum_jump=1500
[2023-05-16 03:16:14] DEBUG: 	maximum_overhang=1500
[2023-05-16 03:16:14] DEBUG: 	repeat_kmer_rate=100
[2023-05-16 03:16:14] DEBUG: 	assemble_ovlp_divergence=0.10
[2023-05-16 03:16:14] DEBUG: 	assemble_divergence_relative=1
[2023-05-16 03:16:14] DEBUG: 	repeat_graph_ovlp_divergence=0.08
[2023-05-16 03:16:14] DEBUG: 	read_align_ovlp_divergence=0.25
[2023-05-16 03:16:14] DEBUG: 	hpc_scoring_on=0
[2023-05-16 03:16:14] DEBUG: 	add_unassembled_reads=0
[2023-05-16 03:16:14] DEBUG: 	extend_contigs_with_repeats=0
[2023-05-16 03:16:14] DEBUG: 	min_read_cov_cutoff=3
[2023-05-16 03:16:14] DEBUG: 	short_tip_length=20000
[2023-05-16 03:16:14] DEBUG: 	long_tip_length=100000
[2023-05-16 03:16:14] DEBUG: Running with k-mer size: 17
[2023-05-16 03:16:14] DEBUG: Selected minimum overlap 4000
[2023-05-16 03:16:14] INFO: Reading sequences
[2023-05-16 03:16:17] DEBUG: Building positional index
[2023-05-16 03:16:17] DEBUG: Total sequence: 278879070 bp
[2023-05-16 03:16:17] DEBUG: Flipped 0
[2023-05-16 03:16:17] DEBUG: UPath 1: -29 -> 18 -> 19 -> -26 -> 10 -> 23 -> -13 -> 22 -> -11 -> 25 -> 6 -> -28 -> 7 -> 30 -> 8 -> -24 -> 9 -> 27 -> 5
[2023-05-16 03:16:17] DEBUG: UPath 2: 20
[2023-05-16 03:16:17] DEBUG: UPath 3: 21
[2023-05-16 03:16:17] DEBUG: Final graph contains 3 egdes
[2023-05-16 03:16:17] DEBUG: Extending contigs into repeats
[2023-05-16 03:16:17] DEBUG: Covered 0 repetitive contigs
[2023-05-16 03:16:17] INFO: Generated 3 contigs
[2023-05-16 03:16:17] DEBUG: Writing FASTA
[2023-05-16 03:16:17] DEBUG: Generating scaffold connections
[2023-05-16 03:16:17] INFO: Added 0 scaffold connections
[2023-05-16 03:16:17] DEBUG: Writing Dot
[2023-05-16 03:16:17] DEBUG: Writing FASTA
[2023-05-16 03:16:17] DEBUG: Writing Gfa
[2023-05-16 03:16:17] DEBUG: Peak RAM usage: 0 Gb
-----------End assembly log------------
[2023-05-16 03:16:17] root: INFO: >>>STAGE: polishing
[2023-05-16 03:16:17] root: INFO: Polishing genome (1/1)
[2023-05-16 03:16:17] root: INFO: Running minimap2
[2023-05-16 03:16:37] root: INFO: Separating alignment into bubbles
[2023-05-16 03:17:58] root: DEBUG: Generated 328466 bubbles
[2023-05-16 03:17:58] root: DEBUG: Split 5 long bubbles
[2023-05-16 03:17:58] root: DEBUG: Skipped 1 empty bubbles
[2023-05-16 03:17:58] root: DEBUG: Skipped 1 bubbles with long branches
[2023-05-16 03:17:58] root: INFO: Alignment error rate: 0.064162
[2023-05-16 03:17:58] root: INFO: Correcting bubbles
[2023-05-16 03:18:35] root: DEBUG: Mean contig coverage: 56, selected threshold: 11
[2023-05-16 03:18:35] root: DEBUG: Filtered 0 contigs of total length 0
[2023-05-16 03:18:35] root: DEBUG: Generating polished GFA
[2023-05-16 03:18:38] root: DEBUG: 0 sequences remained unpolished
[2023-05-16 03:18:38] root: INFO: >>>STAGE: finalize
[2023-05-16 03:18:38] root: DEBUG: ---Output dir contents:----
[2023-05-16 03:18:38] root: DEBUG: Citrobacter_koseri_MINF/
[2023-05-16 03:18:38] root: DEBUG:     1.0 K       Citrobacter_koseri_MINF_05162023_030414.log
[2023-05-16 03:18:38] root: DEBUG:     4.0 M       assembly.fasta
[2023-05-16 03:18:38] root: DEBUG:     487.0 B     assembly_graph.gv
[2023-05-16 03:18:38] root: DEBUG:     92.0 B      params.json
[2023-05-16 03:18:38] root: DEBUG:     4.0 M       assembly_graph.gfa
[2023-05-16 03:18:38] root: DEBUG:     36.0 K      flye.log
[2023-05-16 03:18:38] root: DEBUG:     279.0 M     final_filtered_long_reads.fastq.gz
[2023-05-16 03:18:38] root: DEBUG:     279.0 M     chopper_long_reads.fastq.gz
[2023-05-16 03:18:38] root: DEBUG:     40-polishing/
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       polished_edges.gfa
[2023-05-16 03:18:38] root: DEBUG:         84.0 B      filtered_contigs.fasta.fai
[2023-05-16 03:18:38] root: DEBUG:         1.0 K       minimap.stderr
[2023-05-16 03:18:38] root: DEBUG:         84.0 B      contigs_stats.txt
[2023-05-16 03:18:38] root: DEBUG:         2.0 K       edges_aln.bam.bai
[2023-05-16 03:18:38] root: DEBUG:         1.0 M       base_coverage.bed.gz
[2023-05-16 03:18:38] root: DEBUG:         84.0 B      filtered_stats.txt
[2023-05-16 03:18:38] root: DEBUG:         120.0 K     minimap_1.bam.bai
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       filtered_contigs.fasta
[2023-05-16 03:18:38] root: DEBUG:     30-contigger/
[2023-05-16 03:18:38] root: DEBUG:         0.0 B       scaffolds_links.txt
[2023-05-16 03:18:38] root: DEBUG:         487.0 B     graph_final.gv
[2023-05-16 03:18:38] root: DEBUG:         180.0 B     contigs_stats.txt
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       graph_final.fasta
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       graph_final.gfa
[2023-05-16 03:18:38] root: DEBUG:         84.0 B      contigs.fasta.fai
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       contigs.fasta
[2023-05-16 03:18:38] root: DEBUG:     10-consensus/
[2023-05-16 03:18:38] root: DEBUG:         1.0 K       minimap.stderr
[2023-05-16 03:18:38] root: DEBUG:         128.0 K     minimap.bam.bai
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       consensus.fasta
[2023-05-16 03:18:38] root: DEBUG:     00-assembly/
[2023-05-16 03:18:38] root: DEBUG:         97.0 B      draft_assembly.fasta.fai
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       draft_assembly.fasta
[2023-05-16 03:18:38] root: DEBUG:     20-repeat/
[2023-05-16 03:18:38] root: DEBUG:         4.0 K       repeat_graph_dump
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       repeat_graph_edges.fasta
[2023-05-16 03:18:38] root: DEBUG:         1.0 K       graph_before_rr.gv
[2023-05-16 03:18:38] root: DEBUG:         2.0 K       graph_after_rr.gv
[2023-05-16 03:18:38] root: DEBUG:         4.0 M       graph_before_rr.fasta
[2023-05-16 03:18:38] root: DEBUG:         5.0 M       read_alignment_dump
[2023-05-16 03:18:38] root: DEBUG: --------------------------
[2023-05-16 03:18:38] root: INFO: Assembly statistics:

	Total length:	4840225
	Fragments:	3
	Fragments N50:	4757419
	Largest frg:	4757419
	Scaffolds:	0
	Mean coverage:	56

[2023-05-16 03:18:38] root: INFO: Final assembly: /data/plassembler_benchmarking/real_benchmarking_deterministic/REAL/PLASSEMBLER_OUTPUT_8_THREADS/Citrobacter_koseri_MINF/assembly.fasta
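
Since this thread is about duplicate runs giving different output, one way to use a log like the one above is to find the first message where two runs disagree. The sketch below is not part of Flye. The helper names `normalize` and `first_divergence` are made up for illustration, and it assumes every log line starts with the `[YYYY-MM-DD HH:MM:SS]` stamp seen above. It strips that stamp and reports the first line whose remaining text differs.

```python
import re
from itertools import zip_longest

# Leading "[YYYY-MM-DD HH:MM:SS] " stamp, as it appears in the flye.log lines above.
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\] ")


def normalize(lines):
    """Drop the timestamp and trailing whitespace so only the message is compared."""
    return [TIMESTAMP.sub("", line.rstrip()) for line in lines]


def first_divergence(log_a, log_b):
    """Return (line_number, text_a, text_b) for the first differing line, or None.

    If one log is shorter, the missing side is reported as None.
    """
    pairs = zip_longest(normalize(log_a), normalize(log_b))
    for number, (text_a, text_b) in enumerate(pairs, start=1):
        if text_a != text_b:
            return number, text_a, text_b
    return None


# Hypothetical usage (the file names are placeholders):
#   with open("run1/flye.log") as fa, open("run2/flye.log") as fb:
#       print(first_divergence(fa.readlines(), fb.readlines()))
```

Lines that embed absolute paths, such as the `Running: flye-modules ...` commands, will differ whenever the two runs use different output directories, so those hits may need to be skipped by hand before the first real divergence shows up.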

gbouras13, May 16 '23 04:05