Unsure if this indicates a memory leak in `splitHaplotype`
I've been building our cloud-native pipelines around trioCanu, and I think this observation might be useful for the canu developers.
Shown below is the resource usage during `splitHaplotype`:

[resource-usage plot]
Note that the memory usage "drifted" higher from the first hump to the last. Is this expected? If not, I'm not sure how high a priority this is for the team (it's not a high priority for us), but I wanted to report it in case you're interested in looking into it.
I've attached the shell script, the out file, and the log below: Archive.zip
Nice! I think this is expected.
The kmer database is split into 64 pieces, which can be loaded in parallel, one per thread. There are two databases to load - 128 total pieces - so four humps are expected when 36 threads are available. The first two humps should be the first database, the second two the second. The growth is hopefully just the kmers being loaded. Reasonably expected.
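As a quick back-of-the-envelope check of that count (not code from canu, just the arithmetic above):

```cpp
//  Hump-count arithmetic from the explanation above: 64 pieces per database,
//  two databases, 36 threads available for loading.
#include <cstdio>

int main() {
  const int pieces    = 64;   //  pieces per kmer database
  const int databases = 2;    //  Father and Mother
  const int threads   = 36;   //  loader threads available

  int wavesPerDatabase = (pieces + threads - 1) / threads;   //  ceil(64/36) = 2

  std::printf("expected humps: %d\n", wavesPerDatabase * databases);   //  2 * 2 = 4
  return 0;
}
```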
The humps themselves are probably overhead in unpacking the compressed kmer data. I'm a little surprised at how big that is... but it seems to be released when done, so no damage.
Do you know what the first 16 minutes was doing? Just downloading data? 36 cores seems overkill here. I'd expect using 16 cores, or maybe even just 8, would be better. It looks like 20 minutes of loading data for a one minute compute. For full impact, I'd have to support computing with 8 threads, but loading all 64 pieces in parallel. Easy to do, just need another command line option (that I might not expose to canu though).
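Something like the sketch below is what I have in mind. The structure and the separate load/compute thread counts are only illustrative; this is not the actual splitHaplotype code, and that hypothetical option doesn't exist yet:

```cpp
//  Illustrative sketch: load the database pieces with as many threads as
//  there are pieces, then do the (short) compute with fewer threads.
//  The function names and thread counts are placeholders, not canu code.
#include <cstdint>
#include <cstdio>
#include <thread>
#include <vector>

void loadPiece(uint32_t id)    { std::printf("loading piece %u\n", id);   }   //  stand-in for unpacking one piece
void computeBatch(uint32_t id) { std::printf("computing batch %u\n", id); }   //  stand-in for assigning one read batch

//  Run 'jobs' tasks in waves of at most 'threads' parallel workers.
void runInWaves(uint32_t jobs, uint32_t threads, void (*work)(uint32_t)) {
  for (uint32_t wave = 0; wave < jobs; wave += threads) {
    std::vector<std::thread> pool;
    for (uint32_t j = wave; (j < jobs) && (j < wave + threads); j++)
      pool.emplace_back(work, j);
    for (auto &t : pool)
      t.join();
  }
}

int main() {
  uint32_t loadThreads    = 64;   //  hypothetical "load threads": one per piece, so a single wave
  uint32_t computeThreads = 8;    //  compute threads: plenty for a one-minute compute

  runInWaves(64, loadThreads,    loadPiece);      //  all pieces at once
  runInWaves(10, computeThreads, computeBatch);   //  read batches, 8 at a time
  return 0;
}
```

(Compile with `-pthread`.)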
@brianwalenz Sorry it took me so long to get back to you! (multiple projects...)
It's a relief to know this is expected!
> Do you know what the first 16 minutes was doing? Just downloading data?
No; all data had been localized and put in the right place before canu was called.
Below is the relevant part of the log file. (Please bear in mind that this is my personal fork of canu 1.9, with only the Perl workflow-control code changed.) You can see that the first minute is spent configuring the child long-read splitting job; after that, not much log output is available.
Wed Jan 22 16:13:12 UTC 2020
Limiting hap memory to: 29 G
-- Canu branch sh_fix_k_31 +25 changes (r9633 7e2fd9ae685e1c67848a973ef3ea7858d2ae1a0b)
--
-- CITATIONS
--
-- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
-- Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
-- Genome Res. 2017 May;27(5):722-736.
-- http://doi.org/10.1101/gr.215087.116
--
-- Koren S, Rhie A, Walenz BP, Dilthey AT, Bickhart DM, Kingan SB, Hiendleder S, Williams JL, Smith TPL, Phillippy AM.
-- De novo assembly of haplotype-resolved genomes with trio binning.
-- Nat Biotechnol. 2018
-- https://doi.org/10.1038/nbt.4277
--
-- Read and contig alignments during correction, consensus and GFA building use:
-- Šošic M, Šikic M.
-- Edlib: a C/C++ library for fast, exact sequence alignment using edit distance.
-- Bioinformatics. 2017 May 1;33(9):1394-1395.
-- http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
-- Berlin K, et al.
-- Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
-- Nat Biotechnol. 2015 Jun;33(6):623-30.
-- http://doi.org/10.1038/nbt.3238
--
-- Myers EW, et al.
-- A Whole-Genome Assembly of Drosophila.
-- Science. 2000 Mar 24;287(5461):2196-204.
-- http://doi.org/10.1126/science.287.5461.2196
--
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
-- Chin CS, et al.
-- Phased diploid genome assembly with single-molecule real-time sequencing.
-- Nat Methods. 2016 Dec;13(12):1050-1054.
-- http://doi.org/10.1038/nmeth.4035
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
-- Chin CS, et al.
-- Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
-- Nat Methods. 2013 Jun;10(6):563-9
-- http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_232' (from '/usr/local/openjdk-8/bin/java') with -d64 support.
--
-- WARNING:
-- WARNING: Failed to run gnuplot using command 'gnuplot'.
-- WARNING: Plots will be disabled.
-- WARNING:
--
-- Detected 36 CPUs and 32 gigabytes of memory.
-- No grid engine detected, grid disabled.
--
-- (tag)Concurrency
-- (tag)Threads |
-- (tag)Memory | |
-- (tag) | | | total usage algorithm
-- ------- ------ -------- -------- ----------------- -----------------------------
-- Local: hap 29 GB 36 CPUs x 1 job 29 GB 36 CPUs (read-to-haplotype assignment)
--
-- Found PacBio uncorrected reads in the input files.
--
-- Generating assembly 'giab' in '/cromwell_root/workdir'
--
-- Parameters:
--
-- genomeSize 3100000000
--
-- Overlap Generation Limits:
-- corOvlErrorRate 0.2400 ( 24.00%)
-- obtOvlErrorRate 0.0450 ( 4.50%)
-- utgOvlErrorRate 0.0450 ( 4.50%)
--
-- Overlap Processing Limits:
-- corErrorRate 0.3000 ( 30.00%)
-- obtErrorRate 0.0450 ( 4.50%)
-- utgErrorRate 0.0450 ( 4.50%)
-- cnsErrorRate 0.0750 ( 7.50%)
--
-- USING k = 21
--
--
--
-- BEGIN ASSIGNING CHILD LONG READS
--
--
-- Finished stage 'haplotypeReadsConfigure', reset canuIteration.
-- New report created.
--
-- Running jobs. First attempt out of 2.
----------------------------------------
-- Starting 'hap' concurrent execution on Wed Jan 22 16:14:30 2020 with 144.726 GB free disk space (1 processes; 1 concurrently)
cd haplotype
./splitHaplotype.sh 1 > ./splitHaplotype.000001.out 2>&1
-- Finished on Wed Jan 22 20:43:42 2020 (16152 seconds) with 118.17 GB free disk space
----------------------------------------
-- Found 1 haplotyping outputs.
-- No change in report.
-- Finished stage 'haplotype-reads', reset canuIteration.
Stop requested after 'haplotype'.
The out file generated from running the shell script is included below, but no timestamps are given, so I really don't know what's happening during the first 16 minutes.
Found perl:
/usr/bin/perl
This is perl 5, version 24, subversion 1 (v5.24.1) built for x86_64-linux-gnu-thread-multi
Found java:
/usr/local/openjdk-8/bin/java
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/cromwell_root/tmp.e0308cfc
Found canu:
/opt/canu/Linux-amd64/bin/canu
Canu branch sh_fix_k_31 +25 changes (r9633 7e2fd9ae685e1c67848a973ef3ea7858d2ae1a0b)
Running job 1 based on command line options.
--
-- Loading haplotype data, using up to 14 GB memory for each.
--
-- Haplotype './0-kmers/haplotype-Father.meryl':
-- use kmers with frequency at least 11.
p prefixes bits gigabytes (allowed: 14 GB)
-- -------------- ---------------- ---------
17 131072 5217764780 0.607
18 262144 5158499152 0.601
19 524288 5107622132 0.595
20 1048576 5073522328 0.591
21 2097152 5072976956 0.591 (smallest)
22 4194304 5139540448 0.598
23 8388608 5340321668 0.622
24 16777216 5809538344 0.676
25 33554432 6815625932 0.793
26 67108864 8895455344 1.036
27 134217728 13122768404 1.528
28 268435456 21645048760 2.520
29 536870912 38757263708 4.512
30 1073741824 73049347840 8.504 (used)
31 2147483648 141701170340 16.496
32 4294967296 279072469576 32.488
33 8589934592 553882722284 64.480
34 17179869184 1103570881936 128.473
-- -------------- ---------------- ---------
For 67654236 distinct 31-mers (with 30 bits used for indexing and 32 bits for tags):
8.504 GB memory
8.000 GB memory for index (1073741824 elements 64 bits wide)
0.252 GB memory for tags (67654236 elements 32 bits wide)
0.252 GB memory for data (67654236 elements 32 bits wide)
Will load 67654236 kmers. Skipping 12511147671 (too low) and 0 (too high) kmers.
Allocating space for 67654236 suffixes of 32 bits each -> 2164935552 bits (0.252 GB) in blocks of 32.000 MB
67654236 values of 32 bits each -> 2164935552 bits (0.252 GB) in blocks of 32.000 MB
Loaded 67654236 kmers. Skipped 12511147671 (too low) and 0 (too high) kmers.
-- loaded 67654236 kmers.
-- Haplotype './0-kmers/haplotype-Mother.meryl':
-- use kmers with frequency at least 13.
p prefixes bits gigabytes (allowed: 14 GB)
-- -------------- ---------------- ---------
16 65536 4434263272 0.516
17 131072 4381661820 0.510
18 262144 4333254672 0.504
19 524288 4293236132 0.500
20 1048576 4269994808 0.497 (smallest)
21 2097152 4280307916 0.498
22 4194304 4357729888 0.507
23 8388608 4569369588 0.532
24 16777216 5049444744 0.588
25 33554432 6066390812 0.706
26 67108864 8157078704 0.950
27 134217728 12395250244 1.443
28 268435456 20928389080 2.436
29 536870912 38051462508 4.430 (used)
30 1073741824 72354405120 8.423
31 2147483648 141017086100 16.417
32 4294967296 278399243816 32.410
33 8589934592 553220355004 64.403
-- -------------- ---------------- ---------
For 56795756 distinct 31-mers (with 29 bits used for indexing and 33 bits for tags):
4.430 GB memory
4.000 GB memory for index (536870912 elements 64 bits wide)
0.218 GB memory for tags (56795756 elements 33 bits wide)
0.212 GB memory for data (56795756 elements 32 bits wide)
Will load 56795756 kmers. Skipping 12911257945 (too low) and 0 (too high) kmers.
Allocating space for 56795756 suffixes of 33 bits each -> 1874259948 bits (0.218 GB) in blocks of 32.000 MB
56795756 values of 32 bits each -> 1817464192 bits (0.212 GB) in blocks of 32.000 MB
Loaded 56795756 kmers. Skipped 12911257945 (too low) and 0 (too high) kmers.
-- loaded 56795756 kmers.
-- Data loaded.
--
-- Processing reads in batches of 100 reads each.
--
-- Bye.
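As an aside, the memory-estimate tables in the out file look internally consistent. Each row appears to be index + tags + data, where the index has 2^p entries of 64 bits, the tags carry the (62 - p) suffix bits of each 31-mer, and the data is one 32-bit value per kmer. The small check below reproduces the Father table from that formula; the formula is inferred from the printed numbers, not taken from the meryl/splitHaplotype source:

```cpp
//  Reproduce the memory-estimate table for the Father database, using a
//  formula inferred from the printed numbers (not from the meryl source):
//    index = 2^p entries * 64 bits
//    tags  = n * (62 - p) bits   (suffix bits left after the p-bit prefix)
//    data  = n * 32 bits         (one value per kmer)
#include <cstdint>
#include <cstdio>

int main() {
  const uint64_t n = 67654236;   //  distinct 31-mers in haplotype-Father.meryl
  const uint32_t k = 31;         //  kmer size; each kmer is 2k = 62 bits

  for (uint32_t p = 17; p <= 34; p++) {
    uint64_t indexBits = ((uint64_t)1 << p) * 64;
    uint64_t tagBits   = n * (2 * k - p);
    uint64_t dataBits  = n * 32;
    uint64_t total     = indexBits + tagBits + dataBits;

    //  p = 30 gives 73049347840 bits = 8.504 GB, matching the "(used)" row above.
    std::printf("p=%2u  %15llu bits  %7.3f GB\n",
                p, (unsigned long long)total,
                total / 8.0 / 1024.0 / 1024.0 / 1024.0);
  }
  return 0;
}
```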