
Canu 2.1/tip crashes with more than 4096 input files

Open ardy20 opened this issue 3 years ago • 14 comments

Hi, I tried to install Canu 2.1.1 as instructed below:

    git clone https://github.com/marbl/canu.git
    cd canu/src
    make -j 2

But I get the error: Makefile:35: *** git '/usr/bin/git' version '1.8.3.1' too old; at least version 2.12 is required. Stop.

What might the problem be? Regards

ardy20 avatar Mar 03 '21 03:03 ardy20

git 1.8, as I recall, doesn't support submodules correctly. Canu uses this support to download three additional bits of source code from other projects.

All the required source code is included in canu-2.1.1.tar.xz from https://github.com/marbl/canu/releases/tag/v2.1.1. That shouldn't need a newer version of git, but I'm not sure if it's ever been tested.

The two downloads called "Source code" there also do not include all the code; this is a git/github problem we cannot fix.
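
To check up front whether the installed git meets the 2.12 minimum named in the Makefile error, a minimal sketch along these lines may help. The helper name `version_ge` is made up for illustration, and the comparison assumes a `sort` that supports `-V` (GNU coreutils does); it is not part of Canu.

```shell
#!/usr/bin/env bash
# Succeeds if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) being available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# "git version 1.8.3.1" -> "1.8.3.1"
installed="$(git --version 2>/dev/null | cut -d' ' -f3)"

if [ -n "$installed" ] && version_ge "$installed" 2.12; then
    echo "git $installed meets the 2.12 minimum"
else
    echo "git '${installed:-not found}' is below 2.12; consider canu-2.1.1.tar.xz or the pre-compiled binaries"
fi
```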

brianwalenz avatar Mar 03 '21 04:03 brianwalenz

Thanks. Where can I find earlier versions? Regards

ardy20 avatar Mar 03 '21 04:03 ardy20

I downloaded canu-2.1.1.tar.xz and I still get the same git error.

ardy20 avatar Mar 03 '21 04:03 ardy20

I was afraid of that. Two options:

  1. upgrade git
  2. in src/Makefile, comment out the first five lines, or just the $(error git ... version .. too old ...) line:
gitv := $(shell git --version | cut -d\  -f 3 | cut -c 1)
ifeq (1, $(gitv))
  gitv := $(shell git --version | cut -d\  -f 3)
  $(error git '$(shell which git)' version '$(gitv)' too old; at least version 2.12 is required)
endif

brianwalenz avatar Mar 03 '21 04:03 brianwalenz

Hi, I tried commenting out all five lines, and also just the $(error git ...) line, but I still get errors: Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

fatal: Not a git repository (or any parent up to mount point /gpfs1) Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

fatal: Not a git repository (or any parent up to mount point /gpfs1) Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

Makefile:707: meryl/src/meryl/meryl.mk: No such file or directory
Failed to open 'utility/src/utility/version.H.new' for writing: No such file or directory
Building release v2.1.1
For 'Linux' '3.10.0-693.5.2.el7.x86_64' as 'amd64' into '/gpfs1/scratch/30days/canu-2.1.1/build/{bin,obj}'.
Using '/usr/bin/g++' version '4.8.5'.

Makefile:889: *** missing `endif'. Stop.

ardy20 avatar Mar 03 '21 04:03 ardy20

Where can I find canu-2.1.1.tar.gz?

ardy20 avatar Mar 03 '21 04:03 ardy20

Did you download the tar.xz file rather than cloning when editing the Makefile? I confirmed that if I download the tar.xz file into a new folder from the releases page @brianwalenz linked and comment out the lines checking the git version:

#gitv := $(shell git --version | cut -d\  -f 3 | cut -c 1)
#ifeq (1, $(gitv))
#  gitv := $(shell git --version | cut -d\  -f 3)
#  $(error git '$(shell which git)' version '$(gitv)' too old; at least version 2.12 is required)
#endif

it will build w/o error. Make sure the extracted tar.xz file has the utilities/src/Makefile folder.

You could also try the pre-compiled binaries from the same release page which should work across common systems without having to compile first.
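
The Makefile edit described above can also be scripted. This is only a sketch: the function name `comment_out_git_check` and the example path are hypothetical, and it assumes the check begins at a line starting with `gitv := ` and ends at the first following `endif`, as in the snippet quoted in this thread.

```shell
#!/usr/bin/env bash
# Prefix every line of the git version check in a Makefile with '#'.
# Assumption: the check runs from the first line starting "gitv := "
# through the first following line starting "endif".
comment_out_git_check() {
    local makefile="$1"
    cp "$makefile" "$makefile.orig"    # keep a backup of the unedited file
    sed '/^gitv := /,/^endif/ s/^/#/' "$makefile.orig" > "$makefile"
}

# Hypothetical usage, from the directory where the tar.xz was extracted:
#   comment_out_git_check canu-2.1.1/src/Makefile
```

The original file is kept next to the edited one as `Makefile.orig`, so the change is easy to undo.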

skoren avatar Mar 03 '21 20:03 skoren

Any update? Were you able to build or use the pre-compiled binary?

skoren avatar Mar 15 '21 16:03 skoren

Hi, unfortunately I could not get Canu 2.1 to run. It is installed as a module on our HPC system, and I loaded it with module load canu/2.1, but I got the error below. I searched the Canu GitHub and found a similar report. Following that comment, I decided to assemble with version 1.9, which is currently running, because I thought the failure might be due to the poor quality of my ONT reads (N50 4.5 kb and 20-30X coverage). For your perusal, the error message is below:

-- BEGIN CORRECTION
--
----------------------------------------
-- Starting command on Mon Mar 15 08:13:13 2021 with 41683.875 GB free disk space

    cd .
    ./canu.seqStore.sh \
    > ./canu.seqStore.err 2>&1

-- Finished on Mon Mar 15 08:30:44 2021 (1051 seconds) with 41679.873 GB free disk space
----------------------------------------

ERROR:
ERROR:  Failed with exit code 1.  (rc=256)
ERROR:

ABORT:
ABORT: canu 2.1
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:
ABORT:   sqStoreCreate failed; boom!.
ABORT:
ABORT: Disk space available:  41679.873 GB
ABORT:
ABORT: Last 50 lines of the relevant log file (./canu.seqStore.err):
ABORT:
ABORT:   ---------- --------- ------ ------------ ------
ABORT:   Loaded          2346  58.6%      9419671  90.9%  /gpfs1/scratch/30days/canu2.1/fastq_pass/PAF31885_pass_3567afb4_4683.fastq
ABORT:   Short           1654  41.4%       941631   9.1%
ABORT:
ABORT:
ABORT:   Creating library 'PAF31885_pass_3567afb4_4684' for Nanopore raw reads.
ABORT:
ABORT:                  reads               bases
ABORT:   ---------- --------- ------ ------------ ------
ABORT:   Loaded          2419  60.5%      9706550  91.6%  /gpfs1/scratch/30days/canu2.1/fastq_pass/PAF31885_pass_3567afb4_4684.fastq
ABORT:   Short           1581  39.5%       888493   8.4%
ABORT:
ABORT:
ABORT:   Creating library 'PAF31885_pass_3567afb4_4685' for Nanopore raw reads.
ABORT:
ABORT:                  reads               bases
ABORT:   ---------- --------- ------ ------------ ------
ABORT:   Loaded          2400  60.0%      9273597  90.9%  /gpfs1/scratch/30days/canu2.1/fastq_pass/PAF31885_pass_3567afb4_4685.fastq
ABORT:   Short           1600  40.0%       928616   9.1%
ABORT:
ABORT:
ABORT:   Creating library 'PAF31885_pass_3567afb4_4686' for Nanopore raw reads.
ABORT:
ABORT:                  reads               bases
ABORT:   ---------- --------- ------ ------------ ------
ABORT:   Loaded          2375  59.4%      9310647  91.1%  /gpfs1/scratch/30days/canu2.1/fastq_pass/PAF31885_pass_3567afb4_4686.fastq
ABORT:   Short           1625  40.6%       912252   8.9%
ABORT:
ABORT:
ABORT:   Creating library 'PAF31885_pass_3567afb4_4687' for Nanopore raw reads.
ABORT:
ABORT:                  reads               bases
ABORT:   ---------- --------- ------ ------------ ------
ABORT:   sqStoreCreate: stores/sqRead.H:263: void sqReadMeta::sqReadMeta_initialize(uint32, uint32): Assertion `_libraryID == libraryID' failed.
ABORT:
ABORT:   Failed with 'Aborted'; backtrace (libbacktrace):
ABORT:   utility/src/utility/system-stackTrace.C::83 in _Z17AS_UTL_catchCrashiP7siginfoPv()
ABORT:   (null)::0 in (null)()
ABORT:   (null)::0 in (null)()
ABORT:   (null)::0 in (null)()
ABORT:   (null)::0 in (null)()
ABORT:   (null)::0 in (null)()
ABORT:   stores/sqRead.H::263 in _ZN10sqReadMeta21sqReadMeta_initializeEjj()
ABORT:   stores/sqStore.C::211 in _ZN7sqStore20sqStore_addEmptyReadEP9sqLibraryPKc()
ABORT:   stores/sqStoreCreate.C::254 in _Z9loadReadsP7sqStoreP9sqLibraryjjP8_IO_FILES4_PcR9loadStats()
ABORT:   stores/sqStoreCreate.C::345 in _Z11createStorePKcRSt6vectorI6seqLibSaIS2_EEj()
ABORT:   stores/sqStoreCreate.C::662 in main()
ABORT:   (null)::0 in (null)()
ABORT:   (null)::0 in (null)()
ABORT:   ./canu.seqStore.sh: line 5706: 37833 Aborted 

ardy20 avatar Mar 15 '21 23:03 ardy20

I would not recommend using 1.9; it is quite old and will produce lower-quality assemblies than 2.1.1. It is OK to use 2.1, but it has some bugs that have since been fixed in 2.1.1. I'd either use the precompiled binaries or ask your IT to update the package.

As for the error, I expect you have too many input files. Each file becomes its own library, and the number of libraries is limited to 2^12, or 4096. I confirmed this crash happens on the 4096th file:

Creating library 'file_4094' for Nanopore raw reads.

               reads               bases
---------- --------- ------ ------------ ------
Loaded             0   0.0%            0   0.0%  /vf/users/korens/test/bug1901/unitigging/5-consensus/tmp/file_4094.fasta
Short              1 100.0%           12 100.0%


Creating library 'file_4095' for Nanopore raw reads.

               reads               bases
---------- --------- ------ ------------ ------
Loaded             0   0.0%            0   0.0%  /vf/users/korens/test/bug1901/unitigging/5-consensus/tmp/file_4095.fasta
Short              1 100.0%           12 100.0%


Creating library 'file_4096' for Nanopore raw reads.

               reads               bases
---------- --------- ------ ------------ ------
sqStoreCreate: stores/sqRead.H:289: void sqReadMeta::sqReadMeta_initialize(uint32, uint32): Assertion `_libraryID == libraryID' failed.

This wasn't an issue in 1.9, which is why it ran. If you have more than 4096 files, I'd suggest consolidating them into fewer files and then restarting with 2.1.1.
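
One possible way to consolidate the inputs is sketched below. The function name `consolidate_fastq`, the `batch_N.fastq` output names, and the default of 16 output files are all made up for illustration. It assumes each input file ends with a newline, so plain `cat` yields valid FASTQ, and that the output directory is new and separate from the input directory.

```shell
#!/usr/bin/env bash
# Concatenate every *.fastq file in directory $1 into at most $3 files
# in directory $2, spreading the inputs round-robin across the outputs.
# Prints the number of input files that were merged.
consolidate_fastq() {
    local in_dir="$1" out_dir="$2" n_out="${3:-16}"
    local i=0 f
    mkdir -p "$out_dir"
    for f in "$in_dir"/*.fastq; do
        [ -e "$f" ] || continue    # no matches: the glob stays literal
        cat "$f" >> "$out_dir/batch_$(( i % n_out )).fastq"
        i=$(( i + 1 ))
    done
    echo "$i"
}

# Hypothetical usage:
#   consolidate_fastq fastq_pass merged 16
```

Because the outputs are appended to, running it twice into the same directory would duplicate reads; start from an empty output directory each time.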

skoren avatar Mar 16 '21 00:03 skoren

Hi, thanks for the advice. I concatenated the files and it works well.

ardy20 avatar Mar 16 '21 02:03 ardy20

Hi, one question about running Canu 2.1.1: do I need to point the path of Canu to /src in the qsub file (submission file)? Regards

ardy20 avatar Apr 06 '21 03:04 ardy20

No, src is uncompiled code. You would run canu from either bin/canu or build/bin/canu, depending on whether you downloaded the pre-built binaries or the source code (see the releases page). If you don't have a build or bin folder, you first need to compile canu or download the pre-compiled binaries instead, which is generally easier.
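
In a submission script, the lookup described above might be sketched like this. The function name `find_canu` and the install path in the usage comment are hypothetical; only the two relative locations, bin/canu and build/bin/canu, come from this thread.

```shell
#!/usr/bin/env bash
# Print the path of the canu executable under install directory $1.
# Checks bin/canu (pre-built binaries) first, then build/bin/canu
# (compiled from source). Returns non-zero if neither is executable.
find_canu() {
    local root="$1" candidate
    for candidate in "$root/bin/canu" "$root/build/bin/canu"; do
        if [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no canu executable under $root; compile it or download the pre-compiled binaries" >&2
    return 1
}

# Hypothetical usage inside a qsub script:
#   CANU="$(find_canu /path/to/canu-2.1.1)" || exit 1
```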

skoren avatar Apr 06 '21 11:04 skoren

Thanks

ardy20 avatar Apr 07 '21 01:04 ardy20