
krakenuniq-build error

Open Sheerlik opened this issue 1 year ago • 6 comments

Hi,

I received an error when attempting to perform krakenuniq-build on the refseq genomes.

This is the command I use: ./scripts/krakenuniq-build --db DBDIR-temp --kmer-len 31 --threads 10 --taxids-for-genomes --taxids-for-sequences --jellyfish-bin 1

This is the error message:

Kraken build set to minimize disk writes.
Finding all library files
Found 1 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Creating k-mer set (step 1 of 6)...
Using 1
/home/ubuntu/krakenuniq-1.0.4/scripts/build_db.sh: line 127: count_unique: command not found
xargs: cat: terminated by signal 13

(1.0.4 krakenuniq version)

Thank you! Sheerli

Sheerlik avatar Dec 05 '23 14:12 Sheerlik

./scripts/krakenuniq-build --db DBDIR-temp --kmer-len 31 --threads 10 --taxids-for-genomes --taxids-for-sequences --jellyfish-bin 1

If you specify --jellyfish-bin, it should be the path to the jellyfish v1 executable, not "1". I'm not sure if this is related to your error, since count_unique should be part of the krakenuniq install and is run prior to running jellyfish.

jvolkening avatar Dec 06 '23 05:12 jvolkening

Thank you, Jeremy, for your response!

We did that-

krakenuniq-1.0.4_2 ./krakenuniq-build --db ./DBDIRmicrobial-nt/ --kmer-len 31 --threads 50 --taxids-for-genomes --taxids-for-sequences --jellyfish-bin

and now received a different error-

Kraken build set to minimize disk writes.
Found 1 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Creating k-mer set (step 1 of 6)...
Using .
Hash size not specified, using '53187024'
/home/ubuntu/krakenuniq-1.0.4_2/build_db.sh: line 46: count: No such file or directory
xargs: cat: terminated by signal 13

Please advise. Thank you! Sheerli

Sheerlik avatar Dec 06 '23 08:12 Sheerlik

Hello Sheerli,

This is still due to having an empty value for --jellyfish-bin in your command. You need to have Jellyfish version 1 installed to use the KrakenUniq build commands. If it is installed and in your PATH already, you don't need to specify --jellyfish-bin on the command line. For instance, if you run jellyfish --version in a shell, do you get a version number or a 'No such file or directory' error? If the latter, you need to install jellyfish v.1 and then specify the path, e.g. --jellyfish-bin /path/to/jellyfish, substituting for the second part the actual path to the binary file.
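The check described above can be sketched as a small shell helper. This is only an illustration: the function names `jf_major_version` and `check_jellyfish` are hypothetical, and the sketch assumes that `jellyfish --version` prints a string containing a dotted version number such as `1.1.12`.

```shell
#!/bin/sh
# Hypothetical helpers, not part of KrakenUniq or Jellyfish.

# Extract the major version from a string such as "jellyfish 1.1.12".
# Assumption: the version appears as the first dotted number in the string.
jf_major_version() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\)\..*$/\1/p' | head -n 1
}

# Report whether a binary (default: "jellyfish" on PATH) looks like Jellyfish v1.
check_jellyfish() {
    bin="${1:-jellyfish}"
    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "not found: $bin"
        return 1
    fi
    major="$(jf_major_version "$("$bin" --version 2>&1)")"
    if [ "$major" = "1" ]; then
        echo "ok: $bin is Jellyfish v1"
    else
        echo "wrong version: $bin reports major version '${major:-unknown}'"
        return 1
    fi
}
```

Running `check_jellyfish` with no argument tests whatever `jellyfish` is on the PATH; running `check_jellyfish /path/to/jellyfish` tests a specific binary before it is passed to `--jellyfish-bin`.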

The easiest thing to do, in my opinion, is to install KrakenUniq through a package manager like Conda -- that will handle installing all of the dependencies for you.

jvolkening avatar Dec 06 '23 10:12 jvolkening

Thank you Jeremy for your prompt answer!

You are right, our jellyfish version was version 2 (although jellyfish was not installed separately from krakenuniq). We installed jellyfish-1.1.12.

We ran the command: ./krakenuniq-build --db ./DBDIRmicrobial-nt/ --kmer-len 31 --threads 30 --jellyfish-bin /home/ubuntu/krakenuniq-1.0.4_2/jellyfish-1.1.12/bin/jellyfish

and got this error:

Kraken build set to minimize disk writes.
Found 1 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Creating k-mer set (step 1 of 6)...
Using /home/ubuntu/krakenuniq-1.0.4_2/jellyfish-1.1.12/bin/jellyfish
Hash size not specified, using '53187024'
Can't merge hashes with different reprobing stratgies
K-mer set created. [14.016s]
Skipping step 2, no database reduction requested.
Sorting k-mer set (step 3 of 6)...
db_sort: Getting database into memory ...
db_sort: unable to open database.jdb: No such file or directory

Perhaps you would know how to solve this problem?

Thank you! Sheerli

Sheerlik avatar Dec 06 '23 13:12 Sheerlik

Can't merge hashes with different reprobing stratgies

This is a jellyfish error -- I've not encountered it before. Are you sure you're not running out of disk space for the temporary files?

Hash size not specified, using '53187024'

This seems quite small based on my experience; if your input database is large, I think this will result in a large number of temporary files. Our build process specifies the hash size explicitly, and you could try this to see if it makes any difference. For instance, on a machine with 128GB RAM we use --jellyfish-hash-size 15000000000, which seems to be about the max possible without running out of memory. For smaller or larger instances we adjust the value proportionally.
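As a rough sketch of the proportional adjustment mentioned above, the following helper scales the 128GB reference value linearly with available RAM. The function name `scaled_hash_size` is hypothetical, and strict linear scaling is an assumption taken from the rule of thumb in this comment, not a documented KrakenUniq formula.

```shell
#!/bin/sh
# Hypothetical helper: scale the hash size linearly from the reference point
# quoted in this thread (15000000000 on a machine with 128 GB of RAM).
# Assumes the shell supports 64-bit integer arithmetic.
scaled_hash_size() {
    ram_gb="$1"
    echo $(( 15000000000 * ram_gb / 128 ))
}

# Example: suggested value for a machine with 64 GB of RAM.
scaled_hash_size 64    # prints 7500000000
```

The result can then be passed on the command line, for example `--jellyfish-hash-size "$(scaled_hash_size 64)"`. Treat it as a starting point and lower it if the build runs out of memory.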

db_sort: unable to open database.jdb: No such file or directory

Almost certainly due to the previous jellyfish error, so the merged database was not written.

jvolkening avatar Dec 06 '23 14:12 jvolkening

Thank you, Jeremy! We changed the hash size and it looks like it's working. I will let you know if we encounter additional build problems.

Sheerlik avatar Dec 11 '23 08:12 Sheerlik