Kraken2 build fails

Open sachinharle opened this issue 10 months ago • 6 comments

I get the following error when building a Kraken2 database:

rsync: link_stat "/all/GCF/037/914/965/GCF_037914965.1_ASM3791496v1/GCF_037914965.1_ASM3791496v1_genomic.fna.gz" (in genomes) failed: No such file or directory (2)

The exact file 'GCF_037914965.1_ASM3791496v1_genomic.fna.gz' does not exist at https://ftp.ncbi.nlm.nih.gov/genomes//all/GCF/037/914/965/GCF_037914965.1_ASM3791496v1/

Please help me out here

sachinharle avatar Apr 19 '24 04:04 sachinharle

I'm in the same boat. I'm trying to troubleshoot now to see if I can come up with a fix, as no one has responded yet. Of course, it looks like your request was put in on Friday, so maybe they'll get back to us today.

DeaconOfBiology avatar Apr 22 '24 14:04 DeaconOfBiology

Interesting, this unfortunately happens when NCBI includes a link in their files that does not connect to an actual file. Kraken2 just uses the data provided by NCBI to determine which filepaths to download. I do not have a solution except to suggest downloading the full standard pre-built database here: https://benlangmead.github.io/aws-indexes/k2
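
To make the failure mode concrete, here is a minimal shell sketch of the path pattern visible in the error above: the assembly directory, followed by its last path component with a `_genomic.fna.gz` suffix. The function name `expected_genomic_path` is hypothetical, and this only illustrates the pattern seen in the error message; it is not taken from Kraken2's code.

```shell
# Hypothetical helper: derive the genomic FASTA path that would be
# requested for a given NCBI assembly directory (an assumption based on
# the path in the rsync error, not on Kraken2's source).
expected_genomic_path() {
  dir="${1%/}"        # drop a trailing slash, if any
  name="${dir##*/}"   # last path component, e.g. GCF_..._ASM...
  printf '%s/%s_genomic.fna.gz\n' "$dir" "$name"
}

expected_genomic_path "/all/GCF/037/914/965/GCF_037914965.1_ASM3791496v1"
```

If NCBI lists an assembly directory but the file with that derived name is absent, a request for this path fails as in the rsync message above.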

jenniferlu717 avatar Apr 22 '24 14:04 jenniferlu717

Thank you for the reply. I used the prebuilt database as suggested, from https://benlangmead.github.io/aws-indexes/k2

When building the Bracken database with the following command:

bracken-build -d /media/fgl/Data/Databases/kraken2/k2_standard_20240112 -t 96 -k 35 -l 76

it gives this error:

ERROR: Database taxonomy /media/fgl/Data/Databases/kraken2/k2_standard_20240112/taxonomy/nodes.dmp does not exist

Where can I get or generate the nodes.dmp file?

sachinharle avatar Apr 23 '24 05:04 sachinharle

I found the solution to my question about getting or generating the nodes.dmp file. I renamed the ktaxonomy.tsv file that ships with the prebuilt database k2_standard_20240112 to nodes.dmp and put it in the taxonomy folder, and it worked. Thanks again.
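
For anyone repeating this workaround, it can be sketched as a small shell function. This is only an illustration of the step described above: the name `add_nodes_dmp` is hypothetical, and it copies the file rather than renaming it so that the original ktaxonomy.tsv stays in place. Whether the resulting nodes.dmp is suitable for every tool is not confirmed here; it simply mirrors what was reported to work.

```shell
# Hypothetical helper mirroring the workaround above: place a copy of
# ktaxonomy.tsv at taxonomy/nodes.dmp inside the database directory.
add_nodes_dmp() {
  db="${1%/}"
  if [ ! -f "$db/ktaxonomy.tsv" ]; then
    echo "ktaxonomy.tsv not found in $db" >&2
    return 1
  fi
  mkdir -p "$db/taxonomy" || return 1
  # Copy instead of renaming so the original file is left untouched.
  cp "$db/ktaxonomy.tsv" "$db/taxonomy/nodes.dmp"
}

# Usage (the path is a placeholder):
#   add_nodes_dmp /path/to/k2_standard_20240112
```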

sachinharle avatar Apr 23 '24 06:04 sachinharle

Hi @sachinharle,

Could you please share what you have inside your database folder? I tried to do it like you did, but I am having issues.

Here is mine:

database100mers.kmer_distrib
database150mers.kmer_distrib
database200mers.kmer_distrib
database250mers.kmer_distrib
database300mers.kmer_distrib
database50mers.kmer_distrib
database75mers.kmer_distrib
hash.k2d
inspect.txt
k2_standard_08gb_20240605.tar.gz
library_report.tsv
opts.k2d
seqid2taxid.map
standard08gb.md5
taxo.k2d
taxonomy
unmapped_accessions.txt

And inside taxonomy:

nodes.dmp

But when I run kraken2-build, I get the following:

./kraken2-build --build --db db_prebuilt/ -t 6
Can't find library/ subdirectory in database directory, exiting.

Thank you very much!

Best,
Rodolfo

rbtoscan avatar Jul 17 '24 11:07 rbtoscan

Hi @rbtoscan, I'm no expert, but my folder is structured as follows:

k2_standard_20240112
├── database100mers.kmer_distrib
├── database150mers.kmer_distrib
├── database200mers.kmer_distrib
├── database250mers.kmer_distrib
├── database300mers.kmer_distrib
├── database50mers.kmer_distrib
├── database75mers.kmer_distrib
├── database76mers.kmer_distrib
├── database76mers.kraken
├── database.kraken
├── hash.k2d
├── inspect.txt
├── ktaxonomy.tsv
├── library
│   ├── ktaxonomy.tsv
│   └── library_report.tsv
├── library_report.tsv
├── opts.k2d
├── seqid2taxid.map
├── taxo.k2d
├── taxonomy
│   ├── ktaxonomy.tsv
│   └── nodes.dmp
└── unmapped_accessions.txt

Try to replicate the same structure.
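
A quick way to compare a database folder against this layout is a small shell check. This is a minimal sketch: the function name `missing_db_entries` is hypothetical, and it only tests for the paths named in the error messages in this thread (library/ and taxonomy/nodes.dmp), not for the contents of those files.

```shell
# Hypothetical helper: print the entries from the layout above that are
# missing in a database directory. Prints nothing if all are present.
missing_db_entries() {
  db="${1%/}"
  [ -d "$db/library" ] || echo "library/"
  [ -d "$db/taxonomy" ] || echo "taxonomy/"
  [ -f "$db/taxonomy/nodes.dmp" ] || echo "taxonomy/nodes.dmp"
  return 0
}

# Usage (the path is a placeholder):
#   missing_db_entries db_prebuilt/
```

For a folder like the one listed by @rbtoscan, which has taxonomy/nodes.dmp but no library folder, this would print only `library/`.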

sachinharle avatar Jul 18 '24 06:07 sachinharle