kraken2
No preliminary seqid/taxid mapping files found, aborting.
When I build the nr database, I get this error:
$ kraken2-build --build --db ~/software/metadb
Creating sequence ID to taxonomy ID map (step 1)...
No preliminary seqid/taxid mapping files found, aborting.
I don't know how to solve this problem.
How did you download/build the database?
With this command: kraken2-build --download-taxonomy --threads 8 --protein --db ~/software/metadb/ --use-ftp
I deleted the original directory and re-downloaded and now encountered this error:
Untarring taxonomy tree data...
gzip: stdin: invalid compressed data--format violated
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
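These gzip/tar messages usually point to a truncated or corrupted download of taxdump.tar.gz. Before re-running the untar step, a small check like the one below can tell you whether the archive is readable end to end. It is only a sketch: `archive_is_intact` is a hypothetical helper name, and it tests readability of the archive members rather than comparing against a published checksum.

```python
import tarfile
import zlib


def archive_is_intact(path):
    """Return True if every regular file in a .tar.gz archive can be read fully."""
    try:
        with tarfile.open(path, mode="r:gz") as tar:
            for member in tar:
                if member.isfile():
                    handle = tar.extractfile(member)
                    # Read in 1 MiB chunks so large members stay out of memory.
                    while handle.read(1 << 20):
                        pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        # Truncated or corrupt gzip/tar data, or a missing file.
        return False
```

For example, `archive_is_intact("taxonomy/taxdump.tar.gz")` returning False would suggest downloading the file again before retrying.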
I then downloaded taxdump.tar.gz directly from NCBI to replace the copy that kraken2 had downloaded, and ran kraken2-build --download-taxonomy --threads 8 --protein --db ~/software/metadb/ --use-ftp again. It showed:
Untarring taxonomy tree data... done.
Then I moved the nr database that I had downloaded earlier into library/nr. The directory structure is now:
├── library
│   └── nr
│       └── nr.gz
└── taxonomy
    ├── accmap.dlflag
    ├── citations.dmp
    ├── delnodes.dmp
    ├── division.dmp
    ├── gc.prt
    ├── gencode.dmp
    ├── merged.dmp
    ├── names.dmp
    ├── nodes.dmp
    ├── prelim_map.txt
    ├── prot.accession2taxid
    ├── readme.txt
    ├── taxdump.dlflag
    ├── taxdump.tar.gz
    ├── taxdump.untarflag
But the error still occurs:
$ kraken2-build --build --db ~/software/metadb
Creating sequence ID to taxonomy ID map (step 1)...
No preliminary seqid/taxid mapping files found, aborting.
How did you download the nr.gz library file?
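The question matters because, as far as I can tell from the Kraken 2 build scripts, step 1 of --build gathers the prelim_map.txt files that --download-library and --add-to-library write under library/. A file copied into library/ by hand has no such map next to it. Treat that description as an assumption rather than documented behaviour. Under that assumption, here is a minimal sketch for checking a database directory; `find_prelim_maps` is a hypothetical helper name.

```python
from pathlib import Path


def find_prelim_maps(db_dir):
    """Return the non-empty prelim_map.txt files found under <db_dir>/library."""
    library = Path(db_dir) / "library"
    if not library.is_dir():
        return []
    return sorted(
        path
        for path in library.rglob("prelim_map.txt")
        if path.is_file() and path.stat().st_size > 0
    )
```

An empty result for your database directory would be consistent with the "No preliminary seqid/taxid mapping files found" message.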
Hi, I am stuck in the same error, but with the nt database. Both the nt and taxonomy files were downloaded as per below:
kraken2-build --use-ftp --download-taxonomy --db nt
kraken2-build --use-ftp --download-library nt --db nt --threads 12
The error I get after attempting to build the libraries is as follows:
Creating sequence ID to taxonomy ID map (step 1)...
No preliminary seqid/taxid mapping files found, aborting.
Any help will be very appreciated!
Hi, I am stuck in the same error as well... Did anybody find a solution?
I'm stuck too. Can anyone help?
@fjptwenger Hi, did you manage to solve this problem?
Same problem today with the nt DB. I downloaded the files manually again and they look the same.
accession  accession.version  taxid  gi
A00001     A00001.1           10641  58418
A00002     A00002.1           9913   2
A00003     A00003.1           9913   3
A00004     A00004.1           32630  57971
A00005     A00005.1           32630  57972
A00006     A00006.1           32630  57973
A00008     A00008.1           32630  57974
A00009     A00009.1           32630  57975
A00010     A00010.1           32630  57976
A00011     A00011.1           32630  57977
A00012     A00012.1           32630  57978
A00013     A00013.1           32630  57979
A00014     A00014.1           32630  57980
A00015     A00015.1           32630  57981
A00016     A00016.1           32630  57982
If there is a solution, please let me know. Thanks
Any solution? I need to solve this problem
Not from my side, I am still waiting for answers. I tried to treat the nt fasta file as new data and add it to the db but it complained about something else.
Same here when trying to build nt: it downloaded nt.gz and the taxonomy into a separate folder and gives the same error!
Is there no solution?
Hi, is there any solution to this problem? When I run
kraken2-build --build --db database
it prints:
Creating sequence ID to taxonomy ID map (step 1)...
No preliminary seqid/taxid mapping files found, aborting.
To download the taxonomy I used:
kraken2-build --download-taxonomy --db database
and to download the library:
kraken2-build --download-library bacteria --db database
I solved this problem by using kaiju instead of kraken2.
Damn, just encountered the same issue. I guess this is a dead end :/
I also stopped at this problem. It seems that you need to convert the assembly_summary.txt files to the .fna format within the db/library folder. Or download manually.
Guys, I think I found the solution to the problem.
"Creating sequence ID to taxonomy ID map (step 1)... No preliminary seqid/taxid mapping files found, aborting."
Today I spent the whole afternoon on this and found that the problem is actually in the FASTA file headers. I noticed because my script that adds the IDs to the FASTA files was not adding them correctly, which I traced to problems in my CSV files. I then found that my other script, the one that creates the library for the database, was pointing to a different folder whose FASTA files did not have the correct headers. So make sure the headers of your FASTA files are correct.
This is the file as it was and why it was giving me the same error as you.
>Lonsdalea_populi__DSM_25466__KU531470
GGACGGGTGAGTAATGTCTGGGGATCTGCCCGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGTGGGGGACCTTAGGGCCTCACACCATCGGATGAACCCAGATGGGATTAGCTAGTAGGCGGGGTAAGAGCCCACCTAGGCGACGATCTCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGGGAAACCCTGATGCAGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGCAGTGAACTTAATACGTTCGCTGATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATGACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGAGCTTAACTTGGGAACTGCATTTGAAACTGGCAGGCTAGAGTCTTGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACAAAGACTGACGCTCAGGTGCGAAAGCGTGGGG
This is the file with the correct header that worked for me.
>kraken:taxid|1172565|Lonsdalea_populi__DSM_25466__KU531470
GGACGGGTGAGTAATGTCTGGGGATCTGCCCGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGTGGGGGACCTTAGGGCCTCACACCATCGGATGAACCCAGATGGGATTAGCTAGTAGGCGGGGTAAGAGCCCACCTAGGCGACGATCTCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGGGAAACCCTGATGCAGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGCAGTGAACTTAATACGTTCGCTGATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATGACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGAGCTTAACTTGGGAACTGCATTTGAAACTGGCAGGCTAGAGTCTTGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACAAAGACTGACGCTCAGGTGCGAAAGCGTGGGG
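For custom sequences like this one, the header change can be scripted. The sketch below is not the script mentioned above, which was never posted. It is a minimal illustration that assumes one known taxid per FASTA file, and `tag_fasta_headers` is a hypothetical helper name. It prepends `kraken:taxid|<taxid>|` to every header that does not already carry the tag, and leaves sequence lines untouched.

```python
def tag_fasta_headers(lines, taxid):
    """Yield FASTA lines with 'kraken:taxid|<taxid>|' prepended to untagged headers."""
    prefix = f"kraken:taxid|{taxid}|"
    for line in lines:
        if line.startswith(">") and not line.startswith(">kraken:taxid|"):
            # Header line: keep '>' first, then the tag, then the original ID.
            yield ">" + prefix + line[1:]
        else:
            # Sequence line or already-tagged header: pass through unchanged.
            yield line


# Header and taxid taken from the example above; the sequence is shortened here.
record = [">Lonsdalea_populi__DSM_25466__KU531470\n", "GGACGGGTGAGTAATG\n"]
print("".join(tag_fasta_headers(record, 1172565)), end="")
```

The tagged file can then be added with kraken2-build --add-to-library before running --build again.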