krakenuniq
more NCBI download errors
krakenuniq-download --db $DBDIR/plants --threads 10 --min-seq-len 10000 --dust refseq/plant && krakenuniq-build --db $DBDIR/plants --threads 10 --kmer-len 31 --taxids-for-genomes --taxids-for-sequences
Downloading assembly summary file for plant genomes, and filtering to assembly level Complete_Genome.
Error fetching ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/plant/assembly_summary.txt. Is curl installed?
Yes, curl is installed and works fine, so that error message is misleading.
$ curl --version
curl 7.68.0 (x86_64-pc-linux-gnu) libcurl/7.68.0 OpenSSL/1.1.1f zlib/1.2.11 brotli/1.0.7 libidn2/2.2.0 libpsl/0.21.0 (+libidn2/2.2.0) libssh/0.9.3/openssl/zlib nghttp2/1.40.0 librtmp/2.3
Release-Date: 2020-01-08
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS brotli GSS-API HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets
the issue is with NCBI:
curl ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/plant/assembly_summary.txt
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
curl: (9) Server denied you to change to the given directory
#this works fine
curl ftp://ftp.ncbi.nlm.nih.gov/genomes/.vol2/refseq/plant/assembly_summary.txt
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 46602 100 46602 0 0 10569 0 0:00:04 0:00:04 --:--:-- 10569
Hi, I was having the same problem, but if you change "ftp://ftp.ncbi.nlm.nih.gov" to "https://ftp.ncbi.nlm.nih.gov" in the krakenuniq-download script, it works.
Depending on which database you're interested in, you may need to change it in several places:
- line 40:
my $FTP="https://ftp.ncbi.nih.gov";
- BUT also lines 697, 1152, and maybe more. I didn't use the script for every possible database and didn't replace everything automatically.
For example:
my $url = "https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/$_.accession2taxid.gz";
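If you'd rather not edit each line by hand, something like the following might do the replacement in one pass. This is only a sketch: `patch_ncbi_urls` is a hypothetical helper name, not part of krakenuniq, and it assumes the only URLs that need changing are the `ftp://ftp.ncbi.nih.gov` and `ftp://ftp.ncbi.nlm.nih.gov` ones. It keeps a `.bak` copy of the original script.

```shell
# Hypothetical helper (not part of krakenuniq): rewrite NCBI ftp:// URLs
# to https:// in the given script, leaving a backup at <script>.bak.
patch_ncbi_urls() {
    script="$1"
    # Assumption: only these two host spellings appear in the script,
    # ftp.ncbi.nih.gov and ftp.ncbi.nlm.nih.gov. The optional group
    # (nlm\.) is copied into the replacement so the host is preserved.
    sed -i.bak -E 's#ftp://ftp\.ncbi\.(nlm\.)?nih\.gov#https://ftp.ncbi.\1nih.gov#g' "$script"
}

# Example usage, assuming krakenuniq-download is on your PATH and writable:
#   patch_ncbi_urls "$(command -v krakenuniq-download)"
#   grep -n 'ftp://' "$(command -v krakenuniq-download)"   # list anything left over
```

The trailing `grep` is there because other `ftp://` URLs may remain in the script that this pattern does not cover, and those would still need checking by hand.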