
more NCBI download errors

Open alpapan opened this issue 2 years ago • 1 comments

krakenuniq-download  --db $DBDIR/plants --threads 10 --min-seq-len 10000 --dust refseq/plant && krakenuniq-build --db $DBDIR/plants --threads 10 --kmer-len 31 --taxids-for-genomes --taxids-for-sequences
Downloading assembly summary file for plant genomes, and filtering to assembly level Complete_Genome.
Error fetching ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/plant/assembly_summary.txt. Is curl installed?

Yes, curl is installed and works fine, so that error message is misleading.

$ curl --version
curl 7.68.0 (x86_64-pc-linux-gnu) libcurl/7.68.0 OpenSSL/1.1.1f zlib/1.2.11 brotli/1.0.7 libidn2/2.2.0 libpsl/0.21.0 (+libidn2/2.2.0) libssh/0.9.3/openssl/zlib nghttp2/1.40.0 librtmp/2.3
Release-Date: 2020-01-08
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS brotli GSS-API HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets

The actual problem is on the NCBI side: the FTP server refuses the directory change for the standard RefSeq path.

curl ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/plant/assembly_summary.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
curl: (9) Server denied you to change to the given directory

The same file under the .vol2 path downloads fine:

curl ftp://ftp.ncbi.nlm.nih.gov/genomes/.vol2/refseq/plant/assembly_summary.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 46602  100 46602    0     0  10569      0  0:00:04  0:00:04 --:--:-- 10569
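
To compare several candidate URLs without reading the progress meter each time, the two manual checks above can be wrapped in a small helper. This is only a sketch: the function name probe_url and the 60-second timeout are illustrative choices, not part of krakenuniq. It reports curl's own exit code, which is 9 for the "Server denied you to change to the given directory" case shown above.

```shell
#!/bin/sh
# Probe each URL given on the command line and report curl's exit code.
# probe_url is a hypothetical helper name; the timeout is an arbitrary choice.
probe_url() {
    url=$1
    # --silent hides the progress meter, --show-error keeps the error text,
    # --fail turns HTTP errors (e.g. 404) into a non-zero exit code.
    if curl --silent --show-error --fail --max-time 60 --output /dev/null "$url"; then
        echo "OK   $url"
    else
        rc=$?   # exit status of the curl call in the condition above
        echo "FAIL $url (curl exit code $rc)"
        return "$rc"
    fi
}

# Probe every URL passed as an argument; keep going after a failure.
for url in "$@"; do
    probe_url "$url" || true
done
```

Saved as, say, probe.sh, it could be called with both paths from this report: sh probe.sh ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/plant/assembly_summary.txt ftp://ftp.ncbi.nlm.nih.gov/genomes/.vol2/refseq/plant/assembly_summary.txt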


alpapan, Nov 28 '21 03:11

Hi, I had the same problem. It works if you change "ftp://ftp.ncbi.nlm.nih.gov" to "https://ftp.ncbi.nlm.nih.gov" in the krakenuniq-download script.

Depending on which database you're interested in, you may need to change it in several places:

  • line 40: my $FTP="https://ftp.ncbi.nih.gov";
  • but also lines 697 and 1152, and possibly more. I didn't use the script for every possible database and didn't replace everything automatically. Example: my $url = "https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/$_.accession2taxid.gz";
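
For anyone who prefers to replace every occurrence in one pass rather than editing line by line, something like the following should do it. It is a sketch under assumptions: the function name patch_ncbi_urls is made up for illustration, and it assumes the script spells the host either as ftp.ncbi.nlm.nih.gov or ftp.ncbi.nih.gov (both forms appear in the lines quoted above). Other ftp:// URLs are left untouched, and the original file is kept with a .orig suffix.

```shell
#!/bin/sh
# Rewrite NCBI ftp:// URLs to https:// in a script, keeping a backup copy.
# patch_ncbi_urls is a hypothetical helper name, not part of krakenuniq.
patch_ncbi_urls() {
    script=$1
    # Keep the original next to the patched file (-p preserves mode and times).
    cp -p "$script" "$script.orig"
    # Assumption: only these two host spellings need rewriting.
    # Writing back into the existing file keeps its executable bit.
    sed -e 's#ftp://ftp\.ncbi\.nlm\.nih\.gov#https://ftp.ncbi.nlm.nih.gov#g' \
        -e 's#ftp://ftp\.ncbi\.nih\.gov#https://ftp.ncbi.nih.gov#g' \
        "$script.orig" > "$script"
}

# Patch the file given as the first argument, if any.
if [ "$#" -gt 0 ]; then
    patch_ncbi_urls "$1"
fi
```

Running grep -n 'ftp://' on the patched krakenuniq-download afterwards shows whether any FTP URLs remain that the two patterns did not cover.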

clescoat, Apr 29 '22 09:04