Method ena-ascp ena-ftp failed

Open ShailNair opened this issue 2 years ago • 8 comments

Hi, I would like to try the Kingfisher package with a bunch of ENA and SRA files. I installed Aspera Connect and Kingfisher. When I run it, I get the following error:

python ~/bin/kingfisher get --run-identifiers-list acession_ids-ena.txt \
    --download-methods ena-ftp \
    --download-threads 30 \
    --extraction-threads 30 \

-> [OptionHandlerImpl.cc:184] errorCode=1 max-connection-per-server must be between 1 and 16.
Usage:
 -x, --max-connection-per-server=N Maximum number of connections to a single server per download.
     Possible values: 1-16
     Default: 1
     Tags: #basic, #http, #ftp
04/27/2022 11:10:23 AM WARNING: Method ena-ftp failed, error was Command 'aria2c -x30 -o ERR1755873_1.fastq.gz 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR175/003/ERR1755873/ERR1755873_1.fastq.gz'' returned non-zero exit status 28.
04/27/2022 11:10:23 AM WARNING: Method ena-ftp failed

If I select ena-ascp, I get the following error:

WARNING: Error downloading from ENA with ASCP: Command ascp -T -l 300m -P33001 -k 2 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh [email protected]:/vol1/fastq/ERR175/003/ERR1755873/ERR1755873_1.fastq.gz . returned non-zero exit status 127.
STDERR was: b'bash: ascp: \xe6\x9c\xaa\xe6\x89\xbe\xe5\x88\xb0\xe5\x91\xbd\xe4\xbb\xa4\n'
STDOUT was: b''
04/27/2022 11:20:52 AM WARNING: Method ena-ascp failed

Also, how do I provide multiple download methods? I tried -m ena-ascp,ena-ftp,prefetch and it throws:

error: argument -m/--download_methods/--download-methods: invalid choice: error

ShailNair avatar Apr 27 '22 03:04 ShailNair

Looks like you can only specify up to 16 threads for aria2c, but you asked for 30.

Does running ascp work for you outside Kingfisher? Sometimes it doesn't get added to the $PATH.

To try multiple methods, separate them with spaces rather than commas, like this:

-m ena-ascp ena-ftp prefetch

I hadn't realised there was a 16 thread limit on aria2c.
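The three fixes above (cap the download threads at 16, check that ascp is on the PATH, and pass methods as separate arguments) could be checked before launching a long run. The following is only a minimal sketch, not part of Kingfisher: the function name `build_kingfisher_args` and its `which` parameter are hypothetical, and it assumes `kingfisher` itself is on the PATH. It conservatively caps the thread count for every method, even though the limit reported above comes from aria2c. An exit status of 127, as in the ena-ascp error, is what the shell returns when a command cannot be found.

```python
import shutil

# aria2c rejects -x values outside 1-16, as shown in the error output above.
ARIA2C_MAX_CONNECTIONS = 16


def build_kingfisher_args(accession_file, methods, download_threads,
                          extraction_threads, which=shutil.which):
    """Return an argv list for `kingfisher get` (hypothetical helper).

    `which` is injectable so the PATH lookup can be replaced in tests.
    """
    usable = []
    for method in methods:
        # ena-ascp needs the ascp binary; skip it if it is not on the PATH.
        if method == "ena-ascp" and which("ascp") is None:
            continue
        usable.append(method)
    if not usable:
        raise ValueError("no usable download methods")

    # Clamp to the 1-16 range accepted by aria2c.
    threads = max(1, min(download_threads, ARIA2C_MAX_CONNECTIONS))

    # Each method is its own argv element, i.e. space-separated on a shell.
    return [
        "kingfisher", "get",
        "--run-identifiers-list", accession_file,
        "--download-methods", *usable,
        "--download-threads", str(threads),
        "--extraction-threads", str(extraction_threads),
    ]
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.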

wwood avatar Apr 27 '22 05:04 wwood

Thanks, Dr. Ben. Indeed, the issues were the thread number and the ascp bin directory not being added to the PATH. It is working now.

Thanks for the awesome package and quick reply.

ShailNair avatar Apr 27 '22 06:04 ShailNair

Hi,

The command ran well, but after downloading about half of the ENA IDs it threw the following error:

Attempting download method ena-ascp for run ERR1879685 ..
04/29/2022 05:22:50 AM INFO: Using aspera ssh key file: $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh
04/29/2022 05:22:50 AM INFO: Querying ENA for FTP paths for ERR1879685..
Traceback (most recent call last):
  File "/home/mcs/soft/kingfisher/bin/kingfisher", line 290, in <module>
    main()
  File "/home/mcs/soft/kingfisher/bin/kingfisher", line 240, in main
    kingfisher.download_and_extract(
  File "/home/mcs/soft/kingfisher/bin/../kingfisher/__init__.py", line 46, in download_and_extract
    download_and_extract_one_run(run, **kwargs)
  File "/home/mcs/soft/kingfisher/bin/../kingfisher/__init__.py", line 275, in download_and_extract_one_run
    result = EnaDownloader().download_with_aspera(run_identifier, '.',
  File "/home/mcs/soft/kingfisher/bin/../kingfisher/ena.py", line 61, in download_with_aspera
    ftp_urls = self.get_ftp_download_urls(run_id)
  File "/home/mcs/soft/kingfisher/bin/../kingfisher/ena.py", line 20, in get_ftp_download_urls
    text = extern.run("curl --silent '{}'".format(query_url))
  File "/home/mcs/miniconda3/envs/kingfisher/lib/python3.9/site-packages/extern/__init__.py", line 41, in run
    raise ExternCalledProcessError(process, command)
extern.ExternCalledProcessError: Command curl --silent 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=ERR1879685&result=read_run&fields=fastq_ftp' returned non-zero exit status 6.
STDERR was: b''
STDOUT was: b''

I checked the FTP path for that specific accession ID and had no problem downloading the files manually.

Also, is there a way to resume the command once a task fails?

Thank you.

ShailNair avatar Apr 29 '22 06:04 ShailNair

@wwood is there a way to resume the command operation once a task fails?

ShailNair avatar May 03 '22 11:05 ShailNair

Hi, there's no standardised way that downloads can be resumed. In most cases it could be implemented, I just haven't gotten around to it yet.

Are you able to reproduce the error that you worked around by downloading manually? Or was it just something transient?
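Until resuming is supported, one workaround is to rebuild the accession list so that a rerun only covers runs with no output yet. The sketch below is an illustration under stated assumptions, not a Kingfisher feature: the helper name `remaining_runs` is hypothetical, and it assumes output files start with the run accession followed by `_` or `.` (as in `ERR1755873_1.fastq.gz` above). It also treats a leftover `.aria2` control file, which aria2c keeps next to an unfinished download, as a sign that the run is incomplete. It does not verify that a finished-looking file is intact.

```python
from pathlib import Path


def remaining_runs(accession_file, output_dir="."):
    """List accessions from accession_file that have no finished output.

    Hypothetical helper: relies on file naming only, not on checksums.
    """
    names = [p.name for p in Path(output_dir).iterdir() if p.is_file()]
    remaining = []
    for line in Path(accession_file).read_text().splitlines():
        run = line.strip()
        if not run:
            continue  # skip blank lines in the accession list
        outputs = [n for n in names
                   if n.startswith(run + "_") or n.startswith(run + ".")]
        # An .aria2 control file means aria2c had not finished that file.
        incomplete = any(n.endswith(".aria2") for n in outputs)
        if not outputs or incomplete:
            remaining.append(run)
    return remaining
```

Writing the returned accessions to a new file, one per line, gives a list that can be passed to `--run-identifiers-list` for the rerun.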

wwood avatar May 04 '22 03:05 wwood

I could download the said file (ERR1879685) using wget without any error. Maybe it was a network issue. I will re-run Kingfisher with the remaining ENA (including the one that showed the error) and see if I can get the same error. Thanks

ShailNair avatar May 04 '22 07:05 ShailNair

The second time I was able to download the said ENA files; however, I came across a similar error with other ENA accessions. I guess it's due to a network issue. I had 2000 accession IDs, which I divided into 4-5 parts before running Kingfisher.

ShailNair avatar May 11 '22 05:05 ShailNair

OK, thanks for letting me know. It might be possible to retry when network things go astray like you describe, but I don't think coding that in would be particularly helpful. I'd instead suggest using multiple methods so that you switch networks when one fails, e.g. with -m ena-ftp aws-cp aws-http prefetch. Of course, only ENA (currently) provides the fastq directly and the others require the extraction step.
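For anyone who still wants retries, they can be added outside Kingfisher by wrapping the whole command. curl's exit status 6 in the traceback earlier in this thread means it could not resolve the host, which fits a transient network or DNS problem. The wrapper below is a generic sketch, not something Kingfisher provides: `run_with_retries` and its parameters are hypothetical names, and the attempt count and delays are arbitrary defaults.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay=1.0, sleep=time.sleep):
    """Run cmd (an argv list), retrying with exponential backoff on failure.

    Hypothetical helper; `sleep` is injectable so tests need not wait.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode == 0:
            return result
        if attempt < attempts:
            # Wait delay, 2*delay, 4*delay, ... between attempts.
            sleep(delay * 2 ** (attempt - 1))
    # Every attempt failed: surface the last exit status and output.
    raise subprocess.CalledProcessError(
        result.returncode, cmd, result.stdout, result.stderr)
```

A retry of the full command repeats work for runs that already finished, so it pairs best with a trimmed accession list and with the multiple-method fallback suggested above.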

wwood avatar May 12 '22 00:05 wwood