ncbi-genome-download

Connection Error, RemoteDisconnected

Open lisavader opened this issue 3 years ago • 15 comments

Hi,

When I try to download data for a relatively large number of genomes, e.g.:

ncbi-genome-download bacteria -t 562 -l complete -F assembly-report

I get the following error message:

ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))

I don't get this issue when downloading only one or a few genomes. Looking at similar issues, it seems that a connection error is usually due to the user's own connection and not caused by ncbi-genome-download. However, because the connection is closed by the remote end, I'm not sure.

If anyone could help me out that'd be greatly appreciated!

Best, Lisa

lisavader avatar Apr 08 '21 19:04 lisavader

Same problem (04/10/2021):

ncbi-genome-download --formats fasta bacteria --parallel 4
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_011742285.2'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815795.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815575.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_009498175.3'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815655.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815675.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815835.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017869345.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815595.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815615.1'
ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')))

ilasadar avatar Apr 10 '21 22:04 ilasadar

Similar problem:

ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', OSError(0, 'Error')))

Daikuang avatar Apr 12 '21 02:04 Daikuang

I've also been having the same issue for a week:

ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))

tantony3 avatar Apr 12 '21 12:04 tantony3

Hm, it looks like NCBI might have introduced some kind of connection limit. I'm not aware of any documentation on this from the NCBI side of things, and unlike with the Entrez API, there's not really a way to provide e.g. an API key to get a less strict rate limit. I'll see if I can reproduce this and debug it a bit further.

kblin avatar Apr 13 '21 06:04 kblin

Ok, looks like I'm getting the ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', OSError(0, 'Error')),) one myself here. I'll see if I can find out what's happening.

kblin avatar Apr 13 '21 07:04 kblin

Now I got the ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',)),) one. Unfortunately there's really not much to find out about this, because the connection is closed not at the HTTP GET request level but one level below that, so there's really no communication of what the issue is. I'm currently trying to add a rate limiting step to see if that fixes it, but this will slow down things considerably.
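For readers wondering what a client-side rate limiting step could look like, here is a minimal sketch. It is not the change tested in this thread and not code from ncbi-genome-download; the `RateLimiter` class and its parameters are hypothetical names chosen for illustration. It only spaces calls at least `min_interval` seconds apart.

```python
import time


class RateLimiter:
    """Allow at most one call per `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._last_call = None   # time of the previous call, if any

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `limiter.wait()` before each request with `RateLimiter(1.0)` would cap the client at roughly one request per second.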

kblin avatar Apr 13 '21 07:04 kblin

I met the same bug, and I am looking forward to your solution.

ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',)),)

wshuai294 avatar Apr 14 '21 06:04 wshuai294

Nope, still happens, even at just 1 request per second, it just takes longer to get there. As this already happens at the stage of downloading the checksum files, you can't even cache these and restart easily, so I'm also struggling to find a good workaround.
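One generic way to cope with dropped connections is to retry the failing request with exponential backoff. The sketch below is an assumption-laden illustration, not part of ncbi-genome-download: `fetch_with_retries` and its parameters are hypothetical, and whether retries actually get past NCBI's disconnects is not established in this thread. It catches `OSError`, which the built-in `ConnectionError` and `http.client.RemoteDisconnected` both derive from.

```python
import time


def fetch_with_retries(fetch, max_attempts=5, base_delay=1.0,
                       sleep=time.sleep, retry_on=(OSError,)):
    """Call `fetch()` until it succeeds or `max_attempts` is reached.

    `fetch` is any zero-argument callable that performs one download
    attempt (hypothetical; supply your own).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` argument is injectable only so the backoff schedule can be checked without waiting.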

kblin avatar Apr 14 '21 13:04 kblin

Having said that, I hear from a couple of colleagues that also other connections to the NCBI FTP servers die with the same issues, regardless of if the HTTPS protocol is being used (like for ncbi-genome-download) or if old-fashioned FTP is being used. So maybe there's just some networking issues at the NCBI side of things at the moment?

kblin avatar Apr 14 '21 13:04 kblin

Thank you for looking into the issue! Let's hope it's only a temporary NCBI connection problem.

lisavader avatar Apr 14 '21 15:04 lisavader

> Having said that, I hear from a couple of colleagues that also other connections to the NCBI FTP servers die with the same issues, regardless of if the HTTPS protocol is being used (like for ncbi-genome-download) or if old-fashioned FTP is being used. So maybe there's just some networking issues at the NCBI side of things at the moment?

I can attest that even pure FTP downloads are getting cut off more or less randomly, as of June 9th, 2021. It looks like NCBI introduced some sort of arbitrary cutoff for shutting down connections. One would wonder if they can't just communicate with the research community directly on what's needed...

naturepoker avatar Jun 09 '21 22:06 naturepoker

Hi, still an issue today

npsonis avatar Mar 16 '22 11:03 npsonis

This is on the NCBI side of things, though. Not much we can do about this on the client side.

kblin avatar Mar 17 '22 06:03 kblin

Same problem here. It would be nice to have a resume command. I'm not sure whether everything starts from scratch when we relaunch, but the ability to resume would be perfect.

evezeyl avatar Mar 27 '22 10:03 evezeyl

ncbi-genome-download doesn't re-download files that are correctly downloaded and current. But in order to check that, it does need to fetch all checksum files again on startup, and if you're downloading a lot of records that can also take a while.
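The idea of skipping files that are already correct can be illustrated with a short sketch. This is not the tool's actual implementation; the function names are hypothetical, and the checksum file layout (one `<md5>  ./<filename>` pair per line) is an assumption about how the NCBI checksum files look.

```python
import hashlib
import os


def parse_checksums(text):
    """Map file name -> expected MD5, assuming lines like '<md5>  ./<name>'."""
    checksums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        if name.startswith("./"):
            name = name[2:]
        checksums[name] = digest
    return checksums


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def needs_download(path, expected_md5):
    """True if the file is missing or its MD5 differs from the expected one."""
    return not os.path.isfile(path) or md5_of(path) != expected_md5
```

Under these assumptions, a restart only has to fetch the checksum listing again and can skip every file for which `needs_download` returns `False`.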

kblin avatar Mar 28 '22 07:03 kblin

Hitting this today

chasemc avatar Nov 08 '22 00:11 chasemc

NLM had a bunch of website issues a couple of days ago; maybe something is also going on with the FTP:

ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))

chasemc avatar Nov 08 '22 00:11 chasemc

Again, this is an issue on the NCBI side, nothing ncbi-genome-download can do about it.

kblin avatar Nov 08 '22 07:11 kblin