
A `ConnectionError` ("Read timed out.") is raised instead of `ReadTimeout` when using the `timeout` keyword argument of `Session.get()`

Open nlykkei opened this issue 5 years ago • 11 comments

Consider the code below (main.py). When a temporary network disconnect occurs and no timeout keyword argument is passed to Session.get(), the client may hang indefinitely and no exception is raised.

However, if I use the timeout keyword argument, the application raises a ConnectionError from models.py corresponding to the underlying urllib3.exceptions.ReadTimeoutError:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='confluence.example.net', port=443): Read timed out.

Given that the exception is only raised when using the timeout keyword argument, why isn't Requests raising a ReadTimeout exception instead? In particular, the ConnectionError's exception message "Read timed out." suggests that it should be a ReadTimeout.

To mitigate the issue I'm currently performing a regular-expression match on the exception message, which is bad practice:

except ConnectionError as e:
     if re.search('Read timed out', str(e), re.IGNORECASE):

main.py:

import logging
import re

import requests
from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError, ReadTimeout
from urllib3.util.retry import Retry

# Excerpt from a method: this runs inside a loop over page-set URLs,
# and self.auth / self.ssl_verify are attributes set elsewhere.
try:
    with requests.Session() as rs:
        rs.mount('https://', HTTPAdapter(max_retries=Retry(total=10, connect=10, read=10, backoff_factor=1)))
        with rs.get(url, params={}, headers={}, auth=self.auth, verify=self.ssl_verify, timeout=(30, 30)) as r:
            r.raise_for_status()
            page_set = r.json()
except ReadTimeout:
    logging.exception('Request for page set timed out: {}'.format(url))
    continue
except ConnectionError as e:
    # Workaround: inspect the message to tell read timeouts apart from
    # other connection problems.
    if re.search('Read timed out', str(e), re.IGNORECASE):
        logging.exception('Request for page set timed out (network problem): {}'.format(url))
        continue
    else:
        raise
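
A less fragile workaround (a sketch, assuming the wrapping shown in the models.py excerpt below stays in place) is to inspect the wrapped urllib3 exception instead of its message text, since generate() raises ConnectionError(e) with the original error as its only argument:

import urllib3.exceptions
from requests.exceptions import ConnectionError

try:
    ...  # the rs.get(...) call from above
except ConnectionError as e:
    # The original urllib3 error should be available as the first argument.
    if e.args and isinstance(e.args[0], urllib3.exceptions.ReadTimeoutError):
        print('read timed out (wrapped urllib3.exceptions.ReadTimeoutError)')
    else:
        raise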

models.py: https://github.com/psf/requests/blob/master/requests/models.py

def generate():
    # Special case for urllib3.
    if hasattr(self.raw, 'stream'):
        try:
            for chunk in self.raw.stream(chunk_size, decode_content=True):
                yield chunk
        except ProtocolError as e:
            raise ChunkedEncodingError(e)
        except DecodeError as e:
            raise ContentDecodingError(e)
        except ReadTimeoutError as e:
            raise ConnectionError(e)
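
The report effectively asks for that last clause to translate differently. A hypothetical sketch (not the actual requests source; see the maintainer's comment below for why this hasn't changed):

from requests.exceptions import ReadTimeout
from urllib3.exceptions import ReadTimeoutError

def stream_chunks(raw, chunk_size):
    # Hypothetical variant of generate() above that keeps timeout semantics:
    # a read timeout during body streaming surfaces as ReadTimeout rather
    # than being re-raised as ConnectionError.
    try:
        for chunk in raw.stream(chunk_size, decode_content=True):
            yield chunk
    except ReadTimeoutError as e:
        raise ReadTimeout(e)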

Exception:

ERROR -- 04/19/2020 04:51:32 PM -- root -- ThreadPoolExecutor-0_0  -- Request for page set timed out (network problem): https://confluence.example.net/rest/api/content/search?expand=version,history,space,body.storage,children.attachment.version,children.attachment.history,children.attachment.space&limit=50&start=1900&cql=(type=page)
Traceback (most recent call last):
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 425, in _error_catcher
    yield
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 755, in read_chunked
    chunk = self._handle_chunk(amt)
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 708, in _handle_chunk
    returned_chunk = self._fp._safe_read(self.chunk_left)
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 620, in _safe_read
    chunk = self.fp.read(min(amt, MAXAMOUNT))
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1071, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 929, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/models.py", line 751, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 560, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 781, in read_chunked
    self._original_response.close()
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 430, in _error_catcher
    raise ReadTimeoutError(self._pool, None, "Read timed out.")
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='confluence.example.net', port=443): Read timed out.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nlykkei/projects/atlassian-watchdog/confluence/confluence.py", line 112, in __producer
    with rs.get(url, params={}, headers={}, auth=self.auth, verify=self.ssl_verify, timeout=(30, 30)) as r:
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/sessions.py", line 543, in get
    return self.request('GET', url, **kwargs)
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/sessions.py", line 683, in send
    r.content
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/models.py", line 829, in content
    self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
  File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/models.py", line 758, in generate
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='confluence.example.net', port=443): Read timed out.
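
The same code path can be reproduced without a flaky network by pointing requests at a local server that sends the headers and part of a chunked body, then stalls (a self-contained sketch; host, port, and timings are illustrative):

# A server thread that answers one request, sends one chunk, then stalls.
import socket
import threading
import time

import requests

def stalling_server(server_sock):
    conn, _ = server_sock.accept()
    conn.recv(1024)  # consume the request
    # Valid headers plus one complete chunk; the next chunk never arrives.
    conn.sendall(b'HTTP/1.1 200 OK\r\n'
                 b'Transfer-Encoding: chunked\r\n\r\n'
                 b'5\r\nhello\r\n')
    time.sleep(10)
    conn.close()

server_sock = socket.socket()
server_sock.bind(('127.0.0.1', 0))
server_sock.listen(1)
port = server_sock.getsockname()[1]
threading.Thread(target=stalling_server, args=(server_sock,), daemon=True).start()

try:
    requests.get('http://127.0.0.1:{}'.format(port), timeout=(5, 1))
except requests.exceptions.ReadTimeout:
    print('ReadTimeout')          # what one might expect
except requests.exceptions.ConnectionError as e:
    print('ConnectionError:', e)  # what is actually observed (this issue)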

Expected Result

I would expect a requests.exceptions.ReadTimeout to be raised.

Actual Result

A requests.exceptions.ConnectionError was raised instead, with the error message: Read timed out.

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.7"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.7.7"
  },
  "platform": {
    "release": "19.4.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.23.0"
  },
  "system_ssl": {
    "version": "1010106f"
  },
  "urllib3": {
    "version": "1.24.3"
  },
  "using_pyopenssl": false
}


nlykkei avatar Apr 19 '20 15:04 nlykkei

I'd be happy to try to make a pull request for this, but no one has said whether this is just how it's supposed to work, so I'll wait until then.

wxllow avatar Jun 11 '20 12:06 wxllow

This is behaving exactly as it unfortunately must until a breaking API change can be introduced. The behaviour was likely (I don't remember this specifically) introduced to raise the same exception as it used to raise before urllib3 introduced finer-grained exceptions. Is that great? No. Is it what will happen until a real requests 3.0 can happen? Almost certainly due to backwards compatibility concerns and API stability.

sigmavirus24 avatar Jun 11 '20 13:06 sigmavirus24

@simonvanderveldt So you're complaining about exception wrapping, which is a user feature that allows them not to have to think about handling exceptions from N libraries used inside of Requests? That's tangential to this issue. Please let's not muddy the waters with this conversation.

sigmavirus24 avatar Jul 30 '20 15:07 sigmavirus24

I am following this issue as I also get mixed results when a timeout is raised.

My timeout is set to (1, 10) connect/read, and when it times out I can see either of these two messages:

HTTPSConnectionPool(host='cloud-collector.newrelic.com', port=443): Read timed out. (read timeout=10)

HTTPSConnectionPool(host='cloud-collector.newrelic.com', port=443): Read timed out. (read timeout=1)


I imagine the second one should mention connect instead of read; as it stands, it could make one believe that the timeouts are not being applied correctly.

I am still trying to narrow down the problem before creating a new GitHub issue, but it seems something isn't quite clear in the way exceptions are handled.
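
For reference, this is how the two phases are supposed to surface when the adapter translates them cleanly (a sketch reusing the same (1, 10) budgets and hostname from the messages above; whether a real failure actually lands in the right branch with the right message is exactly what's in question here):

import requests

try:
    # (connect, read) budgets as in the comment above
    requests.get('https://cloud-collector.newrelic.com', timeout=(1, 10))
except requests.exceptions.ConnectTimeout:
    print('connect phase exceeded its 1s budget')  # should mention connect
except requests.exceptions.ReadTimeout:
    print('read phase exceeded its 10s budget')    # should mention read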

Edit: I'm not entirely sure this is the same issue, so I created https://github.com/psf/requests/issues/5544

stephanebruckert avatar Jul 30 '20 15:07 stephanebruckert

Try enabling TCP keep-alive in the OS TCP/IP stack; it is disabled by default. For example, on Linux I use this in order to keep the connection up:

import socket
from urllib3.connection import HTTPConnection

HTTPConnection.default_socket_options = HTTPConnection.default_socket_options + [
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),  # turn TCP keep-alive on
    (socket.SOL_TCP, socket.TCP_KEEPIDLE, 45),    # seconds idle before the first probe
    (socket.SOL_TCP, socket.TCP_KEEPINTVL, 10),   # seconds between probes
    (socket.SOL_TCP, socket.TCP_KEEPCNT, 6),      # failed probes before the connection drops
]

from requests import session
self.session = session()  # inside a class; sessions created afterwards inherit the options

If the socket gets dropped because the server didn't receive any TCP keep-alive messages while processing your request, you will get a ConnectionTimeout or ConnectionReset error, because it comes from a lower layer than the data transfer; if the socket is closed, no data can be transferred.

jakermx avatar Feb 05 '21 02:02 jakermx

Hello @nlykkei. From your question I assume you're trying to handle the exception raised by urllib3. If so, this is what I did to handle it: instead of except ReadTimeout as e: I used except requests.ReadTimeout as e: and it worked perfectly. Please let me know whether you had the same problem and whether you've managed to solve it.
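
A minimal sketch of that pattern (the URL is hypothetical; note that requests re-exports its exceptions at the package top level, so requests.ReadTimeout is the same class as requests.exceptions.ReadTimeout):

import requests

try:
    requests.get('https://example.net/slow-endpoint', timeout=(3, 3))  # hypothetical URL
except requests.ReadTimeout:
    # Catches the requests-level exception. urllib3's ReadTimeoutError is an
    # internal detail that requests wraps -- or, per this issue, sometimes
    # re-raises as ConnectionError during body streaming.
    print('read timed out')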

YashasviBhatt avatar Mar 10 '21 04:03 YashasviBhatt

Hello @nlykkei. From your question I assume you're trying to handle the exception raised by urllib3. If so, this is what I did to handle it: instead of except ReadTimeout as e: I used except requests.ReadTimeout as e: and it worked perfectly. Please let me know whether you had the same problem and whether you've managed to solve it.

Well, you are correct about the exception handling... but the real issue is in the OS libraries, no matter whether your request forces persistent connections by setting the Connection: Keep-Alive header on the HTTP/HTTPS request. There are two possible scenarios. The first, easy one: the server responds with Connection: close, or the encoding is set to a chunked response; to resolve this, just iterate over the response until you get a length of 0 in the response header.

But at least in my findings, the major issue depends on the lower layers. Since content providers don't want to spend resources on "probably dead" connections, they close them earlier than the IETF RFC standard that defines TCP keep-alive behavior, which says to start sending L4 TCP keep-alives after 2 hours; that default is insane and obviously wastes resources. So what I have done is turn the SO_KEEPALIVE flag on and set the timers lower, so you can release the connection to the connection pool and free resources on your device when idle.

I don't like catching exceptions and doing retries when they aren't necessary...

But if you retry in your own code, don't set the retry parameter, because you will get the same behavior: the lower layer will retry the high-level request on the same dead connection...

Cheers from Mexico

jakermx avatar Mar 10 '21 13:03 jakermx

I faced the same issue; removing the headers worked for me.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/domainvalidation/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/opt/domainvalidation/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/domainvalidation/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/domainvalidation/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/opt/domainvalidation/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='www.123xyz.com', port=443): Read timed out. (read timeout=15)

>>> response = requests.get('https://www.123xyz.com', timeout=15, allow_redirects=True)
>>> response
<Response [200]>

Not sure whether the behavior is specific to that server. Note: 123xyz is just an example domain.

vikasnavgire avatar Jun 16 '21 06:06 vikasnavgire

Same issue when using a custom adapter: the underlying urllib3 adapter can raise a urllib3.exceptions.ReadTimeoutError as the reason for a MaxRetryError, but that check is missing, so it falls through to a ConnectionError. See https://github.com/psf/requests/blob/2d2447e210cf0b9e8c7484bfc6f158de9b24c171/requests/adapters.py#L501-L517
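
A sketch of the kind of branch that seems to be missing there (hypothetical code, not the actual requests source):

from requests.exceptions import ConnectionError, ReadTimeout
from urllib3.exceptions import ReadTimeoutError

def translate_max_retry_error(e, request):
    # e is a urllib3.exceptions.MaxRetryError. Hypothetical helper mirroring
    # the isinstance() checks already present in HTTPAdapter.send():
    if isinstance(e.reason, ReadTimeoutError):
        raise ReadTimeout(e, request=request)   # the missing branch
    raise ConnectionError(e, request=request)   # current fall-through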

chiragjn avatar Nov 29 '21 07:11 chiragjn

Please have a look at https://github.com/psf/requests/issues/4590, as the issue may be in urllib3. If it's not exactly this issue, it's a very similar one. Thanks.

stefano-xy avatar Nov 29 '21 17:11 stefano-xy