A `ConnectionError` ("Read timed out.") is raised instead of `ReadTimeout` when using the `timeout` keyword argument of `Session.get()`
Consider the code below (main.py). When a temporary network disconnect occurs and no timeout keyword argument was passed to Session.get(), the client may hang indefinitely and no exception is raised.
However, if I use the timeout keyword argument, the application raises a ConnectionError from models.py, wrapping the underlying urllib3.exceptions.ReadTimeoutError:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='confluence.example.net', port=443): Read timed out.
Given that the exception is only raised when using the timeout keyword argument, why isn't Requests raising a ReadTimeout exception instead? In particular, the ConnectionError's message "Read timed out." suggests that it should be a ReadTimeout.
To mitigate the issue, I'm currently performing a regular-expression match on the exception message, which is bad practice:
except ConnectionError as e:
    if re.search('Read timed out', str(e), re.IGNORECASE):
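A more robust alternative is possible (a sketch, not an official API: it relies on the wrapping in models.py quoted below, where the original urllib3 error is passed as the ConnectionError's first argument):

import logging

import requests
from urllib3.exceptions import ReadTimeoutError

url = 'https://confluence.example.net'  # placeholder host from the traceback

try:
    r = requests.get(url, timeout=(30, 30))
except requests.exceptions.ConnectionError as e:
    # Check the wrapped exception's type instead of matching its message.
    if e.args and isinstance(e.args[0], ReadTimeoutError):
        logging.exception('Request timed out (read): {}'.format(url))
    else:
        raise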
main.py:
import logging
import re

import requests
from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError, ReadTimeout
from urllib3.util.retry import Retry

# Excerpt from a loop in a class method; url, self.auth and self.ssl_verify
# are defined elsewhere.
try:
    with requests.Session() as rs:
        rs.mount('https://', HTTPAdapter(max_retries=Retry(total=10, connect=10, read=10, backoff_factor=1)))
        with rs.get(url, params={}, headers={}, auth=self.auth, verify=self.ssl_verify, timeout=(30, 30)) as r:
            r.raise_for_status()
            page_set = r.json()
except ReadTimeout:
    logging.exception('Request for page set timed out: {}'.format(url))
    continue
except ConnectionError as e:
    if re.search('Read timed out', str(e), re.IGNORECASE):
        logging.exception('Request for page set timed out (network problem): {}'.format(url))
        continue
    else:
        raise
models.py: https://github.com/psf/requests/blob/master/requests/models.py
def generate():
    # Special case for urllib3.
    if hasattr(self.raw, 'stream'):
        try:
            for chunk in self.raw.stream(chunk_size, decode_content=True):
                yield chunk
        except ProtocolError as e:
            raise ChunkedEncodingError(e)
        except DecodeError as e:
            raise ContentDecodingError(e)
        except ReadTimeoutError as e:
            # urllib3's ReadTimeoutError is re-raised as a ConnectionError here
            raise ConnectionError(e)
Exception:
ERROR -- 04/19/2020 04:51:32 PM -- root -- ThreadPoolExecutor-0_0 -- Request for page set timed out (network problem): https://confluence.example.net/rest/api/content/search?expand=version,history,space,body.storage,children.attachment.version,children.attachment.history,children.attachment.space&limit=50&start=1900&cql=(type=page)
Traceback (most recent call last):
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 425, in _error_catcher
yield
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 755, in read_chunked
chunk = self._handle_chunk(amt)
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 708, in _handle_chunk
returned_chunk = self._fp._safe_read(self.chunk_left)
File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 620, in _safe_read
chunk = self.fp.read(min(amt, MAXAMOUNT))
File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/socket.py", line 589, in readinto
return self._sock.recv_into(b)
File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1071, in recv_into
return self.read(nbytes, buffer)
File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 929, in read
return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/models.py", line 751, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 560, in stream
for line in self.read_chunked(amt, decode_content=decode_content):
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 781, in read_chunked
self._original_response.close()
File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/urllib3/response.py", line 430, in _error_catcher
raise ReadTimeoutError(self._pool, None, "Read timed out.")
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='confluence.example.net', port=443): Read timed out.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/nlykkei/projects/atlassian-watchdog/confluence/confluence.py", line 112, in __producer
with rs.get(url, params={}, headers={}, auth=self.auth, verify=self.ssl_verify, timeout=(30, 30)) as r:
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/sessions.py", line 543, in get
return self.request('GET', url, **kwargs)
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/sessions.py", line 683, in send
r.content
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/models.py", line 829, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "/Users/nlykkei/projects/atlassian-watchdog/lib/python3.7/site-packages/requests/models.py", line 758, in generate
raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='confluence.example.net', port=443): Read timed out.
Expected Result
I would expect a requests.exceptions.ReadTimeout to be raised.
Actual Result
A requests.exceptions.ConnectionError was raised instead, with the error message: Read timed out.
System Information
$ python3 -m requests.help
{
"chardet": {
"version": "3.0.4"
},
"cryptography": {
"version": ""
},
"idna": {
"version": "2.7"
},
"implementation": {
"name": "CPython",
"version": "3.7.7"
},
"platform": {
"release": "19.4.0",
"system": "Darwin"
},
"pyOpenSSL": {
"openssl_version": "",
"version": null
},
"requests": {
"version": "2.23.0"
},
"system_ssl": {
"version": "1010106f"
},
"urllib3": {
"version": "1.24.3"
},
"using_pyopenssl": false
}
I'd be happy to try to make a pull request for this, but no one has said whether that's just how it's supposed to work or not, so I'll wait until then.
This is behaving exactly as it unfortunately must until a breaking API change can be introduced. The behaviour was likely (I don't remember this specifically) introduced to raise the same exception as it used to raise before urllib3 introduced finer-grained exceptions. Is that great? No. Is it what will happen until a real requests 3.0 can happen? Almost certainly due to backwards compatibility concerns and API stability.
@simonvanderveldt So you're complaining about exception wrapping, which is a user feature that lets users avoid having to handle exceptions from N libraries used inside of Requests? That's tangential to this issue. Please let's not muddy the water with this conversation.
I am following this issue, as I also get mixed results when a timeout is raised.
My timeout is set to (1, 10) connect/read, and when it times out, I can see either of these two messages:
HTTPSConnectionPool(host='cloud-collector.newrelic.com', port=443): Read timed out. (read timeout=10)
HTTPSConnectionPool(host='cloud-collector.newrelic.com', port=443): Read timed out. (read timeout=1)
I imagine the second one should mention connect instead of read, possibly making one believe that the timeouts are not being applied correctly.
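For reference, a minimal sketch of how the two tuple values are meant to map to exceptions (standard requests timeout semantics; the URL is taken from the messages above):

import requests

try:
    r = requests.get('https://cloud-collector.newrelic.com', timeout=(1, 10))
except requests.exceptions.ConnectTimeout:
    print('connect phase exceeded 1s')  # should report the 1s connect timeout
except requests.exceptions.ReadTimeout:
    print('read phase exceeded 10s')    # should report the 10s read timeout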
I am still trying to narrow down the problem before creating a new github issue, but it seems something isn't too clear in the way exceptions are handled.
Edit: I'm not entirely sure this is the same issue so I created https://github.com/psf/requests/issues/5544
Try enabling TCP keep-alive in the OS TCP/IP stack; it is disabled by default. For example, on Linux I use this in order to keep the connection up:
import socket
from urllib3.connection import HTTPConnection
from requests import session

# Append keep-alive options to urllib3's default socket options
# (the TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT constants are Linux-specific).
HTTPConnection.default_socket_options = HTTPConnection.default_socket_options + [
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),  # enable keep-alive probes
    (socket.SOL_TCP, socket.TCP_KEEPIDLE, 45),    # idle seconds before the first probe
    (socket.SOL_TCP, socket.TCP_KEEPINTVL, 10),   # seconds between probes
    (socket.SOL_TCP, socket.TCP_KEEPCNT, 6),      # failed probes before dropping the socket
]

self.session = session()  # any Session created after the patch picks up the new defaults
If the socket gets dropped because the server didn't receive any TCP keep-alive messages while processing your request, you will get a connection-timeout or connection-reset error, because it comes from a lower layer than the data transfer: if the socket is closed, no data can be transferred.
Hello @nlykkei. From your question, I assume you're trying to handle the exception raised by urllib3; if so, this is what I have done to handle it.
Instead of using except ReadTimeout as e:
I used except requests.ReadTimeout as e: and it worked perfectly.
Please let me know whether you had the same problem and whether you have managed to solve it.
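A minimal sketch of what I mean (assuming the confusion comes from the name collision between requests' ReadTimeout and urllib3's ReadTimeoutError):

import requests

try:
    r = requests.get('https://example.com', timeout=(3, 3))
except requests.ReadTimeout:
    # This is requests' own exception, not urllib3.exceptions.ReadTimeoutError.
    print('read timed out at the requests level')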
Well, you are correct about the exception handling, but the real issue is in the lower layers. Even if your request forces persistent connections by enabling the Connection: Keep-Alive header on the HTTP/HTTPS request, there are two possible scenarios. The first, easy one: the server responds with Connection: close, or the response uses chunked encoding; to resolve this, just iterate over the response until you reach the terminating zero-length chunk, as sketched below.
But at least in my findings, the major issue depends on the lower layers: since content providers don't want to spend resources on "probably dead" connections, they close them earlier than the IETF RFC standard for TCP keep-alive, which only starts sending L4 TCP keep-alive probes after two hours. That is insane and obviously wastes resources, so what I have done is turn the SO_KEEPALIVE flag on and set the timers lower, so the connection can be returned to the connection pool and resources on your device released when idle.
I don't like catching exceptions and doing retries when they are not necessary.
You can retry in your own code, but don't set the retry parameter, because you will get the same behavior: the lower layer will retry the high-level request on the same dead connection.
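To illustrate the first scenario, a sketch of iterating a chunked response with requests' streaming API (process() is a hypothetical handler, not defined here):

import requests

with requests.get('https://example.com/large', stream=True, timeout=(5, 30)) as r:
    for chunk in r.iter_content(chunk_size=8192):
        # iter_content() stops on its own at the terminating zero-length chunk
        if chunk:
            process(chunk)  # hypothetical handler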
Cheers from Mexico
I faced the same issue; removing the headers worked for me.
>>> response = requests.get('https://www.123xyz.com', timeout=15, allow_redirects=True)
>>> response
<Response [200]>
Not sure whether the behavior is specific to the server. Note: 123xyz is just an example.
Same issue when using a custom adapter as well: the underlying urllib3 adapter can return a urllib3.exceptions.ReadTimeoutError as the reason for a MaxRetryError, but that check is missing, so it falls through to a ConnectionError:
https://github.com/psf/requests/blob/2d2447e210cf0b9e8c7484bfc6f158de9b24c171/requests/adapters.py#L501-L517
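A hedged sketch of the branch that appears to be missing (not the actual requests code; an isinstance() check like this before the final fall-through would let the wrapped error surface as requests.ReadTimeout):

from requests.exceptions import ConnectionError, ReadTimeout
from urllib3.exceptions import ReadTimeoutError

def translate_max_retry_error(e, request):
    # e is a urllib3 MaxRetryError caught in HTTPAdapter.send().
    if isinstance(e.reason, ReadTimeoutError):
        raise ReadTimeout(e, request=request)  # the missing branch
    raise ConnectionError(e, request=request)  # the current fall-through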
Please have a look at https://github.com/psf/requests/issues/4590, as the issue may be in urllib3. If it is not exactly this issue, it is a very similar one. Thanks.