requests
requests copied to clipboard
Session does not consistently respects proxy environment variables.
There are inconsistencies when working with proxy environment variables using Session objects.
When using Session#request or verb methods (get, post, put), those work correctly and use the proxy when they are present.
However Session#send(PreparedRequest) does not respect the variables by default, but does when it handles redirect responses, e.g. 303.
I've written a test class to present this behavior.
Expected Result
All methods in Session to respect the proxy variable.
Actual Result
Session#send(PreparedRequest) does not respect proxy environment variables.
Reproduction Steps
Test case 1 fails with a successful request.
import requests
import pytest
# Copied from requests/tests/utils.py for making this a standalone
import os
import contextlib
@contextlib.contextmanager
def override_environ(**kwargs):
save_env = dict(os.environ)
for key, value in kwargs.items():
if value is None:
del os.environ[key]
else:
os.environ[key] = value
try:
yield
finally:
os.environ.clear()
os.environ.update(save_env)
class TestSession:
def test_respect_proxy_env_on_send(self):
session = requests.Session()
# Does not respect the proxy configuration and ignores the proxy
request = requests.Request(
method='GET', url='https://www.google.com/'
)
with override_environ(https_proxy='http://example.com'):
try:
session.send(request.prepare())
assert False, "The proxy is invalid this request should not be successful."
except requests.exceptions.ProxyError as e:
assert 'Cannot connect to proxy' in str(e)
def test_respect_proxy_env_on_send_with_redirects(self):
session = requests.Session()
# Returns a 303 to www.google.com and respects the proxy
request = requests.Request(
method='GET', url='https://google.com/'
)
with override_environ(https_proxy='http://example.com'):
try:
session.send(request.prepare())
assert False, "The proxy is invalid this request should not be successful."
except requests.exceptions.ProxyError as e:
assert 'Cannot connect to proxy' in str(e)
def test_respect_proxy_env_on_get(self):
session = requests.Session()
with override_environ(https_proxy='http://example.com'):
try:
# No redirects
session.get('https://www.google.com')
assert False, "The proxy is invalid this request should not be successful."
except requests.exceptions.ProxyError as e:
assert 'Cannot connect to proxy' in str(e)
def test_respect_proxy_env_on_request(self):
session = requests.Session()
with override_environ(https_proxy='http://example.com'):
try:
# No redirects
session.request(method='GET', url='https://www.google.com')
assert False, "The proxy is invalid this request should not be successful."
except requests.exceptions.ProxyError as e:
assert 'Cannot connect to proxy' in str(e)
System Information
$ python -m requests.help
{
"chardet": {
"version": "3.0.4"
},
"cryptography": {
"version": ""
},
"idna": {
"version": "2.10"
},
"implementation": {
"name": "CPython",
"version": "3.8.1"
},
"platform": {
"release": "5.9.8-100.fc32.x86_64",
"system": "Linux"
},
"pyOpenSSL": {
"openssl_version": "",
"version": null
},
"requests": {
"version": "2.25.0"
},
"system_ssl": {
"version": "100020ff"
},
"urllib3": {
"version": "1.26.2"
},
"using_pyopenssl": false
}
I believe things will work if you use Session.prepare_request(request) instead of request.prepare().
Doesn't seem to be the case.
def test_respect_proxy_env_on_send_session_prepared_request(self):
session = requests.Session()
# Does not respect the proxy configuration and ignores the proxy
request = requests.Request(
method='GET', url='https://www.google.com/'
)
with override_environ(https_proxy='http://example.com'):
try:
prepared = session.prepare_request(request)
session.send(prepared)
assert False, "The proxy is invalid this request should not be successful."
except requests.exceptions.ProxyError as e:
assert 'Cannot connect to proxy' in str(e)
Result:
_______________________________________________________ TestSession.test_respect_proxy_env_on_send_session_prepared_request ________________________________________________________
self = <test_session.TestSession object at 0x7f0b9d517fd0>
def test_respect_proxy_env_on_send_session_prepared_request(self):
session = requests.Session()
# Does not respect the proxy configuration and ignores the proxy
request = requests.Request(
method='GET', url='https://www.google.com/'
)
with override_environ(https_proxy='http://example.com'):
try:
prepared = session.prepare_request(request)
session.send(prepared)
> assert False, "The proxy is invalid this request should not be successful."
E AssertionError: The proxy is invalid this request should not be successful.
E assert False
Reading the code I know the issue is that there is no call to rebuild_proxies on the send workflow, and that proxies are not set on the request, but rather during request transmission.
Hello,
I would like to add to the topic of this issue. As far as I can see, the hierarchy of proxy-settings applied to the final request is not consistent or at least not as I would expect.
From my understanding, the documentation suggests the following hierarchy of settings (right supersedes left):
environment variables < session settings < method call kwargs
During usage I found that the actual hierarchy implemented is (only for proxy settings):
session settings < environment variables < method call kwargs
Is this intended behavior?
If so, I would suggest to clarify the respective documentation.
The line changed on the PR has the erroneous precedence format you are mentioning, I'll fix that to be consistent.
You mention there might be other code paths using an incorrect one as well, can you point to those? And also the docs in which you saw this precedence mentioned?
After thinking a bit more about it, I guess the problem arises in Session.merge_environment_settings() here. The proxy-settings from environment are merged into proxy-settings from kwargs before session properties are merged.
I first thought the problem can be avoided if line 722 is placed before 707.
Now I see, that the verify-kwarg has the same behavior (although that did not bother me so far), so I wonder how much of it is actually intended.
I know, I should probably rather make a PR with some regression tests to prove my solution. It would take some time though, since I have to get used to the process first.
Concerning the documentation, I read the section Proxies as if the environment configuration is "overridden" by settings in Python code, and the examples shown suggest the hierarchy I indicated above.
👋 we're also encountering this issue in our python library. rebuild_proxies didn't update settings for us, but calling session.proxies.update(requests.utils.getproxies()) did.
Hope this helps other folks running into this issue, and thanks @mateusduboli for your excellent breakdown with tests, this really helped out our investigation 😄
👋 we're also encountering this issue in our python library.
rebuild_proxiesdidn't update settings for us, but callingsession.proxies.update(requests.utils.getproxies())did.Hope this helps other folks running into this issue, and thanks @mateusduboli for your excellent breakdown with tests, this really helped out our investigation 😄
I had the same issue and @hollabaq86's workaround is best I found so far.
I'll add my cents here.
On Windows, the proxy hierarchy is broken too, and since the environment settings also come from the registry, it's nearly impossible to diagnose why the proxies set in the session are not in use. Few users are aware that Requests will fetch settings from the Windows Registry.
In https://github.com/psf/requests/blob/main/requests/sessions.py#L758 we can see that environment settings are merged in the requests settings before the session settings are merged. The merge mechanism is conservative and does not overwrite previous settings, so session settings won't overwrite environment settings.
IMHO, environment settings should be merged with with session settings then with requests settings. With the current mechanism, environment settings should be merged after the session settings.