toolbelt
HostHeaderSSLAdapter doesn't work with proxies
I'm trying to use direct IPs instead of domain names, to avoid requests having to do a lengthy 4-5 second DNS lookup the first time a new requests session encounters a domain.
I currently get an exception when using proxies; without proxies, it works fine. Below is the exception, followed by code that reproduces it. The proxy URL just needs to be set to a real proxy.
Traceback (most recent call last):
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 594, in urlopen
    self._prepare_proxy(conn)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 815, in _prepare_proxy
    conn.connect()
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connection.py", line 376, in connect
    _match_hostname(cert, self.assert_hostname or hostname)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connection.py", line 386, in _match_hostname
    match_hostname(cert, asserted_hostname)
  File "/usr/lib/python3.6/ssl.py", line 327, in match_hostname
    % (hostname, ', '.join(map(repr, dnsnames))))
ssl.CertificateError: hostname '93.184.216.34' doesn't match either of 'www.example.org', 'example.com', 'example.edu', 'example.net', 'example.org', 'www.example.com', 'www.example.edu', 'www.example.net'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='93.184.216.34', port=443): Max retries exceeded with url: / (Caused by SSLError(CertificateError("hostname '93.184.216.34' doesn't match either of 'www.example.org', 'example.com', 'example.edu', 'example.net', 'example.org', 'www.example.com', 'www.example.edu', 'www.example.net'",),))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<input>", line 34, in <module>
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests_toolbelt/adapters/host_header_ssl.py", line 43, in send
    return super(HostHeaderSSLAdapter, self).send(request, **kwargs)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='93.184.216.34', port=443): Max retries exceeded with url: / (Caused by SSLError(CertificateError("hostname '93.184.216.34' doesn't match either of 'www.example.org', 'example.com', 'example.edu', 'example.net', 'example.org', 'www.example.com', 'www.example.edu', 'www.example.net'",),))
Below is code that recreates the issue.
import socket
from urllib.parse import urlparse

import requests
from requests_toolbelt.adapters import host_header_ssl

proxy = 'user:[email protected]:222'
cache = dict()

with requests.Session() as ss:
    ss.mount('https://', host_header_ssl.HostHeaderSSLAdapter())
    proxies = {
        'http': 'http://{}/'.format(proxy),
        'https': 'https://{}/'.format(proxy),
    }
    url = 'https://www.example.com'
    parsed = urlparse(url)
    host = parsed.hostname
    scheme = parsed.scheme
    ss.headers['Host'] = host
    print(ss.headers)
    try:
        ip = cache[host]
    except KeyError:  # Cache miss - do DNS lookup
        ip = socket.gethostbyname(host)
        print('DNS lookup. Domain "{}" / IP "{}"'.format(host, ip))
        cache[host] = ip
    ip_url = '{s}://{ip}'.format(s=scheme, ip=ip)
    print('ip_url: {}'.format(ip_url))
    print(ss.get(ip_url, proxies=proxies).text)  # Comment and uncomment to test
    # print(ss.get(ip_url).text)
Oops, no one has fixed this bug yet.
The "workaround" is to use verify=False, at which point this adapter isn't really needed at all.
I know certificate validation is seen as important, but it doesn't really affect encryption; it mostly validates ownership of the certificate, detects DNS poisoning, etc.
The funny thing is we are hard-coding IPs to internal IPs, so we already know which server we are talking to and have other layers of protection involved; being able to use our own controlled DNS servers has enough advantages that certificate-to-domain validation doesn't help us.
We are using SOCKS5 for aggressive egress protections, and then routing through VPC networks, so we want to override DNS in code.
If I just hack /etc/hosts and set the server name = 10.x.x.x, then this works just fine. cURL, for instance, has a way to define what is effectively a per-invocation /etc/hosts entry while you are connecting (its --resolve flag). This can be useful in CI/CD scenarios where you want to hit individual servers that are behind proxies and using bastions into locked-down environments, where VPNs and routing start to become an issue. The problem is that hacking /etc/hosts isn't programmatic enough.
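A more programmatic version of the /etc/hosts hack is possible in-process: since requests/urllib3 resolve hostnames through socket.getaddrinfo, wrapping that function pins selected names to fixed addresses while SNI and certificate validation still see the original hostname. A minimal sketch (the pinned.example.com mapping is a made-up example):

```python
import socket

# Hypothetical in-process "/etc/hosts": pin selected hostnames to fixed
# addresses before the real resolver is consulted.
PINNED = {'pinned.example.com': '127.0.0.1'}

_real_getaddrinfo = socket.getaddrinfo

def pinned_getaddrinfo(host, *args, **kwargs):
    # Substitute the pinned address if we have one; otherwise fall
    # through to the real resolver unchanged.
    return _real_getaddrinfo(PINNED.get(host, host), *args, **kwargs)

socket.getaddrinfo = pinned_getaddrinfo
```

Because the URL keeps its real hostname, the certificate still validates normally; the override only changes where the TCP connection goes. Note this is process-global monkeypatching, and it does not apply when a proxy does the name resolution on your behalf.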
Overall, with advanced usage of proxies, SNI, etc. in highly secured environments, these things actually come up, and validating the server certificate against DNS either isn't good enough or can be done in a more controlled way. I'm not just trying to hit api.someserver.com, but internal infrastructure where HA and other things complicate typical "client" scenarios.
Regardless, I wouldn't mind a standard way in requests to override which hostname the certificate is validated against, for cases where I know I'm using managed DNS but the cert is for something else. verify=False is often just what we have to fall back on because we don't have that control. Something like requests.get('https://10.10.10.10/something.html', verify_against='theactualdomain.com').
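A rough approximation of that verify_against idea is possible today with a custom transport adapter (a hypothetical sketch, not part of requests or requests_toolbelt). It relies on urllib3's assert_hostname pool keyword, and unlike HostHeaderSSLAdapter it also injects the keyword into the proxy managers, which is the code path the traceback above goes through:

```python
import requests
from requests.adapters import HTTPAdapter


class HostnameOverrideAdapter(HTTPAdapter):
    """Validate the server certificate against a fixed hostname,
    regardless of the host in the URL, via urllib3's `assert_hostname`.
    Hypothetical helper; also covers proxied HTTPS, which uses a
    separate ProxyManager that HostHeaderSSLAdapter never touches."""

    def __init__(self, hostname, **kwargs):
        self._hostname = hostname  # must be set before super().__init__
        super(HostnameOverrideAdapter, self).__init__(**kwargs)

    def init_poolmanager(self, *args, **kwargs):
        # Direct (non-proxied) connections.
        kwargs['assert_hostname'] = self._hostname
        super(HostnameOverrideAdapter, self).init_poolmanager(*args, **kwargs)

    def proxy_manager_for(self, proxy, **proxy_kwargs):
        # Proxied connections get their own pool manager; inject there too.
        proxy_kwargs['assert_hostname'] = self._hostname
        return super(HostnameOverrideAdapter, self).proxy_manager_for(
            proxy, **proxy_kwargs)


session = requests.Session()
session.mount('https://', HostnameOverrideAdapter('theactualdomain.com'))
# session.get('https://10.10.10.10/something.html', proxies=proxies)
```

Caveat: assert_hostname is a urllib3 feature whose handling has changed across major versions, so this would need testing against the versions you actually deploy.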
Sorry for the long comment.