
HostHeaderSSLAdapter doesn't work with proxies

Open adrianmeraz opened this issue 5 years ago • 3 comments

adrianmeraz avatar Oct 04 '19 02:10 adrianmeraz

I'm trying to use direct IPs instead of domains to avoid requests having to do a lengthy 4-5 second DNS lookup the first time a new requests session encounters a domain.

I currently get an exception when using proxies. Without proxies, it works fine. The exception is below, followed by code that reproduces it. The proxy URL just needs to be set to a real proxy.

Traceback (most recent call last):
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 594, in urlopen
    self._prepare_proxy(conn)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 815, in _prepare_proxy
    conn.connect()
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connection.py", line 376, in connect
    _match_hostname(cert, self.assert_hostname or hostname)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connection.py", line 386, in _match_hostname
    match_hostname(cert, asserted_hostname)
  File "/usr/lib/python3.6/ssl.py", line 327, in match_hostname
    % (hostname, ', '.join(map(repr, dnsnames))))
ssl.CertificateError: hostname '93.184.216.34' doesn't match either of 'www.example.org', 'example.com', 'example.edu', 'example.net', 'example.org', 'www.example.com', 'www.example.edu', 'www.example.net'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='93.184.216.34', port=443): Max retries exceeded with url: / (Caused by SSLError(CertificateError("hostname '93.184.216.34' doesn't match either of 'www.example.org', 'example.com', 'example.edu', 'example.net', 'example.org', 'www.example.com', 'www.example.edu', 'www.example.net'",),))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 34, in <module>
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests_toolbelt/adapters/host_header_ssl.py", line 43, in send
    return super(HostHeaderSSLAdapter, self).send(request, **kwargs)
  File "/home/user123/PycharmProjects/pyrebotserver/venv/lib/python3.6/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='93.184.216.34', port=443): Max retries exceeded with url: / (Caused by SSLError(CertificateError("hostname '93.184.216.34' doesn't match either of 'www.example.org', 'example.com', 'example.edu', 'example.net', 'example.org', 'www.example.com', 'www.example.edu', 'www.example.net'",),))

Below is code that recreates the issue.

import socket
from urllib.parse import urlparse

import requests
from requests_toolbelt.adapters import host_header_ssl

proxy = 'user:password@proxy.example.com:222'  # placeholder credentials and host; set this to a real proxy

cache = dict()

with requests.Session() as ss:
    ss.mount('https://', host_header_ssl.HostHeaderSSLAdapter())
    proxies = {
        'http': 'http://{}/'.format(proxy),
        'https': 'https://{}/'.format(proxy),
    }

    url = 'https://www.example.com'
    parsed = urlparse(url)
    host = parsed.hostname
    scheme = parsed.scheme

    ss.headers['Host'] = host
    print(ss.headers)
    try:
        ip = cache[host]
    except KeyError:  # Cache miss - do dns lookup
        ip = socket.gethostbyname(host)
        print('DNS Lookup. Domain "{}" / IP "{}"'.format(host, ip))
        cache[host] = ip

    ip_url = '{s}://{ip}'.format(s=scheme, ip=ip)
    print('ip_url: {}'.format(ip_url))
    print(ss.get(ip_url, proxies=proxies).text) # Comment and uncomment to test
    # print(ss.get(ip_url).text)
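
The lookup-and-rewrite step in the script above can be pulled into a small helper, which makes it easier to test separately from the proxy problem. This is only a sketch: `to_ip_url` and its `resolver` parameter are hypothetical names, not part of requests or requests_toolbelt, and it assumes IPv4 (via `socket.gethostbyname`) and URLs without embedded credentials.

```python
import socket
from urllib.parse import urlsplit, urlunsplit


def to_ip_url(url, cache, resolver=socket.gethostbyname):
    """Return (ip_url, host): the URL with its hostname swapped for a cached IP.

    Hypothetical helper. `cache` is a plain dict mapping hostname -> IP,
    and `resolver` is only called on a cache miss.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if host not in cache:
        # Cache miss: do the (possibly slow) DNS lookup once.
        cache[host] = resolver(host)
    netloc = cache[host]
    if parts.port is not None:
        # Keep an explicit port if the original URL had one.
        netloc = "{}:{}".format(netloc, parts.port)
    ip_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    return ip_url, host
```

The returned `host` is what would go into the `Host` header for that request, for example `ss.get(ip_url, headers={'Host': host})`, instead of setting it on the whole session.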

adrianmeraz avatar Oct 04 '19 21:10 adrianmeraz

Oops, no one has fixed this bug.

kobe-tian avatar Aug 21 '20 06:08 kobe-tian

The "workaround" is to use verify=False, and then this adapter isn't really needed at all.

I know certificate validation is seen as important, but it doesn't really affect encryption. It validates ownership of the certificate, detects DNS poisoning, etc.

The funny thing is that we are hard-coding internal IPs, so we know what server we are talking to, and we have other layers of protection involved. Being able to use our own controlled DNS servers has enough advantages that certificate-to-domain validation doesn't help us.

We are using SOCKS5 for aggressive egress protections and then routing through VPC networks, so we want to override DNS in code.

If I just hack /etc/hosts and set the server name = 10.x.x.x, then this works just fine. cURL, for instance, has a way to sort of define an /etc/hosts section while you are connecting. This can be useful in CI/CD scenarios where you want to hit individual servers that are behind proxies, using bastions into locked-down environments where VPNs and routing start to become an issue. The problem is that hacking /etc/hosts isn't programmatic enough.
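
For reference, the cURL feature meant here is probably its `--resolve host:port:addr` option. A rough in-process equivalent in Python is to wrap `socket.getaddrinfo` for the duration of a request. The sketch below is an assumption-laden illustration, not a requests feature: `dns_override` is a hypothetical name, the patch is process-wide and not thread-safe, and it only affects lookups done locally, so it will not help when the proxy itself resolves the target name (as with an HTTP CONNECT proxy).

```python
import contextlib
import socket


@contextlib.contextmanager
def dns_override(mapping):
    """Temporarily resolve selected hostnames to fixed addresses.

    Hypothetical helper. `mapping` is a dict such as
    {'theactualdomain.com': '10.10.10.10'}. Names not in the mapping
    are resolved normally.
    """
    original = socket.getaddrinfo

    def patched(host, *args, **kwargs):
        # Swap in the pinned address if one is configured for this host.
        return original(mapping.get(host, host), *args, **kwargs)

    socket.getaddrinfo = patched
    try:
        yield
    finally:
        # Always restore the real resolver, even if the body raises.
        socket.getaddrinfo = original
```

Because the URL keeps the real hostname, SNI and certificate validation work as usual, for example `with dns_override({'theactualdomain.com': '10.10.10.10'}): requests.get('https://theactualdomain.com/something.html')`.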

Overall, with advanced usage of proxies, SNI, etc. in super secure environments, these things actually come up, and server certificate validation against DNS either isn't good enough or can be done in a more controlled way. I'm not just trying to hit api.someserver.com but some internal infrastructure where HA and other things complicate typical "client" scenarios.

Regardless, I wouldn't mind a standard way in requests to override which host name the certificate is validated against, for cases where I know I'm using managed DNS but the cert is for something else. verify=False is often just what we have to fall back on because we don't have the control. Something like requests.get('https://10.10.10.10/something.html', verify_against='theactualdomain.com')
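
To be clear, `verify_against` is not an existing requests parameter. At the standard-library level, though, the same idea already exists: `ssl.SSLContext.wrap_socket` takes a `server_hostname` that is used for both SNI and the hostname check, independent of the address the socket connected to. A minimal sketch, where `build_context` and `connect_verified` are hypothetical names and no proxy is involved:

```python
import socket
import ssl


def build_context():
    # The default context requires a valid certificate chain and
    # checks the certificate against server_hostname.
    return ssl.create_default_context()


def connect_verified(ip, port, verify_against, timeout=10.0):
    """Connect to `ip` but validate the certificate for `verify_against`.

    Hypothetical helper: opens a TCP connection to the IP, then performs
    the TLS handshake using `verify_against` for SNI and hostname checking.
    """
    context = build_context()
    raw = socket.create_connection((ip, port), timeout=timeout)
    try:
        return context.wrap_socket(raw, server_hostname=verify_against)
    except Exception:
        # Do not leak the TCP socket if the handshake or check fails.
        raw.close()
        raise
```

With this, `connect_verified('10.10.10.10', 443, 'theactualdomain.com')` would fail with a certificate error unless the server at that IP presents a certificate valid for theactualdomain.com, which is the behaviour the proposed `verify_against` describes.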

Sorry for the long comment.

twiggy avatar Oct 12 '22 22:10 twiggy