oauthenticator
Configuring web-requests to use a proxy
I think there is something preventing CurlAsyncHTTPClient from accepting defaults.
AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient", defaults=defaults)
didn't work from config.yaml.
I had to do this hack:
extraConfig: |
  import logging
  import pycurl
  from tornado.httpclient import HTTPRequest

  def configure_proxy(curl):
      logging.error(curl.getinfo(pycurl.EFFECTIVE_URL))
      # we only want google oauth to use the proxy
      if "google" in curl.getinfo(pycurl.EFFECTIVE_URL):
          logging.error("adding proxy")
          curl.setopt(pycurl.PROXY, "proxy.example.com")
          curl.setopt(pycurl.PROXYPORT, 8080)

  # never do this
  HTTPRequest._DEFAULTS['prepare_curl_callback'] = configure_proxy
I don't know why this doesn't work:
import certifi
from tornado.httpclient import AsyncHTTPClient
defaults2 = dict(ca_certs=certifi.where())
defaults2['proxy_host'] = 'proxy.example.com'
defaults2['proxy_port'] = 8080
AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient", defaults=defaults2)
Actually, I do have one guess. Maybe this overwrites the defaults?
https://github.com/jupyterhub/jupyterhub/blob/8437f47f361aab42d11801703145ababa7372538/jupyterhub/app.py#L1622
I'm having the same issue with Azure AD. Your hack worked for me as well. It would be nice to have this fixed though.
import logging
import os
import subprocess
import sys
from tornado.httpclient import HTTPRequest

os.environ['PYCURL_SSL_LIBRARY'] = 'nss'
subprocess.call([sys.executable, '-m', 'pip', 'install', '--compile', '--proxy', 'http://www.xxx.yyy.zzz:3128', 'pycurl'])
import pycurl
#defaults = {'proxy_host':'www.xxx.yyy.zzz', 'proxy_port':3128, 'request_timeout':300, 'connect_timeout':60}
#AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")

def configure_proxy(curl):
    logging.error(curl.getinfo(pycurl.EFFECTIVE_URL))
    # we only want Azure AD (microsoftonline) oauth to use the proxy
    if "microsoftonline" in curl.getinfo(pycurl.EFFECTIVE_URL):
        logging.error("adding proxy")
        curl.setopt(pycurl.PROXY, "www.xxx.yyy.zzz")
        curl.setopt(pycurl.PROXYPORT, 3128)

# never do this
HTTPRequest._DEFAULTS['prepare_curl_callback'] = configure_proxy
Hmmm this is quite advanced and I'm not following things so well. There is an issue I think may be related, perhaps you could have a look at that issue @dtandersen @zneudl ?
This issue regards the use of the google oauthenticator from a specific JupyterHub deployment (Zero-to-jupyterhub-k8s, the helm chart): https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1185
Regarding this issue, I'd love to learn more about the background that I don't grasp.
One more data point: I think extraConfig is loaded and executed after the initial jupyterhub_config.py, the one shipped as part of the z2jh hub Docker image, has executed.
Related: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/69ec17c75c950ab11cd09a1315a7f2e93140811f/images/hub/jupyterhub_config.py#L11-L15
Related: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/69ec17c75c950ab11cd09a1315a7f2e93140811f/images/hub/jupyterhub_config.py#L467-L469
Hi. Would be so cool if everybody simply respected the "http_proxy" environment variables...
Here is my hack to make the hub happy with our proxy (our GitLab instance is behind our proxy). It also supports no_proxy with "*", so that we can have finer proxy tuning:
hub:
  extraEnv:
    GITLAB_HOST: "http://external.gitlab.server"
    http_proxy: http://internalproxy.ourcompany.fr:123
    HTTP_PROXY: http://internalproxy.ourcompany.fr:123
    https_proxy: http://internalproxy.ourcompany.fr:123
    HTTPS_PROXY: http://internalproxy.ourcompany.fr:123
    no_proxy: localhost,127.0.0.1,*.ourcompany.fr,10.*,localdomain,cluster.local
    NO_PROXY: localhost,127.0.0.1,*.ourcompany.fr,10.*,localdomain,cluster.local
  extraConfig: |
    # HACK: consume HTTP?_PROXY and NO_PROXY environment variables
    # so Hub can connect to external Gitlab.
    # https://github.com/jupyterhub/oauthenticator/issues/217
    import pycurl
    import os
    import logging
    from tornado.httpclient import HTTPRequest
    from urllib.parse import urlparse
    from fnmatch import fnmatch

    def get_proxies_for_url(url):
        http_proxy = os.environ.get("HTTP_PROXY", os.environ.get("http_proxy"))
        https_proxy = os.environ.get("HTTPS_PROXY", os.environ.get("https_proxy"))
        no_proxy = os.environ.get("NO_PROXY", os.environ.get("no_proxy"))
        p = urlparse(url)
        _userpass, _, hostport = p.netloc.rpartition("@")
        url_hostname, _, _port = hostport.partition(":")
        proxies = {}
        if http_proxy:
            proxies["http"] = http_proxy
        if https_proxy:
            proxies["https"] = https_proxy
        if no_proxy:
            for hostname in no_proxy.split(","):
                pattern = hostname.strip()
                suffix = pattern.lstrip("*")
                # Support "*.server.com" and "10.*"
                if fnmatch(url_hostname, pattern):
                    proxies = {}
                    break
                # Support ".server.com": any subdomain, plus the bare domain
                elif suffix.startswith(".") and (
                    url_hostname.endswith(suffix) or url_hostname == suffix[1:]
                ):
                    proxies = {}
                    break
                # TODO: support network mask: 10.0.0.0/8
        return proxies

    def configure_proxy(curl):
        url = curl.getinfo(pycurl.EFFECTIVE_URL)
        logging.error("URL: {0}".format(url))
        proxies = get_proxies_for_url(url)
        # prefer the HTTPS proxy, fall back to the HTTP one if only that is set
        proxy = proxies.get("https") or proxies.get("http")
        if proxy:
            host, _, port = proxy.rpartition(":")
            logging.error("adding proxy: {0}:{1}".format(host, port))
            curl.setopt(pycurl.PROXY, host)
            if port:
                curl.setopt(pycurl.PROXYPORT, int(port))

    # never do this
    HTTPRequest._DEFAULTS['prepare_curl_callback'] = configure_proxy
this feels much like hacking...
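The hack above leaves network masks such as 10.0.0.0/8 as a TODO. Below is a minimal sketch of how a NO_PROXY-style matcher could cover that case using only the standard library. The function name `bypasses_proxy` is hypothetical and not part of pycurl, tornado, or OAuthenticator, and the matching rules are an assumption modelled on the entries used in this thread (globs, leading-dot suffixes, CIDR blocks), not a standard.

```python
import ipaddress
from fnmatch import fnmatchcase


def bypasses_proxy(hostname, no_proxy):
    """Return True if hostname matches any entry of a NO_PROXY-style list.

    Hypothetical helper: the rules below are an assumption, not a standard.
    """
    hostname = hostname.lower()
    for raw in no_proxy.split(","):
        entry = raw.strip().lower()
        if not entry:
            continue
        # CIDR entries such as 10.0.0.0/8 only apply to literal IP hosts.
        if "/" in entry:
            try:
                network = ipaddress.ip_network(entry, strict=False)
                if ipaddress.ip_address(hostname) in network:
                    return True
            except ValueError:
                pass  # entry is not a network, or hostname is not an IP
            continue
        # Glob entries such as *.example.com or 10.*
        if fnmatchcase(hostname, entry):
            return True
        # Suffix entries such as .example.com (also matches example.com itself)
        if entry.startswith(".") and (
            hostname.endswith(entry) or hostname == entry[1:]
        ):
            return True
    return False
```

With something like this, `get_proxies_for_url` could return an empty dict whenever `bypasses_proxy(url_hostname, no_proxy)` is true, instead of looping over the patterns itself.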
It seems the issue is that tornado doesn't respect the http_proxy environment variable for CurlAsyncHTTPClient: https://github.com/tornadoweb/tornado/issues/754
The conclusion of the tornado issue is:
"OK. In that case you must tell tornado that you want to use a proxy by setting the proxy_host and proxy_port arguments."
Even for simple_httpclient, I don't know if it supports no_proxy
:(
Looks like hacking into pycurl is the only solution for now...
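Since tornado wants proxy_host and proxy_port rather than a proxy URL, one building block is a translation from an http_proxy-style value into those arguments. The sketch below assumes the value is either a full URL or a bare host:port. The helper name `proxy_kwargs_from_url` and the fallback port 8080 are made up for illustration, while the `proxy_*` keys are the argument names documented for tornado's HTTPRequest.

```python
from urllib.parse import urlsplit


def proxy_kwargs_from_url(proxy_url, default_port=8080):
    """Translate a proxy URL into HTTPRequest-style proxy_* keyword arguments.

    Hypothetical helper; default_port is an assumption used only when the
    proxy URL does not carry a port.
    """
    # Accept bare "host:port" values by assuming an http:// scheme.
    if "://" not in proxy_url:
        proxy_url = "http://" + proxy_url
    parts = urlsplit(proxy_url)
    kwargs = {
        "proxy_host": parts.hostname,
        "proxy_port": parts.port or default_port,
    }
    if parts.username:
        kwargs["proxy_username"] = parts.username
        kwargs["proxy_password"] = parts.password or ""
    return kwargs
```

The resulting dict could then be unpacked into an individual request, for example `HTTPRequest(url, **proxy_kwargs_from_url(os.environ["HTTPS_PROXY"]))`. Per the tornado documentation, the proxy arguments only take effect with curl_httpclient.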
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/unable-to-get-git-hub-oauth-work-on-a-jupyterhub-server-which-is-behind-a-proxy/6334/5
https://discourse.jupyter.org/t/unable-to-get-git-hub-oauth-work-on-a-jupyterhub-server-which-is-behind-a-proxy/6334/6
I think the best option may be to explicitly set proxy_host (and proxy_port) for any Tornado requests made by OAuthenticator.
https://www.tornadoweb.org/en/stable/httpclient.html#tornado.httpclient.HTTPRequest
Setting this globally may lead to problems: with Z2JH, for example, you'd only want to use the proxy for external requests and not for connections to other K8s servers.
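As a rough sketch of the per-request idea, the decision could be made by hostname, so that only requests to the external OAuth provider get proxy arguments and in-cluster traffic is left alone. Everything named here is hypothetical: `request_kwargs_for`, the proxy address, and the suffix list are illustrative values, not OAuthenticator settings.

```python
from urllib.parse import urlsplit

# Hypothetical values for illustration only.
PROXY_KWARGS = {"proxy_host": "proxy.example.com", "proxy_port": 8080}
PROXIED_SUFFIXES = ("google.com", "googleapis.com", "microsoftonline.com")


def request_kwargs_for(url, proxy_kwargs=PROXY_KWARGS, suffixes=PROXIED_SUFFIXES):
    """Return proxy_* kwargs for external OAuth hosts, and {} for anything else."""
    host = (urlsplit(url).hostname or "").lower()
    for suffix in suffixes:
        # Match the domain itself or any subdomain, but not look-alikes
        # such as "evilgoogle.com".
        if host == suffix or host.endswith("." + suffix):
            return dict(proxy_kwargs)  # copy so callers cannot mutate the default
    return {}
```

A caller would build each request as `HTTPRequest(url, **request_kwargs_for(url))`, which keeps the proxy choice next to the request instead of in a global default.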