configurable-http-proxy icon indicating copy to clipboard operation
configurable-http-proxy copied to clipboard

[WIP] If multiple x-forwarded-proto use first

Open manics opened this issue 5 years ago • 7 comments

See

  • https://github.com/tornadoweb/tornado/pull/2162
  • https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/1700#issuecomment-709555566

@consideRatio do you reckon this would work?

manics avatar Oct 15 '20 20:10 manics

Thanks for submitting your first pull request! You are awesome! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly. welcome You can meet the other Jovyans by joining our Discourse forum. There is also a intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

welcome[bot] avatar Oct 15 '20 20:10 welcome[bot]

Tornado inspection

Currently tornado cares first and foremost about the x-scheme header, and secondly about the x-forwarded-proto header, but if the x-forwarded-proto header isn't just a single entry of http or https, it will be silently ignored. This logic can be inspected here.

My conclusion

@manics, I think from this it follows that we must configure CHP to pass only a single value in x-forwarded-proto to be tornado compatible, and that value in my mind, should be the leftmost entry, which is exactly what you do in this PR. So, in my mind, this is correct.

Verification from Min

@minrk can you help us review this change by answering the following questions?

  • Do you agree JupyterHub's tornado should configure itself using the protocol as requested by the client/browser?
  • Do you agree with me and @manics that the leftmost entry in x-forwarded-proto (assuming it is a comma separated list) is representing the client/browser requested protocol? My belief stems from wikipedia of x-forwarded-for and feels a bit shakey as its not x-forwarded-proto specifically.

consideRatio avatar Oct 16 '20 13:10 consideRatio

Ideally this would be handled in JupyterHub since it's the final consumer of the header, but after looking at https://github.com/tornadoweb/tornado/pull/2162 I can't see an obvious way to do that.

Note there are some subtle issues with whether you can trust the header, the Django docs provide a good explanation: https://docs.djangoproject.com/en/3.1/ref/settings/#secure-proxy-ssl-header

However that's separate to this issue where we're deciding which part of the header to use rather than whether to trust it or not.

manics avatar Oct 16 '20 15:10 manics

This is creating a trust issue by discarding information. I'm not sure we want to do it this way. If we want to change what information is used in tornado, we should probably do it in the tornado app.

minrk avatar Oct 22 '20 09:10 minrk

proxy headers are hard because there's an arbitrary number of layers of proxy headers and you have to decide how many layers you want to trust. E.g. when you have:

  • external proxy (e.g. malicious proxy)
  • cloud load balancer
  • ingress-controller
  • chp

we want to make sure to trust forwarding headers from everything but the proxy outside our own application. The most conservative approach is to trust only the last hop, which I think is what tornado does, but what we care about most of the time is what it looks like to the browser, which is the first entry.

Tornado I think lets you actually configure to some degree which things are trusted, but it's hard to do correctly in general.

This is why I think it was a huge mistake by the IETF when they added the Origin header to only include it on some cross-origin requests and not on same-origin requests, because it means we can't reliably use it, as cases where it's missing are not conclusively cases where it should be allowed.

minrk avatar Oct 22 '20 09:10 minrk

I'm not particularly happy with this solution either, ideally there'd be a way to control how Tornado handles the header, or provide an option to pass it through unchanged to JupyterHub.

I assumed that since you couldn't resolve the problem with https://github.com/tornadoweb/tornado/pull/2162 it was unlikely I'd be able to do anything.... If you can see a way to do it on the JupyterHub side I'm more than happy to go along with that.

An alternative might be the django route where you can configure your first proxy to set a header of your choice instead of x-forwarded-proto or x-scheme, though that's more work for a user.

manics avatar Oct 22 '20 20:10 manics

I'm trying to figure out whether JupyterHub.trusted_downstream_ips should provide a solution instead of this: https://discourse.jupyter.org/t/recommended-use-of-x-forwarded-proto-with-nginx/8035/3

manics avatar Feb 23 '21 20:02 manics