configurable-http-proxy icon indicating copy to clipboard operation
configurable-http-proxy copied to clipboard

Release 5.0.0, dropping support for old node version - and more?

Open consideRatio opened this issue 1 year ago • 14 comments

I'd like to drop support for node 10 that remains as various software now is being held back, making it a question of time before a security fix is held back as well in an upstream package.

If we drop support and release v5 of this project, is there more things to do as part of the release?

Currently known v5 checklist

  • #384
  • #411

consideRatio avatar Aug 14 '22 09:08 consideRatio

nodejs 10 is the version in ubuntu 20.04, the latest LTS ubuntu until a couple of months ago.

I'm not sure delaying these dependency updates causes any problems. Does it?

minrk avatar Aug 15 '22 09:08 minrk

Oh, nodejs the apt package available for ubuntu 20.04 is indeed still at node 10... Yikes........ I though that was 18.04.

I wanted to avoid needing to stress to get a critical security patch released that required a modern version of some dependency no longer supporting node 10 or end up debugging an issue that ends up stemming from a bug fixed in a version we haven't been able to install etc.

I'm sure delaying will cause trouble if we do it forever, the question is when and if how long we wait.

Is it your understanding that if we release 5.0.0, we can't require nodejs of a version higher than 10? It perhaps is a question that has different answers depending on if we install this package with conda or pip as well I figure - I presume installing it with pip won't make us able to require a modern node version.

consideRatio avatar Aug 15 '22 10:08 consideRatio

Ther aren't that many new developments in this repo, so we could bump to 5.0.0 (and maybe node 14 or 16?) but ensure JupyterHub remains compatible with 4.x in the JupyterHub CI tests?

manics avatar Aug 15 '22 11:08 manics

In my experience, it's more the updates that cause problems than the lack of them. If node-http-proxy needed to drop an engine support, that would be a very different story. A major version bump to get nothing in particular from command-line parsing doesn't make a lot of sense to me as a major user-facing breaking change. If anything, it makes me want to vendor command-line parsing using only the standard library to avoid the dependency.

I'm okay making 5.0 with this. I'm also okay, honestly, never updating commander ever. Forcing updates with tools like dependabot seems to miss much of the whole point of nodejs' isolated envs - if you don't need an update to a dependency, what's the reason to update?

ensure JupyterHub remains compatible with 4.x in the JupyterHub CI tests?

I think JupyterHub's compatible back to CHP ~2.0 or even 1.0. I don't think we'll ~ever accept a patch that breaks JupyterHub, but we can make sure this stays covered.

minrk avatar Aug 15 '22 13:08 minrk

Hmmmm, I note that https://github.com/http-party/node-http-proxy isn't being maintained. Last commit was 2020, and these were the last merged PRs. So it is not tested against node 14 or 16 for example.

image

consideRatio avatar Sep 13 '22 06:09 consideRatio

The organisation owning the repository is active https://github.com/http-party but the maintenance status is unclear:

  • https://github.com/http-party/node-http-proxy/issues/1354
  • https://github.com/http-party/node-http-proxy/issues/1588

manics avatar Sep 17 '22 13:09 manics

We can shift our default to TraefikProxy. What do folks think?

I never managed to finish migrating z2jh to traefik, but that's perhaps because I tried to do too many things at once (I think it was traefik's incomplete support for ACME+etcd at the same time that bogged me down).

Maybe a more incremental approach where we just swap-in traefik for chp 1:1 (this will mean 2 traefiks!) will work better, then we can add things like multiple replicas, and remove the extra traefik in subsequent PRs.

minrk avatar Sep 19 '22 07:09 minrk

One disadvantage of Traefik is it isn't in conda-forge. How difficult would it be to package it?

manics avatar Sep 22 '22 21:09 manics

One disadvantage of Traefik is it isn't in conda-forge

It's possible to package the go binary of traefik for conda-forge (atleast in theory!), but maybe this could be left to the user, as there are multiple ways one could be running traefik. Or it was just about jupyterhub-traefik-proxy?

MridulS avatar Jan 30 '23 16:01 MridulS

The big disadvantage of traefik right now is that culling unauthenticated Binder relies on activity tracking in CHP, which traefik doesn't have. For 'real' JupyterHub, this doesn't matter because jupyterhub-singleuser implements this activity tracking (traefik is one of the main reasons this was added). But If you run unauthenticated Binder with traefik, you have to disable idle culling, which makes it pretty much a no-go.

The long-term fix is to implement a proxy sidecar for user pods to make non-jupyterhub servers jupyterhub-compatible (a standalone, generic jupyterhub-singleuser proxy), like https://github.com/ideonate/jhsingle-native-proxy.

Packaging traefik on conda-forge shouldn't be a huge deal. I've never done a go package there, but I can have a look. There are others to learn from.

minrk avatar Feb 06 '23 09:02 minrk

I'm maintaining a couple of go packages on conda-forge:

  • https://github.com/conda-forge/stern-feedstock/
  • https://github.com/conda-forge/steampipe-feedstock/

Updates to the above packages are currently blocked due to requiring go 1.19, which hasn't yet been built in conda-forge due to some build problems in the go-feedstock: https://github.com/conda-forge/go-feedstock/pulls

Traefik 2 requires 1.19: https://github.com/traefik/traefik/blob/v2.9.6/go.mod#L3

manics avatar Feb 06 '23 09:02 manics

I've been investigating alternatives to http-proxy, and the main candidates are fast-proxy and http2-proxy. fast-proxy appears to be maintained, but is extremely inactive. It doesn't do websockets, so that's a no-go. fast-gateway is a higher-level package that does do websockets, but appears to make websocket and http mutually exclusive on a given path and statically routed, which doesn't fit our case (the configurable part!).

http2-proxy appears to also be mostly unmaintained, without a commit in two years. But it's also tiny and has zero dependencies (two files, of a few hundred lines). We could easily vendor http2-proxy, and lose the dependency altogether (CHP's only dependencies would be argument parsing, logging, and prometheus metrics). But that mostly changes the http proxy implementation from unmaintained by someone else to ~unmaintained by us.

I've also been looking at traefik, and I think we may be able to leverage traefik metrics to get a good-enough version of activity via checking for deltas in one or more router metrics. I don't know how costly this would be for large numbers of routes, but probably comparable to what we already have to do for CHP.

Given the state of things on node, I think we should:

  1. finish up the traefik v2 compatibility update https://github.com/jupyterhub/traefik-proxy/issues/97,
  2. explore proxy activity via metrics in traefik (opt-in because it's only necessary for unauthenticated BinderHub)
  3. drop-in traefik for CHP in z2jh (as suggested above; don't try to merge into a single traefik in one go, which is what killed https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1162)
  4. ultimately deprecate CHP as unmaintained (like all the other http proxies on node!)

I can also look into updating to http2-proxy. It should be a small change, since the architecture is pretty much the same as http-proxy. I don't know if that will solve the socket leaks or not (I'm guessing this could have been caused by a node-engine update in the image?). But who knows what other bugs/edge-cases we'd be inheriting. If we don't vendor the package, we'll still be out of luck for bugfixes until we vendor a copy. I'd be okay with depending on it until we have a need for a fix, and vendoring at that point.

minrk avatar Feb 06 '23 11:02 minrk

http2-proxy looks simple enough that if we want to keep CHP then vendoring and maintaining it ourselves, including deleting any unneeded functionality, sounds reasonable, whether as a stop gap or longer term.

https://github.com/corridor/configurable-http-proxy is another option, though when I tried it a while back it had problems with the websocket connections used by https://github.com/jupyterhub/jupyter-remote-desktop-proxy/ , and there's a report of a failure with jupyter-vscode-proxy. Have you tested http2-proxy with any of those?

manics avatar Feb 06 '23 12:02 manics

Hi there, I'm the maintainer of https://github.com/corridor/configurable-http-proxy If someone could help me reproduce the issue I am happy to fix and maintain it

AbdealiLoKo avatar Sep 07 '23 03:09 AbdealiLoKo