configurable-http-proxy

Large put request fails with connection reset

Open barperez111 opened this issue 4 years ago • 9 comments

Bug description

Hi! I've encountered something that feels like a bug. It is related to what is discussed here.

Basically, we run z2jh on EKS. When saving large notebooks, the corresponding HTTP PUT request sent by the content manager fails with a connection reset after roughly 1 minute. To check things, we ran a standalone notebook on the same cluster. With it, large files are saved just fine (after something like 2 minutes). That led me to think the issue is related to CHP, so I put a CHP in front of the standalone notebook, and the issue appeared (connection reset after roughly 1 minute).
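The failure described above can be timed outside the notebook UI with a small probe. The following is only a sketch under stated assumptions: `timed_put` is a hypothetical helper, not part of Jupyter or CHP, and the URL in the usage note is a placeholder. It sends a large PUT body in chunks, optionally pausing between chunks to stretch the upload past the one-minute mark, and reports either the HTTP status or the name of the exception that ended the request.

```python
import http.client
import time
from urllib.parse import urlsplit


def timed_put(url, size_bytes, chunk_size=64 * 1024, delay_s=0.0):
    """PUT size_bytes of filler data to url and time the request.

    Returns (outcome, elapsed_seconds). outcome is the HTTP status code if a
    response arrived, otherwise the exception class name
    (for example 'ConnectionResetError').
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=600)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=600)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    def body():
        # Yield the payload chunk by chunk so the upload can be slowed down.
        sent = 0
        while sent < size_bytes:
            n = min(chunk_size, size_bytes - sent)
            yield b"x" * n
            sent += n
            if delay_s:
                time.sleep(delay_s)

    start = time.monotonic()
    try:
        # An explicit Content-Length keeps http.client from switching to
        # chunked transfer encoding for the iterable body.
        conn.request("PUT", path, body=body(),
                     headers={"Content-Length": str(size_bytes)})
        outcome = conn.getresponse().status
    except (OSError, http.client.HTTPException) as exc:
        outcome = type(exc).__name__
    finally:
        conn.close()
    return outcome, time.monotonic() - start
```

Running the same call once against the standalone notebook and once through CHP (for example `timed_put("http://<host>/<path>", 50_000_000, delay_s=0.1)`, with host, path, and any authentication being deployment-specific) would show whether the reset happens after a fixed wall-clock time or depends on the payload size.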

I tried setting the --timeout and --proxy-timeout parameters, but that didn't help. Debug-level logging didn't help me either.

Any thoughts? Are we sure the timeout parameters are working correctly? I'd appreciate any help whatsoever. Thanks!

barperez111 avatar May 07 '21 06:05 barperez111

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members to contribute more effectively. You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

welcome[bot] avatar May 07 '21 06:05 welcome[bot]

Any help here please?

barperez111 avatar May 19 '21 06:05 barperez111

Sorry for leaving you hanging. Can you share the logs from CHP and the backend server (ideally with --debug) when this happens and the command-line options you use? It should be something like:

configurable-http-proxy --log-level debug --timeout 300000 --proxy-timeout 300000 # 5 minute timeout for each

There's probably a configuration parameter in node-http-proxy we need to expose. Including the traceback from CHP if there is one would help pin down what's needed, whether it's in the client configuration or server.

minrk avatar May 25 '21 10:05 minrk

Hi! Sorry for the huge delay. I ran with the suggested configuration but there seems to be nothing interesting in the logs (I made sure chp ran in debug mode).

The logs show the proxying of the content/ API requests, but I couldn't find any error logs.

  1. Maybe I'm missing a parameter that enables error logs?
  2. Can you think of a parameter that is not exposed and may be relevant to the issue?

Thanx a lot!

barperez111 avatar Jun 24 '21 08:06 barperez111

I'm unsure about configuration to solve your issue in CHP, but I wonder what evidence there is that CHP is to blame compared to another part in the network chain.

Anyone who can reproduce this on another k8s cluster setup, such as GKE or AKS, would reduce the chances that it is the configuration of an AWS component managing incoming traffic.

Perhaps you can write down more details about your entire setup? How is network traffic flowing? When saving a notebook, the JupyterHub Helm chart lets traffic go from (the autohttps pod running Traefik ->) the proxy pod running CHP -> the user pod. Any of those could be at fault, but there are also components outside the control of the JupyterHub Helm chart that could be at fault. Do we have a way to pinpoint which component is causing the trouble?

consideRatio avatar Jun 24 '21 08:06 consideRatio

What made us suspect CHP is this: when we added a pod (a classic notebook, separate from the JupyterHub Helm chart) to our cluster, saving large notebooks worked (it took more than 2 minutes, but it worked). When we added CHP as a proxy in front of it, the issue reappeared.

barperez111 avatar Jun 24 '21 09:06 barperez111

Ah, that is a great test to pinpoint it to CHP @barperez111! Is it correct then that the network traffic took the same path in both cases, using the same surrounding network infrastructure, except that in one case it went directly to a user pod instead of through CHP to the user pod?

consideRatio avatar Jun 24 '21 10:06 consideRatio

Yes I believe that is correct.

barperez111 avatar Jun 30 '21 07:06 barperez111

Hi @barperez111, I am running into a similar problem while trying to upload files >10 MB. I have read about modifying the Tornado websocket, body size, and memory settings in the Jupyter Notebook configuration, but I am still facing the same issue when uploading large files.

One question: how did you modify the timeout and proxy-timeout parameters for configurable-http-proxy? Was this modification made via the JupyterHub config file?

thanks for your help.

oharach1 avatar Aug 09 '21 14:08 oharach1
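On the question of where the timeout flags can be set: when JupyterHub starts CHP itself (a non-Kubernetes install), the proxy command line can be overridden in `jupyterhub_config.py` through the `ConfigurableHTTPProxy.command` setting. The sketch below is an illustration, not a confirmed fix for this issue; `chp_command` is a hypothetical helper, and the values simply mirror the five-minute example given earlier in the thread.

```python
def chp_command(timeout_ms=300_000, proxy_timeout_ms=300_000, log_level="debug"):
    """Build the CHP argument list with the timeout flags discussed above.

    Both timeouts are given in milliseconds, as in the example command
    earlier in the thread.
    """
    return [
        "configurable-http-proxy",
        "--log-level", log_level,
        "--timeout", str(timeout_ms),
        "--proxy-timeout", str(proxy_timeout_ms),
    ]


# In jupyterhub_config.py (assumption: the Hub launches CHP itself):
# c.ConfigurableHTTPProxy.command = chp_command()
```

With z2jh the proxy runs in its own pod started by the Helm chart, so this setting would not take effect there; as far as I know, the chart's `proxy.chp.extraCommandLineFlags` value is the place to pass extra flags in that setup.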