
Client idles and stops serving requests

Open Saturnix opened this issue 4 years ago • 18 comments

(Thanks for making this awesome piece of software!)

I have a little problem: I run my media server on Windows, but I'm behind NAT. I also have a Linux VPS, which I use to expose my local machine thanks to your script.

Linux runs in server mode, Windows runs in client mode.

I run the client by simply opening a cmd.exe tab and running the script as in the example in the docs:

Python27\python.exe natsrv.py --mode client --secret pass123 --local 192.168.1.117:8096 --admin example.com:80

However, if I leave cmd.exe open for a while, example.com:80 eventually stops serving requests. There are no error messages or anything.

I know the error is on the client because if I simply close cmd.exe, reopen it and rerun the client, it starts working again.

This happens even after just a few minutes of inactivity (less than 20mins).

Have any idea what might be causing this?

Many thanks again!

Saturnix avatar Mar 04 '20 02:03 Saturnix

thanks for your report. have you tried running the client on a linux machine already? i wonder whether this is a bug in the windows version of python. trying python3 might also be an option.

rofl0r avatar Mar 15 '20 15:03 rofl0r

Hi, thanks for the answer. Still have to try running the client on Linux: will try and let you know here.

Python 3 doesn’t work, I thought it wasn’t supported so I downgraded to 2.7. Will post you the precise error I get on Python 3 on a separate issue asap.

EDIT: done. Opened a separate issue for Win.

Saturnix avatar Mar 15 '20 16:03 Saturnix

@rofl0r Same issue for me. I am running both client and server on linux.

meetcshah19 avatar Oct 06 '20 20:10 meetcshah19

@rofl0r Same issue for me. I am running both client and server on linux.

sigh. maybe using python for this project was a bad choice. anyway, maybe you can help debugging to find out what's wrong. for example does netstat output give some hint about the state of connections ?

rofl0r avatar Oct 06 '20 22:10 rofl0r

Yup, that's what I have been trying. It looks like once the socket connection between the client and the server breaks (maybe due to unstable internet), it is unable to establish the connection again.

meetcshah19 avatar Oct 06 '20 23:10 meetcshah19

the way the program currently works is:

  0. it establishes the control channel connection
  1. it establishes an idle data connection to the server
  2. it waits for a request
  3. as soon as the request is served, a new conn is instantiated (i.e. goto 1)

this is done so there's no latency for establishing the conn after a client connects. can you figure out whether the control connection or the data connection gets interrupted?
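
The client side of steps 1 to 3 might look roughly like the sketch below. This is not the actual natsrv.py code: `pipe` and `serve_one` are hypothetical names, the control channel is left out, and Python 3 is assumed. It only illustrates why a pre-established data connection that dies silently leaves the client blocked in its first `recv`.

```python
import socket
import threading

def pipe(src, dst):
    """Copy bytes from src to dst until src reports EOF, then half-close dst."""
    while True:
        chunk = src.recv(4096)
        if not chunk:
            break
        dst.sendall(chunk)
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # dst is already gone

def serve_one(data_conn, local_addr):
    """Wait on an idle data connection, then relay one request to the local service.

    Returns False if the server closed the idle connection before any request.
    """
    first = data_conn.recv(4096)  # blocks for as long as the connection is idle
    if not first:
        return False
    local = socket.create_connection(local_addr)
    local.sendall(first)
    back = threading.Thread(target=pipe, args=(local, data_conn), daemon=True)
    back.start()
    pipe(data_conn, local)  # forward the rest of the request
    back.join()
    local.close()
    return True
```

A caller would loop: open a data connection, call `serve_one`, and go back to step 1. If a middlebox drops the idle connection without sending FIN or RST, the `recv` in `serve_one` never returns, which would match the symptom reported here.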

rofl0r avatar Oct 07 '20 00:10 rofl0r

It seems like the data connection is getting interrupted.

meetcshah19 avatar Oct 07 '20 06:10 meetcshah19

Same issue here between two enterprise ethernet connections (MIT local and AWS lightsail remote) so I doubt it's anything to do with the actual internet connection. Maybe some sort of keep-alive isn't being properly set?
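
For anyone who wants to test the keep-alive theory, TCP keepalive can be switched on per socket roughly as follows. This is a sketch, not a patch against natsrv.py: `enable_keepalive` is a hypothetical helper, the timing values are arbitrary, and the `TCP_KEEP*` constants are platform-dependent (hence the `hasattr` checks).

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Ask the kernel to probe an idle TCP connection and fail it if the peer is gone."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options below exist on Linux; other platforms expose only some of them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)       # seconds idle before first probe
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)  # seconds between probes
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)       # failed probes before giving up
    return sock
```

With these options set, a blocking `recv` on a connection whose peer has vanished eventually fails with an error instead of hanging indefinitely.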

ariririos avatar Mar 10 '21 04:03 ariririos

that's possible. maybe the "prepare a connection in advance" thing wasn't so smart after all. do you feel capable of changing the code so the data connection is only done after a client connects?

rofl0r avatar Mar 10 '21 11:03 rofl0r

that's possible. maybe the "prepare a connection in advance" thing wasn't so smart after all. do you feel capable of changing the code so the data connection is only done after a client connects?

Yes I think so! I'll take a look this weekend.

ariririos avatar Mar 12 '21 12:03 ariririos

that's possible. maybe the "prepare a connection in advance" thing wasn't so smart after all. do you feel capable of changing the code so the data connection is only done after a client connects?

The socket code here is pretty far above what I'm familiar with, and I don't want to break anything, so I don't think I'll be able to make the necessary edits to resolve this issue. Sorry about this!

ariririos avatar Mar 15 '21 04:03 ariririos

Got a working solution for the problem of the idle client connection dropping, using a systemd unit and a watchdog script.

Client systemd unit (for example in /etc/systemd/system/nat-tunnel-01.service)

[Unit]
Description=nat-tunnel for ssh access
After=network.target

[Service]
Type=simple
WatchdogSec=20
NotifyAccess=all
Restart=always
RestartSec=60
Environment=WATCHDOG_USEC=2000000
ExecStart=/usr/bin/python3 /root/nat-tunnel/natsrv.py --mode client --secret $MYSECRET --local localhost:22 --admin $MYVPSHOST:ADMINPORT
ExecStartPost=/root/nat-tunnel/watchdog.sh

[Install]
WantedBy=multi-user.target

A similar unit could be used on the server side, but without the watchdog-related parameters (ExecStartPost, WatchdogSec, Environment, NotifyAccess).

watchdog.sh contents

#!/usr/bin/env bash

# src: https://www.medo64.com/2019/01/systemd-watchdog-for-any-service/

watchdog() {
    while(true); do

        # src: https://stackoverflow.com/a/19866239
        TIMEOUT=`timeout 1 bash -c 'cat < /dev/null > /dev/tcp/$MYVPSHOST/$PUBLICPORT'`
        if [ "$?" -eq 0 ]; then
            #echo yeah
            /bin/systemd-notify WATCHDOG=1;
            sleep $(($WATCHDOG_USEC / 2000000))
        else
            #echo no
            sleep 1
        fi
    done
}

watchdog &

Now it works reliably for me, on restart or any other occasion - the service is always available in my case. Maybe this could be somewhere in docs / readme for this project.
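
The connectivity probe in watchdog.sh relies on bash's /dev/tcp feature. On systems without it, the same check could be written in Python 3 along these lines; `port_is_open` is a hypothetical helper, not something shipped with nat-tunnel.

```python
import socket

def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Like the bash version, this only shows that the public port accepts connections. It does not prove that a request is actually relayed through the tunnel.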

schtritoff avatar Nov 17 '21 23:11 schtritoff

Maybe this could be somewhere in docs / readme for this project.

your solution is not universal - it depends on systemd, which i despise. i'd rather find out what's going wrong and fix the bug than documenting bug workarounds.

rofl0r avatar Nov 18 '21 03:11 rofl0r

Regarding the systemd workaround: it works for me but not for everyone, I agree. Most popular Linux distros ship systemd out of the box, so this comment might be of value to some users.

On topic: it could be that some router (local gateway or ISP) is dropping or closing connections because there is no activity. I didn't test the local (same subnet) client-server variant. If it works in the same subnet, it could be that some third party is closing inactive connections. Maybe some heartbeat data flow is needed to keep the connection alive.

schtritoff avatar Nov 24 '21 12:11 schtritoff

Maybe some heartbeat data flow is needed to keep connection alive.

this could only mitigate the problem to some degree. the fundamental issue here is that an existing connection is rendered nonfunctional (disconnected?) but the code fails to detect that event properly.

having an strace dump available for when this happens would help a lot in figuring out what happens effectively.
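
As a point of reference for what detecting that event can look like at the socket level, here is a small sketch. `peer_closed` is a hypothetical helper and not part of natsrv.py. It reports a close or reset that the kernel already knows about, without consuming any pending data.

```python
import select
import socket

def peer_closed(sock, timeout=0.0):
    """Return True if the peer has closed or reset the connection."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False  # no data and no EOF: the kernel still considers the connection open
    try:
        # MSG_PEEK looks at pending data without removing it; b"" means orderly EOF.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except OSError:
        return True  # e.g. connection reset by peer
```

Note the limitation: if a NAT device drops its mapping silently, no FIN or RST arrives and this check keeps returning False. Only a send, a keepalive probe or an application-level heartbeat exposes that case.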

rofl0r avatar Nov 25 '21 15:11 rofl0r

I was having the same issue while serving from a Linux machine behind a NAT through a remote AWS Linux machine. Everything would work flawlessly for hours, until something went wrong with the connection and it all stopped working.

I couldn't definitively pin down what was going on, but to me it looked like the ISP or NAT was dropping all connections after I attempted to make lots of requests as required by my server.

I reworked the core part of this script so that it proxies all of the data through a single connection that is maintained between the two machines. Each end then opens/closes connections as needed to keep up the external appearance of connections being maintained between the server and client. The changes for this are in my fork here. The script should otherwise have completely identical CLI arguments and behavior.
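
To illustrate the general idea of carrying several logical connections over one maintained connection, each chunk can be tagged with a channel id and a length. The framing below is an assumption made for illustration only; it is not the wire format used in the fork, and `encode_frame` and `decode_frames` are hypothetical names.

```python
import struct

# Hypothetical header: 4-byte channel id, 2-byte payload length, network byte order.
HEADER = struct.Struct("!IH")

def encode_frame(channel, payload):
    """Prefix payload (at most 65535 bytes) with its channel id and length."""
    return HEADER.pack(channel, len(payload)) + payload

def decode_frames(buffer):
    """Split buffer into complete (channel, payload) frames plus the unparsed remainder."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        channel, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((channel, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

In a scheme like this, an empty payload could serve as the signal that a channel was closed, so each end knows when to open or close its own external connection.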

This completely fixed the issue for me. It's now been serving for a few weeks without issue on my end.

therealergo avatar Nov 09 '22 17:11 therealergo

@therealergo nice effort. i see you removed the threading part of the code, do you only support a single client being served ?

rofl0r avatar Nov 10 '22 13:11 rofl0r

@rofl0r It supports multiple clients by selecting on the list of client ports and handling available data from any client in a single thread. I don't think that threading adds much with this implementation since it's all going through one connection anyways.
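
A single-threaded loop of the kind described here can be built on `select`. The sketch below is not taken from the fork: `poll_clients` is a hypothetical helper that performs one pass over a listening socket and its accepted clients.

```python
import select
import socket

def poll_clients(listener, clients, timeout=0.0):
    """Accept pending clients and read available data in one pass.

    Returns a list of (socket, data) events; data == b"" means the client went away.
    The clients list is updated in place.
    """
    events = []
    readable, _, _ = select.select([listener] + list(clients), [], [], timeout)
    for sock in readable:
        if sock is listener:
            conn, _addr = listener.accept()
            clients.append(conn)
            continue
        try:
            data = sock.recv(4096)
        except OSError:
            data = b""  # treat errors such as a reset like a close
        if not data:
            clients.remove(sock)
            sock.close()
        events.append((sock, data))
    return events
```

Calling this repeatedly from one thread handles any number of clients, since `select` only reports sockets that can be read without blocking.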

therealergo avatar Nov 10 '22 13:11 therealergo