openmptcprouter icon indicating copy to clipboard operation
openmptcprouter copied to clipboard

Glorytun TCP Dropout Starlink (CGNAT)

Open lukeiggle opened this issue 2 years ago • 39 comments

Expected Behavior

I have 3 x WAN Connections. Starlink / LTE / ADSL

Glorytun TCP should have connections bound to all 3 x WAN interfaces.

On first restart of Glorytun TCP we can see all 3 x TCP Sessions established:

netstat -an | grep 65001 tcp6 0 0 :::65001 :::* LISTEN tcp6 0 0 VPS:65001 STARLINK-OUTSIDE-IP:42458 ESTABLISHED tcp6 0 0 VPS:65001 LTE-DIRECT-IP:21573 ESTABLISHED tcp6 0 0 VPS:65001 ADSL-OUTSIDE-IP:50882 ESTABLISHED

Current Behavior

Minutes after a restart we see the connection for the Starlink service has dropped out and never re-establishes: netstat -an | grep 65001 tcp6 0 0 :::65001 :::* LISTEN tcp6 0 0 VPS:65001 LTE-DIRECT-IP:21573 ESTABLISHED tcp6 0 0 VPS:65001 ADSL-OUTSIDE-IP:50882 ESTABLISHED

Specifications

  • OpenMPTCProuter version: openmptcprouter v0.59beta6-5.4 r0+16594-ce9
  • OpenMPTCProuter VPS version: 5.4.196-mptcp 0.1027-test
  • OpenMPTCProuter VPS provider: Bare Metal in a DC
  • OpenMPTCProuter platform: VPS on x64 Hypervisor

lukeiggle avatar Jul 07 '22 04:07 lukeiggle

What do you have in router part system log ?

Ysurac avatar Jul 07 '22 06:07 Ysurac

Nothing comes up or logs around the time that the tcp session drops on the cgnat starlink interface.

Occasionally the vpn drops and re-establishes, but this could be because of it just being on the adsl and lte interface.

lukeiggle avatar Jul 07 '22 08:07 lukeiggle

Is there a log somewhere for glorytun tcp that shows connection attempts and disconnects?

How can I tune glorytun to retry TCP connections if they drop out?

lukeiggle avatar Jul 10 '22 23:07 lukeiggle

Interesting, I'm seeing the same on my side w/ Starlink.

There really is nothing of value logged when this happens, unfortunately ... and it happens within a couple minutes.

2022-07-17T08:55:27-05:00 OpenMPTCProuter glorytun: starting glorytun vpn instance vpn
2022-07-17T08:55:27-05:00 OpenMPTCProuter glorytun[14131]: INITIALIZED tun0
2022-07-17T08:55:28-05:00 OpenMPTCProuter glorytun[14131]: VPS-IP.65001: connected
2022-07-17T08:55:28-05:00 OpenMPTCProuter glorytun[14131]: STARTED tun0
2022-07-17T08:55:44-05:00 OpenMPTCProuter glorytun[14131]: STOPPED tun0
2022-07-17T08:55:44-05:00 OpenMPTCProuter glorytun[14131]: VPS-IP.65001: connected
2022-07-17T08:55:45-05:00 OpenMPTCProuter glorytun[14131]: STARTED tun0

bdruth avatar Jul 17 '22 13:07 bdruth

Not sure if it's the same issue. You only have Starlink as connection ? if not what is the connection set as Master ?

You can try an other VPN in System->OpenMPTCProuter, "advanced settings" checkbox like Glorytun UDP.

Ysurac avatar Jul 17 '22 14:07 Ysurac

I have VDSL + Starlink, so similar, just 2 connections instead of 3. Switched over to DSVPN and it took longer, but when I checked back after a couple hours, similar - the Starlink side of the VPN was down. Also, nothing notable in the DSVPN logs ... though, there's a strange error about RTNETLINK that doesn't appear to impact anything?

2022-07-17T09:10:29-05:00 OpenMPTCProuter dsvpn: starting dsvpn vpn instance vpn
2022-07-17T09:10:29-05:00 OpenMPTCProuter dsvpn[1303]: Interface: [tun0]
2022-07-17T09:10:29-05:00 OpenMPTCProuter dsvpn[1303]: Trying to reconnect
2022-07-17T09:10:29-05:00 OpenMPTCProuter dsvpn[1303]: Connecting to VPS-IP:65401...
2022-07-17T09:10:29-05:00 OpenMPTCProuter dsvpn[1303]: RTNETLINK answers: Permission denied
2022-07-17T09:10:29-05:00 OpenMPTCProuter dsvpn[1303]: Connected
2022-07-17T12:24:52-05:00 OpenMPTCProuter dsvpn[1303]: Done.
2022-07-17T12:24:54-05:00 OpenMPTCProuter dsvpn: starting dsvpn vpn instance vpn
2022-07-17T12:24:54-05:00 OpenMPTCProuter dsvpn[3213]: Interface: [tun0]
2022-07-17T12:24:54-05:00 OpenMPTCProuter dsvpn[3213]: Trying to reconnect
2022-07-17T12:24:54-05:00 OpenMPTCProuter dsvpn[3213]: Connecting to VPS-IP:65401...
2022-07-17T12:24:54-05:00 OpenMPTCProuter dsvpn[3213]: RTNETLINK answers: Permission denied
2022-07-17T12:24:54-05:00 OpenMPTCProuter dsvpn[3213]: Connected
2022-07-17T12:24:54-05:00 OpenMPTCProuter dsvpn[3213]: Done.
2022-07-17T12:24:56-05:00 OpenMPTCProuter dsvpn: starting dsvpn vpn instance vpn
2022-07-17T12:24:56-05:00 OpenMPTCProuter dsvpn[3942]: Interface: [tun0]
2022-07-17T12:24:56-05:00 OpenMPTCProuter dsvpn[3942]: Trying to reconnect
2022-07-17T12:24:56-05:00 OpenMPTCProuter dsvpn[3942]: Connecting to VPS-IP:65401...
2022-07-17T12:24:56-05:00 OpenMPTCProuter dsvpn[3942]: RTNETLINK answers: Permission denied
2022-07-17T12:24:56-05:00 OpenMPTCProuter dsvpn[3942]: Connected

Somewhere betw 9:10am and when I checked around 12:20pm is when it dropped the Starlink side.

OpenMPTCProuter# netstat -an | grep 65401
tcp        0      0 VDSL-IP:48669     VPS-IP:65401      ESTABLISHED
tcp        0      0 Starlink-IP:14394    VPS-IP:65401      ESTABLISHED

bdruth avatar Jul 17 '22 17:07 bdruth

Few hours later, DSVPN is still maintaining both connections ...

OpenMPTCProuter# netstat -an | grep 65401
tcp        0      0 VDSL-IP:48669     VPS-IP:65401      ESTABLISHED
tcp        0      0 Starlink-IP:14394    VPS-IP:65401      ESTABLISHED

bdruth avatar Jul 17 '22 20:07 bdruth

@bdruth Thanks for responding. I'm trying DSVPN now.

Wondering if you are seeing the same issues I'm seeing on issue #2417 as well?

lukeiggle avatar Jul 17 '22 23:07 lukeiggle

Perhaps 30 minutes later DSVPN stops on starlink for me as well.

VPS# netstat -an | grep 65401 tcp 0 0 0.0.0.0:65401 0.0.0.0:* LISTEN tcp 0 0 VPS-IP:65401 LTE:19964 ESTABLISHED tcp 0 0 VPS-IP:65401 VDSL:59148 ESTABLISHED

lukeiggle avatar Jul 18 '22 04:07 lukeiggle

I've seen it stay up longer on DSVPN, but it definitely does drop eventually. I'm wondering why it doesn't re-establish once it drops, that's the real question. Overall, my Starlink stays connected quite well (no obstructions, etc.), so it's odd that it's consistently that connection that's dropped and not re-established.

bdruth avatar Jul 18 '22 13:07 bdruth

I don't know that I have the #2417 issue @lukeiggle - at least, not that I'm aware of - I don't see any connections when I use the diagnostic you posted - netstat -an | grep SYN_SENT from my VPS.

bdruth avatar Jul 18 '22 13:07 bdruth

As I said, did you try Glorytun UDP ?

Ysurac avatar Jul 18 '22 13:07 Ysurac

Can you also try to set Network->MPTCP, "Multipath TCP scheduler" to "Default" ? It seems that I have same issue with BLEST but not with default scheduler.

Ysurac avatar Jul 18 '22 17:07 Ysurac

OK - will get that sorted shortly.

bdruth avatar Jul 18 '22 18:07 bdruth

OK, switched to default instead of BLEST - also switched to Glorytun UDP - but I'm seeing this in the logs:

2022-07-18T13:33:13-05:00 OpenMPTCProuter glorytun-udp: starting glorytun-udp vpn instance vpn
2022-07-18T13:33:13-05:00 OpenMPTCProuter glorytun-udp[15897]: running on device tun0 as pid 15897
2022-07-18T13:33:36-05:00 OpenMPTCProuter glorytun-udp: starting glorytun-udp vpn instance vpn
2022-07-18T13:33:36-05:00 OpenMPTCProuter glorytun-udp[20781]: running on device tun0 as pid 20781
2022-07-18T13:33:54-05:00 OpenMPTCProuter glorytun-udp: starting glorytun-udp vpn instance vpn
2022-07-18T13:33:54-05:00 OpenMPTCProuter glorytun-udp[24430]: running on device tun0 as pid 24430
2022-07-18T13:34:40-05:00 OpenMPTCProuter glorytun-udp: starting glorytun-udp vpn instance vpn
2022-07-18T13:34:40-05:00 OpenMPTCProuter glorytun-udp[1278]: running on device tun0 as pid 1278
2022-07-18T13:34:54-05:00 OpenMPTCProuter glorytun-udp: starting glorytun-udp vpn instance vpn
2022-07-18T13:34:54-05:00 OpenMPTCProuter glorytun-udp[4045]: running on device tun0 as pid 4045
2022-07-18T13:35:30-05:00 OpenMPTCProuter glorytun-udp: starting glorytun-udp vpn instance vpn
2022-07-18T13:35:30-05:00 OpenMPTCProuter glorytun-udp[10486]: running on device tun0 as pid 10486
2022-07-18T13:35:44-05:00 OpenMPTCProuter glorytun-udp: starting glorytun-udp vpn instance vpn
2022-07-18T13:35:44-05:00 OpenMPTCProuter glorytun-udp[12892]: running on device tun0 as pid 12892

bdruth avatar Jul 18 '22 18:07 bdruth

It doesn't seem like it's liking Glorytun UDP - image

bdruth avatar Jul 18 '22 18:07 bdruth

Possible Glorytun UDP was not working on 0.57.3 The Multipath Scheduler change is only for TCP VPN like Glorytun TCP.

Ysurac avatar Jul 18 '22 18:07 Ysurac

Switching back to Glorytun TCP and the connection comes up - will see if it keep the Starlink iface w/ the MP scheduler change.

bdruth avatar Jul 18 '22 18:07 bdruth

Possible Glorytun UDP was not working on 0.57.3 The Multipath Scheduler change is only for TCP VPN like Glorytun TCP.

Gotcha - I'm still on 0.57.3, so that makes sense.

bdruth avatar Jul 18 '22 18:07 bdruth

Still up - been more than a couple minutes, the BLEST MP scheduler change might be the trick.

bdruth avatar Jul 18 '22 19:07 bdruth

Still up, ~3h later ...

bdruth avatar Jul 18 '22 21:07 bdruth

I changed to DEFAULT on the MP Scheduler (I had BLEST), switched to glorytun TCP.

Did a speedtest.net - and got one of my fastest speed tests ever. Checked glorytun-tcp and it had dropped out within a few minutes.. perhaps the speed test exacerbated it?

Switched to DSVPN, ran the speedtest a couple of times. Still up, will report back.

lukeiggle avatar Jul 18 '22 21:07 lukeiggle

Interesting - I ran a speedtest just to replicate and mine's still up ... 🤷‍♂️

bdruth avatar Jul 18 '22 21:07 bdruth

Glorytun-tcp still drops. for the record, glorytun-udp does the same on starlink.

My best guess is due to the nature of starlink and hopping between satellites the CGNAT binding times out or gets a TCP-RESET and kills the connection. Then Glorytun-tcp then doesn't try and reconnect. Is there something we can parse to glorytun-tcp to keepalive or reconnect sessions without sending a HUP and tearing down the VPN and interrupting traffic flow?

So much traffic like voice/teams etc relies on UDP.. to have the main connection out of 3 down for the tunnel makes this almost unusable.

lukeiggle avatar Jul 19 '22 00:07 lukeiggle

One of my connections is starlink and it seems stable enough. v0.58.5. I don't have any expert insight but I'd be happy to compare settings or try things out if it can help debug issues.

GrantSparks avatar Jul 19 '22 01:07 GrantSparks

With default scheduler I don't have the issue. I also have a starlink connection. When the connection is down for any reason, the subflow is removed and added back. For the drop during speedtest you have to set SQM with download and upload speed.

Ysurac avatar Jul 19 '22 05:07 Ysurac

@Ysurac given the fluctuating nature of starlink, what would speeds you recommend setting sqm for this interface?

Is there any log for tcp drops on glorytun?

are you running the latest beta build like me as well?

lukeiggle avatar Jul 19 '22 07:07 lukeiggle

@lukeiggle On latest beta/snapshot, there is an autorate function for SQM, so you set max download and upload speed and check this box. There is no detailed logs for glorytun. I'm running latest snapshot build (and latest release).

Ysurac avatar Jul 19 '22 07:07 Ysurac

I'm on 0.57.3, so not latest betas here (maybe I should upgrade?) - w/ default scheduler I do still get the Starlink connection dropped and not re-established on Glorytun-TCP ... it just seems to take longer than a couple minutes.

bdruth avatar Jul 19 '22 14:07 bdruth

Yes, I don't support old release. But at least this may be a change in MPTCP patch between this release on default scheduler that I may apply to BLEST scheduler.

Ysurac avatar Jul 19 '22 15:07 Ysurac