openfortivpn

Add keepalive feature

Open pgporada opened this issue 4 years ago • 18 comments

I have a test FortiGate device that hits the 15-minute timeout just like #485. The forticlientsslvpn_cli has a keepalive flag and doesn't time me out. Could something similar be implemented in openfortivpn? Let me know if you need any information. There are long periods of time when I won't need to access anything on the FortiGate network, but when I do, having to re-authenticate is a bit annoying.

forticlientsslvpn_cli [--proxy proxyaddress:proxyport] --server vpnserveraddress:vpnport [--proxyuser proxyuser] [--vpnuser vpnuser] [--pkcs12 pkcs12path] [--keepalive]

pgporada avatar Mar 12 '20 14:03 pgporada

I've tried FortiClient 4.4.2334 on a Fedora virtual machine. I do see TCP Keep-Alive packets from the Fedora machine to the FortiGate appliance every 27 seconds. I see these TCP Keep-Alive packets with or without the --keepalive option, so I'm wondering whether Keep-Alive can be enabled by the FortiGate appliance in addition to the local --keepalive option. And where do these 27 seconds come from?

DimitriPapadopoulos avatar Mar 19 '20 09:03 DimitriPapadopoulos

Now the strange thing is that the FortiGate appliance we connect to in the above example sends the following XML configuration:

<client-config save-password='off' keep-alive='off' auto-connect='off' />

My understanding is that TCP Keep-Alive should be off, at least by default. I cannot explain why it is on by default, with or without the --keepalive option. And again, where do these 27 seconds come from?

DimitriPapadopoulos avatar Mar 19 '20 10:03 DimitriPapadopoulos

This is related to DTLS, probably not relevant here:

<dtls-config heartbeat-interval='10' heartbeat-fail-count='10' heartbeat-idle-timeout='10' client-hello-timeout='10' />

DimitriPapadopoulos avatar Mar 19 '20 10:03 DimitriPapadopoulos

I assume it's our test appliance. I can see a line set keep-alive disable in the configuration. And I can't find the 27 seconds configured anywhere.

mrbaseman avatar Mar 19 '20 13:03 mrbaseman

Yes, that was the test appliance. I've tried FortiClient (the GUI program this time) with a different FortiGate appliance and the result is identical:

  • I see the client sending Keep-Alive packets even though the check box "Keep connection alive until manually stopped" is not checked.
  • These Keep-Alive packets are sent every 27 seconds, and sometimes more often, as if two different timers were triggering them: one fires steadily every 27 seconds on source port 60264 (27, 54, 81, 108, 135, 162, 190, 217, 244, 271, 298, 325, 352, 379, 407), and the other fires on source port 60266 every 27 seconds or more, as if it were triggered by a lack of other network activity or by other events (56, 131, 204, 240, 267, 319, 363, 390, 420):
No.	Time	Source	Destination	Protocol	Length	Info
[...]
50	27.198051411	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
51	27.198319176	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
[...]
61	54.334160963	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
62	54.334754810	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
63	56.895783648	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=611 Ack=1591 Win=63900 Len=0
64	56.896222697	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=1591 Ack=612 Win=65535 Len=0
[...]
120	81.470236586	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
121	81.470627393	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
[...]
138	108.606125830	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
139	108.606596040	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
140	131.134118293	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=1394 Ack=3067 Win=63900 Len=0
141	131.134914919	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=3067 Ack=1395 Win=65535 Len=0
[...]
187	135.742189777	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
188	135.742530076	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
[...]
203	162.878207847	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
204	162.878590083	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
[...]
235	190.014094925	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
236	190.014298200	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
237	204.862080607	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=2390 Ack=5065 Win=63900 Len=0
238	204.862927293	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=5065 Ack=2391 Win=65535 Len=0
[...]
253	217.150117646	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
254	217.151022047	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
255	240.702237882	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=2580 Ack=5277 Win=63900 Len=0
256	240.702612014	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=5277 Ack=2581 Win=65535 Len=0
257	244.286229098	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
258	244.286503950	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
259	267.838192945	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=2580 Ack=5277 Win=63900 Len=0
260	267.838596482	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=5277 Ack=2581 Win=65535 Len=0
261	271.422154595	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
262	271.422522007	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
[...]
283	298.558590197	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
284	319.038086595	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=2960 Ack=5571 Win=63900 Len=0
285	319.038577507	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=5571 Ack=2961 Win=65535 Len=0
286	325.694114811	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
287	325.694922200	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
[...]
291	352.830152033	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
292	352.830383141	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
[...]
294	363.070151742	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=3073 Ack=5571 Win=63900 Len=0
295	363.070538975	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=5571 Ack=3074 Win=65535 Len=0
[...]
305	379.966203749	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
306	379.966574823	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
307	390.206215847	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=3073 Ack=5571 Win=63900 Len=0
308	390.206531685	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=5571 Ack=3074 Win=65535 Len=0
[...]
335	407.102174435	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60264 → 443 [ACK] Seq=0 Ack=2 Win=63900 Len=0
336	407.102372969	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60264 [ACK] Seq=2 Ack=1 Win=65535 Len=0
337	420.414209351	10.0.2.15	xxx.xxx.xxx.58	TCP	56	[TCP Keep-Alive] 60266 → 443 [ACK] Seq=3304 Ack=6153 Win=63900 Len=0
338	420.414623860	xxx.xxx.xxx.58	10.0.2.15	TCP	62	[TCP Keep-Alive ACK] 443 → 60266 [ACK] Seq=6153 Ack=3305 Win=65535 Len=0
[...]
  • We receive the same XML configuration from this FortiGate appliance:
<client-config save-password='off' keep-alive='off' auto-connect='off' />

DimitriPapadopoulos avatar Mar 21 '20 15:03 DimitriPapadopoulos

The 27-second period is certainly built-in.
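
As a rough check of the steady timer, the gaps between consecutive probes on source port 60264 can be computed from the capture above. The sketch below is only illustrative: the helper name probe_gap_range is made up, and the array holds the first ten probe send times from the capture, rounded to microseconds.

```c
#include <stddef.h>

/* Send times (seconds) of the first ten keep-alive probes on source port
 * 60264, copied from the capture and rounded to microseconds. */
static const double probe_times[] = {
    27.198051,  54.334161,  81.470237, 108.606126, 135.742190,
    162.878208, 190.014095, 217.150118, 244.286229, 271.422155,
};
static const size_t probe_count = sizeof(probe_times) / sizeof(probe_times[0]);

/* Store the smallest and largest gap between consecutive timestamps.
 * Hypothetical helper; requires n >= 2. */
static void probe_gap_range(const double *t, size_t n,
                            double *min_gap, double *max_gap)
{
    *min_gap = *max_gap = t[1] - t[0];
    for (size_t i = 2; i < n; i++) {
        double gap = t[i] - t[i - 1];
        if (gap < *min_gap)
            *min_gap = gap;
        if (gap > *max_gap)
            *max_gap = gap;
    }
}
```

Calling probe_gap_range(probe_times, probe_count, &lo, &hi) on these values gives gaps that all fall between 27.135 s and 27.137 s, which is consistent with a fixed timer.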

DimitriPapadopoulos avatar Mar 21 '20 15:03 DimitriPapadopoulos

Maybe we just have to enable SO_KEEPALIVE and set a few other options like TCP_KEEPINTVL in io_loop.
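
A minimal sketch of what setting those socket options could look like, assuming a plain TCP socket descriptor. The function name enable_tcp_keepalive and its parameters are hypothetical and not taken from openfortivpn; where the call would belong in the code base is left open.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on TCP keep-alive for socket fd.
 * idle     - seconds of silence before the first probe
 * interval - seconds between unanswered probes
 * count    - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
static int enable_tcp_keepalive(int fd, int idle, int interval, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

    /* The per-socket tuning options are not available everywhere,
     * so each one is guarded. */
#if defined(TCP_KEEPIDLE)
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
        return -1;
#elif defined(TCP_KEEPALIVE) /* macOS name for the idle time */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE, &idle, sizeof(idle)) < 0)
        return -1;
#endif
#if defined(TCP_KEEPINTVL)
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) < 0)
        return -1;
#endif
#if defined(TCP_KEEPCNT)
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
#endif
    return 0;
}
```

For example, enable_tcp_keepalive(fd, 27, 27, 4) would ask for a first probe after 27 seconds of silence; the value 27 only mirrors the period seen in the capture and the count of 4 is an arbitrary choice for illustration.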

mrbaseman avatar Mar 21 '20 21:03 mrbaseman

Here is an example of how to implement TCP keepalive on the socket.

mrbaseman avatar Mar 26 '20 21:03 mrbaseman

@DimitriPapadopoulos could you test my keepalive branch? I'm not sure if it is as simple as it looks to me at the moment, but if it's just setting a few socket options, that would be a feature which is really easy to implement.

mrbaseman avatar Mar 27 '20 13:03 mrbaseman

Not sure how to test it. It "works". But I don't see the same kind of traffic I see with FortiClient in Wireshark.

DimitriPapadopoulos avatar Apr 02 '20 07:04 DimitriPapadopoulos

I was hoping that you would see the same kind of traffic. Hmm... that means it is not that easy...

mrbaseman avatar Apr 02 '20 08:04 mrbaseman

Let me recheck though - I'm not very familiar with Wireshark.

DimitriPapadopoulos avatar Apr 02 '20 09:04 DimitriPapadopoulos

A colleague of mine was complaining about connection losses on Mac OS X, so I asked him to test this branch. In his experience it does not behave noticeably differently from the master branch. His conclusion was that session interruptions can have many causes (e.g. the ISP assigning a new client IP every now and then) and that one would have to test over a long period of time.

mrbaseman avatar Apr 14 '20 14:04 mrbaseman

Further test result: in 3 attempts the connection broke down after about an hour if no traffic went through the tunnel. If a ping is executed in the background, the connection is stable for several hours. This means the changes in this keepalive branch are not sufficient to activate the TCP keepalive feature on the socket - or maybe the parameters have to be adjusted somehow.
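
One way to narrow this down would be to read the options back from the socket and see what the kernel actually applied. The sketch below assumes a TCP socket descriptor; the struct keepalive_state and the function read_tcp_keepalive are hypothetical names. As background, on Linux the default idle time (net.ipv4.tcp_keepalive_time) is 7200 seconds, so with SO_KEEPALIVE alone the first probe would only be sent after two hours of silence.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical container for the values read back from the kernel.
 * Fields the platform cannot report are left at -1. */
struct keepalive_state {
    int enabled;  /* non-zero if SO_KEEPALIVE is on */
    int idle;     /* seconds before the first probe */
    int interval; /* seconds between probes */
    int count;    /* unanswered probes before the drop */
};

/* Read the effective keep-alive settings of socket fd.
 * Returns 0 on success, -1 on failure (errno is set by getsockopt). */
static int read_tcp_keepalive(int fd, struct keepalive_state *st)
{
    socklen_t len = sizeof(int);

    st->idle = st->interval = st->count = -1;
    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &st->enabled, &len) < 0)
        return -1;
#if defined(TCP_KEEPIDLE)
    len = sizeof(int);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &st->idle, &len) < 0)
        return -1;
#endif
#if defined(TCP_KEEPINTVL)
    len = sizeof(int);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &st->interval, &len) < 0)
        return -1;
#endif
#if defined(TCP_KEEPCNT)
    len = sizeof(int);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &st->count, &len) < 0)
        return -1;
#endif
    return 0;
}
```

Logging these four values right after the tunnel socket is connected would show whether keepalive is enabled at all and whether the idle time is shorter than the roughly one-hour drop observed here.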

mrbaseman avatar Apr 16 '20 07:04 mrbaseman

Today on Ubuntu with master@33b042a631b7e75b8b7d03651fba0224a43b2a1b:

...
No response to 4 echo-requests
Serial link appears to be disconnected.
Connect time 19.5 minutes.
Sent 125720 bytes, received 543567 bytes.
Connection terminated.
ERROR:  read: Input/output error
INFO:   Cancelling threads...
INFO:   Cleanup, joining threads...
INFO:   Setting ppp0 interface down.
INFO:   Restoring routes...
ERROR:  pppd: The link was terminated because the peer is not responding to echo requests.
INFO:   Terminated pppd.
INFO:   Closed connection to gateway.
INFO:   Logged out.
...

It seems pppd on Linux has some built-in keepalive, or where does the first message come from? I haven't seen this in the past; maybe one of the recent commits that improve error handling lets openfortivpn print this message now.

mrbaseman avatar May 05 '20 11:05 mrbaseman

Is this perhaps related to the lcp-echo* options of pppd? But then has this been enabled recently? We might have to dive into pppd source code and/or versions released with Ubuntu...

To clarify, I don't think this is the result of a recent change in openfortivpn. This error message is from pppd directly:

No response to 4 echo-requests
Serial link appears to be disconnected.

and this one is part of the pppd error reporting code that has been available for some time now:

ERROR:  pppd: The link was terminated because the peer is not responding to echo requests.

So either you have been unlucky or something has changed on your machine. I understand this has been an isolated event, hasn't it?

DimitriPapadopoulos avatar May 05 '20 12:05 DimitriPapadopoulos

That said, what you experienced does not look like a keep-alive feature, that is, one that would keep the connection alive in the absence of other traffic. Rather, it merely detects whether the connection is alive, independently of other traffic.

DimitriPapadopoulos avatar May 05 '20 13:05 DimitriPapadopoulos

Yes, this was a single event caused by hardware maintenance ;) I have seen disconnects in the past, but haven't seen any error messages about the details. Maybe #436 has improved this, and now you can guess how seldom I have observed such interruptions. You are also right: this is not a feature to keep the connection alive in the absence of any traffic - it's rather a feature to detect a dead connection when traffic fails to be transported along the tunnel. So we can remove these posts or mark them as off-topic.

mrbaseman avatar May 05 '20 21:05 mrbaseman