openfortivpn

sudden delay spikes

Open Mr-Philipp opened this issue 5 years ago • 13 comments

I don't know if this is already known:

For me everything works perfectly (even DNS and routing), but every 30 seconds I get no response for about 10 seconds to any packet I send (about 10 pings then come back with response times of 8000 to 20000 ms).

This means every RDP session etc. freezes for at least 10 seconds every 30 seconds.

Tested on Ubuntu and Debian. No problems with the official Windows and Mac FortiClient.

Mr-Philipp avatar Apr 15 '20 09:04 Mr-Philipp

Hi @Zappelphilipp, I am seeing the same behavior (Ubuntu 19.10). It occurs with both this client and the proprietary one (FortiClient SSLVPN 4.4.2336). It is driving me nuts :confused:

My colleagues working from Windows do not seem to run into it and it seems to work for most of my other colleagues working from Linux as well (we are a few encountering this).

mnsgs avatar Apr 15 '20 09:04 mnsgs

It's really weird. I should also mention that I am using the official Ubuntu and Debian repository packages, so I am on version 1.10.0-1, BUT I already used openfortivpn at this version a few months ago without any issues.

Mr-Philipp avatar Apr 15 '20 09:04 Mr-Philipp

@Zappelphilipp So what has changed exactly between the moment it worked a few months ago and the moment you started experiencing these freezes every 30s? I understand the version of openfortivpn has not changed.

Is there any way we can easily reproduce that? For example by pinging a machine behind the VPN server?

By the way it's good to know FortiClient shares the same problem: it probably means this is not an openfortivpn issue :smiley: Perhaps it's specific to VPN SSL (always used by Linux clients) as opposed to VPN IPSec (only used by Windows and macOS clients).

DimitriPapadopoulos avatar Apr 15 '20 09:04 DimitriPapadopoulos

I cannot reproduce that on my Ubuntu 20.04 workstation. Isn't there anything of interest in the system journal? A change in routing, a DHCP lease, other events?

DimitriPapadopoulos avatar Apr 15 '20 10:04 DimitriPapadopoulos

> Is there any way we can easily reproduce that? For example by pinging a machine behind the VPN server?

Yes, I "debugged" it by pinging a random server behind the firewall (inside the network), and it looks like this:

64 bytes from 192.168.99.84: icmp_seq=468 ttl=127 time=20.5 ms
64 bytes from 192.168.99.84: icmp_seq=469 ttl=127 time=31.2 ms
64 bytes from 192.168.99.84: icmp_seq=470 ttl=127 time=172 ms
64 bytes from 192.168.99.84: icmp_seq=471 ttl=127 time=25.9 ms
64 bytes from 192.168.99.84: icmp_seq=472 ttl=127 time=32.1 ms
64 bytes from 192.168.99.84: icmp_seq=473 ttl=127 time=21.1 ms
64 bytes from 192.168.99.84: icmp_seq=474 ttl=127 time=19.8 ms
64 bytes from 192.168.99.84: icmp_seq=475 ttl=127 time=21.3 ms
64 bytes from 192.168.99.84: icmp_seq=476 ttl=127 time=19.5 ms
64 bytes from 192.168.99.84: icmp_seq=477 ttl=127 time=24.5 ms
64 bytes from 192.168.99.84: icmp_seq=478 ttl=127 time=43.0 ms
64 bytes from 192.168.99.84: icmp_seq=479 ttl=127 time=190 ms
64 bytes from 192.168.99.84: icmp_seq=480 ttl=127 time=21.7 ms
64 bytes from 192.168.99.84: icmp_seq=481 ttl=127 time=19.9 ms
64 bytes from 192.168.99.84: icmp_seq=482 ttl=127 time=21.9 ms
64 bytes from 192.168.99.84: icmp_seq=483 ttl=127 time=25.8 ms
64 bytes from 192.168.99.84: icmp_seq=484 ttl=127 time=24.0 ms
64 bytes from 192.168.99.84: icmp_seq=485 ttl=127 time=22.1 ms
64 bytes from 192.168.99.84: icmp_seq=486 ttl=127 time=101 ms
64 bytes from 192.168.99.84: icmp_seq=487 ttl=127 time=24.0 ms
64 bytes from 192.168.99.84: icmp_seq=488 ttl=127 time=435 ms
64 bytes from 192.168.99.84: icmp_seq=489 ttl=127 time=22.7 ms
64 bytes from 192.168.99.84: icmp_seq=490 ttl=127 time=30.3 ms
64 bytes from 192.168.99.84: icmp_seq=491 ttl=127 time=159 ms
64 bytes from 192.168.99.84: icmp_seq=492 ttl=127 time=24.4 ms
64 bytes from 192.168.99.84: icmp_seq=493 ttl=127 time=22.4 ms
64 bytes from 192.168.99.84: icmp_seq=494 ttl=127 time=53.4 ms
64 bytes from 192.168.99.84: icmp_seq=495 ttl=127 time=24.9 ms
64 bytes from 192.168.99.84: icmp_seq=496 ttl=127 time=28.7 ms
64 bytes from 192.168.99.84: icmp_seq=497 ttl=127 time=20.2 ms
64 bytes from 192.168.99.84: icmp_seq=498 ttl=127 time=8035 ms
64 bytes from 192.168.99.84: icmp_seq=499 ttl=127 time=7072 ms
64 bytes from 192.168.99.84: icmp_seq=500 ttl=127 time=6048 ms
64 bytes from 192.168.99.84: icmp_seq=501 ttl=127 time=5024 ms
64 bytes from 192.168.99.84: icmp_seq=502 ttl=127 time=4000 ms
64 bytes from 192.168.99.84: icmp_seq=503 ttl=127 time=2980 ms
64 bytes from 192.168.99.84: icmp_seq=504 ttl=127 time=1956 ms
64 bytes from 192.168.99.84: icmp_seq=505 ttl=127 time=932 ms
64 bytes from 192.168.99.84: icmp_seq=506 ttl=127 time=31.3 ms
64 bytes from 192.168.99.84: icmp_seq=507 ttl=127 time=24.9 ms
64 bytes from 192.168.99.84: icmp_seq=508 ttl=127 time=26.5 ms
64 bytes from 192.168.99.84: icmp_seq=509 ttl=127 time=35.9 ms
64 bytes from 192.168.99.84: icmp_seq=510 ttl=127 time=23.4 ms
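
A note on reading the slow block above: the round-trip times drop by about one second per probe, which is what you would expect if the delayed replies were all delivered in a single burst after a stall. The following is a minimal sketch that reconstructs approximate arrival times from such output. It assumes one probe per second (the ping default, since no `-i` option is shown in the thread); the helper name `arrival_times` is made up for illustration.

```python
import re

# Matches iputils-style reply lines such as
# "64 bytes from 192.168.99.84: icmp_seq=498 ttl=127 time=8035 ms"
REPLY = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def arrival_times(lines, interval_s=1.0):
    """Return (seq, rtt_ms, arrival_s) for each reply line.

    Assumption: probe number `seq` was sent at seq * interval_s seconds,
    so the reply arrived at that send time plus the round-trip time.
    """
    rows = []
    for line in lines:
        m = REPLY.search(line)
        if m:
            seq, rtt_ms = int(m.group(1)), float(m.group(2))
            rows.append((seq, rtt_ms, seq * interval_s + rtt_ms / 1000.0))
    return rows


# Excerpt copied from the ping output in this thread.
sample = """\
64 bytes from 192.168.99.84: icmp_seq=497 ttl=127 time=20.2 ms
64 bytes from 192.168.99.84: icmp_seq=498 ttl=127 time=8035 ms
64 bytes from 192.168.99.84: icmp_seq=499 ttl=127 time=7072 ms
64 bytes from 192.168.99.84: icmp_seq=500 ttl=127 time=6048 ms
64 bytes from 192.168.99.84: icmp_seq=501 ttl=127 time=5024 ms
64 bytes from 192.168.99.84: icmp_seq=502 ttl=127 time=4000 ms
64 bytes from 192.168.99.84: icmp_seq=503 ttl=127 time=2980 ms
64 bytes from 192.168.99.84: icmp_seq=504 ttl=127 time=1956 ms
64 bytes from 192.168.99.84: icmp_seq=505 ttl=127 time=932 ms
64 bytes from 192.168.99.84: icmp_seq=506 ttl=127 time=31.3 ms
"""

for seq, rtt_ms, arrival_s in arrival_times(sample.splitlines()):
    print(f"seq={seq} rtt={rtt_ms:8.1f} ms  arrived at ~{arrival_s:.3f} s")
```

Under that one-second assumption, the replies to icmp_seq 498 through 505 all land within roughly 0.15 s of each other, so this excerpt looks like a stall of about 8 seconds followed by a flush of queued packets rather than eight independently slow round trips.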

I am working on Pop!_OS 19.10 x86_64 and on DeepinOS 15.11, with the same problem on both. I also work on macOS (Sierra, I think) and on a freshly patched Windows 10 Pro, both with the official client and without problems.

I don't know what changed. The firmware of the FortiGate, at least, did not. Are there any logs for openfortivpn?

Mr-Philipp avatar Apr 15 '20 10:04 Mr-Philipp

I suspect openfortivpn logs won't help: the logs will just lag behind like the rest without giving an explanation for the lag. Anyway, openfortivpn can be run in verbose mode by adding multiple -v options (up to 4 if I recall correctly for maximum verbosity). Then pppd logs can be collected using option --pppd-log=.
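
For anyone who wants to collect those logs anyway, here is a small sketch that assembles such a command line. It only uses the `-c`, `-v` and `--pppd-log=` options of openfortivpn; the function name and the log path are placeholders chosen for illustration, and the verbosity level is an arbitrary default.

```python
import shlex


def openfortivpn_debug_argv(config_path, pppd_log_path, verbosity=2):
    """Build an argv list that runs openfortivpn verbosely with a pppd log.

    `verbosity` is how many times -v is repeated; the paths are supplied
    by the caller and are not checked here.
    """
    argv = ["sudo", "openfortivpn", "-c", config_path]
    argv += ["-v"] * verbosity
    argv.append("--pppd-log=" + pppd_log_path)
    return argv


# Hypothetical paths, adjust to your setup.
cmd = openfortivpn_debug_argv("/etc/openfortivpn/config", "/tmp/pppd.log")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` directly.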

It would be much more interesting to:

  • have a look at the system log of your local machine,
  • inspect network traffic with Wireshark if you're comfortable enough with that.

What sort of machines do you connect to on the other end? These machines may have changed. See for example: https://superuser.com/questions/1481191/remote-desktop-intermittently-freezing

DimitriPapadopoulos avatar Apr 15 '20 11:04 DimitriPapadopoulos

@Zappelphilipp, for me it does not freeze every 30 seconds but more sporadically. We migrated to the service about a month ago, and for me this has always been an issue. AFAICS, it appears to be 1) independent of the client used to connect and 2) independent of the network it connects to, so I suspect the Fortinet infrastructure (load, perhaps?) is the cause. What puzzles me is that it seems to occur for only a few Linux users.

I am using gping against either Linux machines or network switches, which visualizes the issue very well. I haven't managed to find any entry in the system logs when the issue occurs.

mnsgs avatar Apr 15 '20 11:04 mnsgs

If it's specific to RDP sessions, it could be related to IP fragmentation and MTU or to RDP issues: https://www.google.com/search?q=RDP+IP+fragmentation+MTU

On the other hand if you can reproduce the lags with other software and plain pings, I don't know.

Note that Linux clients are restricted to VPN SSL while Windows and macOS clients often use VPN IPSec, at least by default.
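
One way to check the fragmentation/MTU hypothesis is to send pings with the "don't fragment" flag set and see which payload sizes get through. The sketch below only prints candidate commands. It assumes IPv4 without IP options (20-byte header), the 8-byte ICMP echo header, and the Linux iputils `ping` options `-M do`, `-s` and `-c`; the candidate MTU values are arbitrary examples, and the target address is the one from the ping output earlier in this thread.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def probe_commands(host, mtus):
    """Return one 'ping' command string per candidate MTU.

    -M do sets the don't-fragment flag, -s sets the payload size,
    -c 3 sends three probes.
    """
    return [
        f"ping -M do -s {icmp_payload_for_mtu(mtu)} -c 3 {host}"
        for mtu in mtus
    ]


# Candidate MTUs are illustrative guesses, not values from this thread.
for cmd in probe_commands("192.168.99.84", [1500, 1400, 1300]):
    print(cmd)
```

If the largest size fails while a smaller one succeeds, the path MTU through the tunnel lies somewhere between the two, which would be a hint to look at the MTU on the ppp interface.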

DimitriPapadopoulos avatar Apr 15 '20 11:04 DimitriPapadopoulos

@DimitriPapadopoulos, I use plain ping and ssh, no RDP.

mnsgs avatar Apr 15 '20 11:04 mnsgs

It could be that VPN SSL has issues that VPN IPSec does not have on this Fortinet appliance. It would be interesting to compare SSL and IPSec from a Windows machine with a FortiClient capable of both SSL and IPSec.

DimitriPapadopoulos avatar Apr 15 '20 11:04 DimitriPapadopoulos

I could reproduce these spikes on Ubuntu 16.04 with a heavy load on the tunnel. My ping times were quite stable at about 22 ms, but went up to 800-1000 ms when I started to sync large files through scp on the same VPN tunnel. The 30 seconds could be an email client that regularly checks the IMAP folders, something like that.
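
To compare the stalls against a suspected polling period such as a mail client's check interval, it helps to measure how regular they really are. Here is a rough sketch that finds where each stall begins in a series of (icmp_seq, rtt) samples and returns the gaps between stalls. The 1000 ms threshold, the one-probe-per-second assumption and the function names are all illustrative choices, and the demo data is synthetic, not taken from this thread.

```python
def stall_starts(samples, threshold_ms=1000.0):
    """Return the sequence numbers at which a run of slow replies begins.

    `samples` is an iterable of (seq, rtt_ms) pairs in send order.
    """
    starts = []
    in_stall = False
    for seq, rtt_ms in samples:
        slow = rtt_ms >= threshold_ms
        if slow and not in_stall:
            starts.append(seq)
        in_stall = slow
    return starts


def gaps_between(starts, interval_s=1.0):
    """Seconds between consecutive stall starts, assuming a fixed probe interval."""
    return [(b - a) * interval_s for a, b in zip(starts, starts[1:])]


# Synthetic demo data: 20 ms baseline with an 8-probe stall every 30 probes.
synthetic = [
    (seq, 8000.0 if seq % 30 in range(10, 18) else 20.0)
    for seq in range(90)
]

starts = stall_starts(synthetic)
print("stall starts:", starts)
print("gaps (s):", gaps_between(starts))
```

Gaps that cluster tightly around one value would point to something periodic on the client or the appliance, while widely scattered gaps would fit the load-related explanation better.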

mrbaseman avatar Apr 15 '20 12:04 mrbaseman

Interesting, then it would be worth closing mail clients and other usual suspects. If it doesn't help, inspecting the network traffic with Wireshark might give a clue.

DimitriPapadopoulos avatar Apr 16 '20 19:04 DimitriPapadopoulos

We have even noticed that one user cannot ping anymore through his VPN tunnel (heavy packet loss) when another user transfers large files through another VPN connection. Maybe the fact that we have configured a software switch (which is not recommended) plays a role here.

mrbaseman avatar Apr 16 '20 20:04 mrbaseman