iperf icon indicating copy to clipboard operation
iperf copied to clipboard

Add control connection keepalive arguments

Open davidBar-On opened this issue 2 years ago • 13 comments

  • Version of iperf3 (or development branch, such as master or 3.1-STABLE) to which this pull request applies: master

  • Issues fixed (if any): #812, #835

  • Brief description of code changes (suitable for use as a commit message):

Adding Control-connection TCP Keepalive arguments: --cntl-ka [<KEEPIDLE>/<KEEPINTV>/<KEEPCNT>].

As suggested in the resolved issues, setting the keepalive period (TCP_KEEPIDLE) resolves #812 by allowing to keep the control connection alive longer than the firewall timeout, and resolves #835 by allowing the server to terminate inactive connections after shorter time.

Added HAVE_TCP_KEEPALIVE, based on TCP_KEEPIDLE availability, as I am not sure if and which OSs support the TCP keepalive. In principle, there could be a separated "HAVE" for the socket (not TCP) SO_KEEPALIVE, but I assume this is not necessary.

(Note based on the discussion below starting with this comment: to use this PR before it is merged into main line, autoconf and autoheader should be run before ./configure. If autoconf fails with error that version 2.71 should be used, run autoconf -V to get its version and change "2.71" in configure.ac to that version.)

I was not able to fully test the code, as I was not able to test the keepalive retries.

Also, for some reason, in Windows WSL2 it seems that the keepalive period (TCP_KEEPIDLE) must be greater than the retry interval (TCP_KEEPINTV). Otherwise, the keepalive is sent only once. As I couldn't test the retries, I assumed that it may even be that the keepalive period must be greater than the retry period (TCP_KEEPINTV * TCP_KEEPCNT). Therefore, in case this condition is not met and only the keepalive period was set, I added code that reduces the retry interval accordingly.

davidBar-On avatar Nov 20 '22 15:11 davidBar-On

@ankitg12, I reply to your comment here, as building the change is using the code of this PR.

Any idea when this change would be merged ? Or how I can get binary with this change?

As this PR is not merged (yet) into the main code, there are probably no available binaries. You can try building it yourself. In general, the process is (under Linux):

git clone https://github.com/davidBar-On/iperf.git --branch issue-812-835-control-connection-keepalive
cd iperf
./bootstrap.sh
./confoigure
make

The iperf3 executable will be ./src/iperf3. To make sure it was built o.k. run ./src/iperf3 -h. The help should include a line for the new --cntl-ka option.

Note that the PR is not fully tested. If you were able to build and run iperf3 successfully and you find problems with the implementation of the new option, please report them in this PR. Thanks.

davidBar-On avatar Jan 10 '23 18:01 davidBar-On

I guess this is probably obvious, but I had to add #define HAVE_TCP_KEEPALIVE 1 in ./iperf/src/iperf_config.h before launching make to have the --cntl-ka option available.

I can confirm with netstat -tnope that the new parameter change the control channel behavior correctly:

KEEPIDLE = Modify the interval of the keepalive KEEPINTV = Modify the interval of the retry KEEPCNT = Modify the amount of keepalive lost before dropping the connection

I would be very happy to see this change merge into the master branch!

lbsou avatar Feb 21 '23 02:02 lbsou

.... I had to add #define HAVE_TCP_KEEPALIVE 1 in ./iperf/src/iperf_config ....

@lbsou, configure.h is generated by ./iperf3/configure per ./iperf3/configure.ac. Currently, configure.ac defines HAVE_TCP_KEEPALIVE is defined in confugure.h when netinet/tcp.h defines TCP_KEEPIDLE.

Which OS are you using? Can you check if and where TCP_KEEPIDLE is defined?

davidBar-On avatar Feb 21 '23 08:02 davidBar-On

Hi!

I tried on the following platforms :

  • Rocky Linux release 9.0 (Blue Onyx)
  • Ubuntu 20.04.5 LTS

The option is available on both system :

[root@ROCKY:/]# sysctl -a | grep keepalive net.ipv4.tcp_keepalive_intvl = 75 net.ipv4.tcp_keepalive_probes = 9 net.ipv4.tcp_keepalive_time = 7200

root@UBUNTU:/# sysctl -a | grep keepalive net.ipv4.tcp_keepalive_intvl = 75 net.ipv4.tcp_keepalive_probes = 9 net.ipv4.tcp_keepalive_time = 7200

I was able to compile and run the following code:

#include <stdio.h> #include <netinet/tcp.h>

int main(void) { int foo = TCP_KEEPIDLE; printf("%d", foo); }

Returned : 4

lbsou avatar Feb 21 '23 19:02 lbsou

Hi, thanks for the info! Hopefully I now understand the problem. When (if...) the PR will be merged into the mainline, autoconf and autoheader will be run to updated the ./cofigure file based on the ./configure.ac. Since the PR is not merged, ./configure does not handle the definition of HAVE_TCP_KEEPALIVE.

Can you try running autoconf? It should updated ./configure file, which defines HAVE_TCP_KEEPALIVE. If autoconf fails with error that version 2.71 should be used, run autoconf -V to get its version and change "2.71" in configure.ac to that version.

Now run autoheader and then ./configure, ./src/iperf_config.h should now define HAVE_TCP_KEEPALIVE.

Was this successful? If it did, I will add the above instructions to the PR description

davidBar-On avatar Feb 22 '23 19:02 davidBar-On

Thank you David, that was it!

I did autoconf then autoheader then ./configure and the ./src/iperf_config.h now contain #define HAVE_TCP_KEEPALIVE 1

Any idea when this change will be merged?

lbsou avatar Feb 22 '23 19:02 lbsou

@lbsou, I have updated the PR description. Thanks for your efforts and time for clarifying this issue (and helping me to better understand the configuration process ...).

I am not from the team that maintains iperf3, so I don't know if and when this PR will be merged into the mainline.

davidBar-On avatar Feb 22 '23 20:02 davidBar-On

Thank you David!

@bmah888 Can we be of any further help for this PR to be merged?

Thanks!

lbsou avatar Feb 22 '23 20:02 lbsou

Hi @davidBar-On !

I did some more testing and I can activate keepalive only from client to server, not from server to client.

Server : /usr/local/bin/iperf3 -s -i 1 -f k --forceflush --idle-timeout 300 --rcv-timeout 5000 --one-off -p 16000 --timestamps='%F %T ' --cntl-ka=10/1/5

# netstat -tnope | grep iperf3 tcp6 0 0 <server_ip>:16000 <client_ip>:6730 ESTABLISHED 0 40840968 3294693/iperf3 off (0.00/0/0)

Client : /usr/local/bin/iperf3 -u -l 1400 -c <server_ip> -t 0 -b 1M --udp-counters-64bit --connect-timeout=1000 --dscp 0 --pacing-timer 12000 -f k -p 16000 --timestamps='%F %T ' --bidir --forceflush --cntl-ka=10/1/5 --rcv-timeout 5000

# netstat -tnope | grep iperf3 tcp 0 0 192.168.1.146:47610 23.250.5.250:16001 ESTABLISHED 0 1629116 218390/iperf3 keepalive (4.59/0/0)

This means the server might not notice if the control channel dies (aggressive nat gateway session timeout?) and then the server is unreachable forever (in case of a bidir communication, the server continues sending udp packet indefinitely.).

lbsou avatar Mar 06 '23 14:03 lbsou

Hi @lbsou, thanks for testing and finding this issue. Actually, setting the keepalive only on the client side was intentional, and just the help for the option wrongly showed it for both the client and the server ...

However, when evaluating your comment I realized that it is better that the keepalive option will be available for both the client and server. I enhanced the solution and the option should work now on both sides.

davidBar-On avatar Mar 17 '23 16:03 davidBar-On

Hi @lbsou, thanks for testing and finding this issue.

Actually, setting the keepalive only on the client side was intentional, as the client is initiating the control stream, but I forgot to limit the option for the client. However, when evaluating your comment I realized that it is better that the keepalive option will be available for both the client and the server. I enhanced the solution and the option should work now on both sides.

davidBar-On avatar Mar 18 '23 07:03 davidBar-On

Thanks you very much @davidBar-On, may the gods of merging be with us!

lbsou avatar Mar 18 '23 10:03 lbsou

We might look into testing and merging this soon. We're interested in the changes. In the meantime, if you could update the man page for the new command line argument that would be great, otherwise we'll probably get to it once multi-threading is merged.

swlars avatar Oct 05 '23 17:10 swlars