sniproxy

TCP Fast Open support

Open oakaigh opened this issue 6 years ago • 12 comments

I'm not sure about the implementation, but this feature is what really optimizes TLS performance. Sample

oakaigh avatar May 23 '18 02:05 oakaigh

@PantherJohn interesting, this LWN article has a more complete description of what I think you are proposing. It's simple enough to include (tcp-fastopen branch) and try. I would appreciate it if you would do some performance testing.

dlundquist avatar May 23 '18 05:05 dlundquist

@dlundquist Hi, glad it works!

Client

Before tcp_fastopen was enabled on the remote side:

[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
1.855
[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
1.800
[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
0.769
[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
0.797
[root@cara sniproxy]# grep '^TcpExt:' /proc/net/netstat | cut -d ' ' -f 81-96 | column -t
TCPOFOMerge  TCPChallengeACK  TCPSYNChallenge  TCPFastOpenActive  TCPFastOpenActiveFail  TCPFastOpenPassive  TCPFastOpenPassiveFail  TCPFastOpenListenOverflow  TCPFastOpenCookieReqd  TCPFastOpenBlackhole  TCPSpuriousRtxHostQueues  BusyPollRxPackets  TCPAutoCorking  TCPFromZeroWindowAdv  TCPToZeroWindowAdv  TCPWantZeroWindowAdv
783          2145             531              1                  20                     0                   0                       0                          0                      0                     33                        6459028            52236           2111                  2112                42013
[root@cara sniproxy]# grep '^TcpExt:' /proc/net/netstat | cut -d ' ' -f 81-96 | column -t
TCPOFOMerge  TCPChallengeACK  TCPSYNChallenge  TCPFastOpenActive  TCPFastOpenActiveFail  TCPFastOpenPassive  TCPFastOpenPassiveFail  TCPFastOpenListenOverflow  TCPFastOpenCookieReqd  TCPFastOpenBlackhole  TCPSpuriousRtxHostQueues  BusyPollRxPackets  TCPAutoCorking  TCPFromZeroWindowAdv  TCPToZeroWindowAdv  TCPWantZeroWindowAdv
783          2145             531              1                  25                     0                   0                       0                          0                      0                     33                        6459128            52237           2111                  2112                42013

After setting net.ipv4.tcp_fastopen = 2 on the remote side:

[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
0.420
[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
0.389
[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
0.390
[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
0.416
[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
0.403
[root@cara sniproxy]# curl https://www.google.com:16443/ncr --resolve "www.google.com:16443:someserver" -s -w "%{time_total}\n" -o /dev/null
0.414
[root@cara sniproxy]# grep '^TcpExt:' /proc/net/netstat | cut -d ' ' -f 81-96 | column -t
TCPOFOMerge  TCPChallengeACK  TCPSYNChallenge  TCPFastOpenActive  TCPFastOpenActiveFail  TCPFastOpenPassive  TCPFastOpenPassiveFail  TCPFastOpenListenOverflow  TCPFastOpenCookieReqd  TCPFastOpenBlackhole  TCPSpuriousRtxHostQueues  BusyPollRxPackets  TCPAutoCorking  TCPFromZeroWindowAdv  TCPToZeroWindowAdv  TCPWantZeroWindowAdv
783          2145             531              1                  27                     0                   0                       0                          0                      0                     33                        6459146            52237           2111                  2112                42013
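The `cut -d ' ' -f 81-96` range used above depends on the column order of /proc/net/netstat, which can differ between kernel versions. A minimal Python sketch that picks the Fast Open counters by name instead. The helper names `fastopen_counters` and `counter_delta` are made up for this example, and it assumes the usual layout of one `TcpExt:` header line followed by one `TcpExt:` value line:

```python
def fastopen_counters(netstat_text):
    """Return {name: value} for every TcpExt counter whose name starts with TCPFastOpen."""
    # Keep only the TcpExt rows and drop the leading "TcpExt:" label.
    rows = [line.split()[1:] for line in netstat_text.splitlines()
            if line.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    names, values = rows[0], rows[1]
    return {name: int(value) for name, value in zip(names, values)
            if name.startswith("TCPFastOpen")}


def counter_delta(before, after):
    """Return only the counters that changed between two snapshots."""
    return {name: after[name] - before.get(name, 0)
            for name in after
            if after[name] != before.get(name, 0)}
```

To use it, read /proc/net/netstat into a string before and after a test run, pass both strings through `fastopen_counters`, and compare the results with `counter_delta`. For the two "before" snapshots above, the only Fast Open counter that changes is TCPFastOpenActiveFail (20 to 25).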

Some notes

Client: net.ipv4.tcp_fastopen = 3
Server: net.ipv4.tcp_fastopen = 2 (I don't know why the connection failed when the value is set to 3)
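For reference, net.ipv4.tcp_fastopen is a bitmask. According to the kernel's ip-sysctl documentation, bit 0x1 enables Fast Open for outgoing (client) connections and bit 0x2 enables it for listening (server) sockets. A small Python sketch that decodes the values used here. The function name is invented for this example and the higher bits are deliberately ignored:

```python
TFO_CLIENT_ENABLE = 0x1  # may send data in the SYN of outgoing connections
TFO_SERVER_ENABLE = 0x2  # may accept data in the SYN on listening sockets


def decode_tcp_fastopen(value):
    """Decode the two low bits of the net.ipv4.tcp_fastopen sysctl."""
    return {
        "client": bool(value & TFO_CLIENT_ENABLE),
        "server": bool(value & TFO_SERVER_ENABLE),
    }


for value in (1, 2, 3):
    print(value, decode_tcp_fastopen(value))
```

Read this way, a value of 2 enables only the server side on that machine, while 3 enables both directions.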

oakaigh avatar May 23 '18 08:05 oakaigh

Oh wait!

Shouldn't we use TCP_FASTOPEN_CONNECT on kernel 4.11+ and MSG_FASTOPEN on 3.7+?

2018-05-23 17:06:55 Parsed .* *:443
2018-05-23 17:07:24 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:24 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:24 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:24 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:26 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:26 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:27 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:27 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:29 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:29 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:29 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:07:29 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:08:01 waitpid: No child processes
2018-05-23 17:12:40 Parsed .* *:443
2018-05-23 17:14:43 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:43 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:44 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:44 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:44 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:44 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:45 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:45 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:46 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:46 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:46 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:46 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:52 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:52 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:52 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:14:52 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:20:17 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:20:17 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:20:17 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:20:17 Failed to open connection to 54.149.101.155:443: Operation not supported
2018-05-23 17:20:38 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:20:39 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:20:39 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:20:40 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:20:41 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:20:41 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:20:46 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:20:47 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:21:17 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:21:18 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:33:11 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:33:12 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:33:52 Failed to open connection to 52.84.235.235:443: Operation not supported
2018-05-23 17:33:52 Failed to open connection to 52.84.235.235:443: Operation not supported

https://github.com/yuryu/tfoecho/blob/master/client.cpp

oakaigh avatar May 23 '18 08:05 oakaigh

@dlundquist On kernel 4.11+, TCP_FASTOPEN_CONNECT should be used instead of MSG_FASTOPEN (which is for 3.7+). This is a problem... :/ (4.16.10-1.el7.elrepo.x86_64)

oakaigh avatar May 23 '18 13:05 oakaigh

It looks like TCP_FASTOPEN_CONNECT is a newer API that doesn't require a modified TCP client code path: https://www.mail-archive.com/[email protected]/msg149369.html. This looks like a much better approach since it doesn't introduce an alternative code path. I think, for the sake of code maintainability, it is not worth supporting TCP Fast Open as a client using MSG_FASTOPEN just to support Linux 3.7 - 4.11.

Based on the counters in your test, it doesn't look like TCP Fast Open was performed.
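To illustrate why TCP_FASTOPEN_CONNECT keeps the client code path unchanged, here is a minimal Python sketch (not sniproxy code). It assumes Linux and a blocking socket. The function name `connect_backend` is hypothetical, and the fallback value 30 is the Linux option number, used only when the Python build does not export the constant:

```python
import socket
import sys

# Linux option number, used only if this Python build lacks the constant.
TCP_FASTOPEN_CONNECT = getattr(socket, "TCP_FASTOPEN_CONNECT", 30)


def connect_backend(address, fastopen=True):
    """Connect to (host, port); return (sock, tfo_requested)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tfo_requested = False
    if fastopen and sys.platform.startswith("linux"):
        try:
            # Ask the kernel to defer the SYN until the first write.
            sock.setsockopt(socket.IPPROTO_TCP, TCP_FASTOPEN_CONNECT, 1)
            tfo_requested = True
        except OSError:
            # Option rejected (old kernel, or client-side Fast Open
            # disabled): fall back to an ordinary connect.
            pass
    try:
        # With the option set, connect() returns at once and the SYN goes
        # out with the first send, so connection errors may surface there.
        sock.connect(address)
    except OSError:
        sock.close()
        raise
    return sock, tfo_requested
```

After this, the caller uses plain `connect()` followed by `sendall()` exactly as without Fast Open, which is the maintainability argument. With MSG_FASTOPEN, the first write would instead have to be a `sendto()` carrying the destination address.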

dlundquist avatar May 23 '18 14:05 dlundquist

@dlundquist Ummm... weird. TCPFastOpenActiveFail, TCPFastOpenCookieReqd and TCPFastOpenActive increment on both the client and the server side (I'm sure no other processes are sending requests). Now it turns out, server:

TCPOFOMerge  TCPChallengeACK  TCPSYNChallenge  TCPFastOpenActive  TCPFastOpenActiveFail  TCPFastOpenPassive  TCPFastOpenPassiveFail  TCPFastOpenListenOverflow  TCPFastOpenCookieReqd  TCPFastOpenBlackhole  TCPSpuriousRtxHostQueues  BusyPollRxPackets  TCPAutoCorking  TCPFromZeroWindowAdv  TCPToZeroWindowAdv  TCPWantZeroWindowAdv
0            2                3                11                 3                      0                   0                       0                          306                    0                     0                         0                  3105            54                    54                  99

client:

TCPOFOMerge  TCPChallengeACK  TCPSYNChallenge  TCPFastOpenActive  TCPFastOpenActiveFail  TCPFastOpenPassive  TCPFastOpenPassiveFail  TCPFastOpenListenOverflow  TCPFastOpenCookieReqd  TCPFastOpenBlackhole  TCPSpuriousRtxHostQueues  BusyPollRxPackets  TCPAutoCorking  TCPFromZeroWindowAdv  TCPToZeroWindowAdv  TCPWantZeroWindowAdv
783          2172             535              416                390                    0                   0                       0                          0                      0                     40                        6583906            52782           2119                  2120                42135

Failed to open connection to 52.84.235.235:443: Operation not supported

52.84.235.235 is the backend address. Does this mean that 52.84.235.235 cannot be fast opened? As I remember, the API falls back when TFO fails.

tcpdump shows that Fast Open is requested but not accepted by the server when tcp_fastopen is set to 3 on both sides, meaning the cookie is not generated.

oakaigh avatar May 23 '18 16:05 oakaigh

@dlundquist What is your client / server configuration? It seems MSG_FASTOPEN is not compiled in on my server machine. I removed the #ifdef MSG_FASTOPEN guard; it worked at first but then failed (glibc 2.17). TCP_FASTOPEN_CONNECT doesn't work with this glibc. 😂🤣😂

oakaigh avatar May 25 '18 01:05 oakaigh

I'm developing on Debian 9.4 (libc 2.24). I verified with strace that it was using the MSG_FASTOPEN path. It looks like TCP_FASTOPEN_CONNECT was only added a year ago. If you are running a recent kernel, you might try defining them in CFLAGS: CFLAGS='-DTCP_FASTOPEN=23 -DTCP_FASTOPEN_CONNECT=30' ./configure && make.

dlundquist avatar May 25 '18 01:05 dlundquist

@dlundquist 🤣 Only to find the problem persists. MSG_FASTOPEN was compiled in when I fetched the initial commit, but when I straced the sniproxy daemon, neither TCP_FASTOPEN nor TCP_FASTOPEN_CONNECT was working. #include <netinet/tcp.h>

oakaigh avatar May 25 '18 05:05 oakaigh

@dlundquist I’ve heard TCP Fast Open will NOT work behind broken NAT gateways (mysterious to me), which suggests the mechanism is still in its experimental stage. Maybe the problem can be solved sometime in the future. Can you please keep this issue open so we can track further problems? Thanks in advance.

oakaigh avatar May 28 '18 01:05 oakaigh

Yeah, TCP Fast Open is still in the experimental phase. Given the relatively small size of this patch, I see no harm in leaving it around. TCP Fast Open requires cooperation between the client and server: the client obtains a TCP Fast Open cookie in an initial TCP connection that follows the usual three-way handshake, and can then use this cookie to send data during the three-way handshake of later connections. Given that, I'm not sure it would be appropriate in all use cases of SNIproxy. Different use cases may want TCP Fast Open on only the front end or the back end connections. This will probably need a listener configuration option to enable TCP Fast Open on frontend connections and/or backend connections, or to disable it entirely, before it is included in master.
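As a rough illustration of a per-listener switch for the frontend side, here is a Python sketch (not sniproxy code). The function name `make_listener`, the `fastopen` flag and the queue length of 16 are assumptions made for this example. The fallback value 23 is the Linux option number, used only when Python does not export `TCP_FASTOPEN`:

```python
import socket

# Linux option number, used only if this Python build lacks the constant.
TCP_FASTOPEN = getattr(socket, "TCP_FASTOPEN", 23)


def make_listener(host, port, fastopen=False, tfo_queue_len=16):
    """Create a listening socket; return (sock, tfo_active)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)
    tfo_active = False
    if fastopen:
        try:
            # On Linux the value is the maximum number of pending
            # Fast Open requests queued on this listener.
            sock.setsockopt(socket.IPPROTO_TCP, TCP_FASTOPEN, tfo_queue_len)
            tfo_active = True
        except OSError:
            # Not supported here: keep serving ordinary handshakes.
            pass
    return sock, tfo_active
```

Clients without a cookie, or without Fast Open at all, still connect through the normal three-way handshake, so the flag only adds a fast path. Whether the kernel then actually accepts data in the SYN also depends on the server bit of net.ipv4.tcp_fastopen.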

dlundquist avatar May 28 '18 06:05 dlundquist

@dlundquist Yes, a simple patch could make it work. I added MSG_FASTOPEN and received the fo_cookie.

listen 0.0.0.0 443 {
    // Enable TCP Fast Open client & server
    fastopen yes
    table http_hosts 
    // ...
}

oakaigh avatar Jun 04 '18 05:06 oakaigh