neat
neat copied to clipboard
Cannot open SCTP connections from FreeBSD to Linux
When running tneat
as a server on Linux as follows:
$ ./tneat -P prop_sctp.json
and connecting to this server with tneat on FreeBSD as follows:
$ ./tneat -P prop_sctp.json <IP address>
on_connected
is never called on the client side on FreeBSD. However, this works when both machines are running Linux. This also works when e.g. TCP is used, so the problem seems to be related to how NEAT handles SCTP and how it is handled differently on Linux and FreeBSD.
The Linux machine ran Ubuntu 16.04.1 LTS 4.13.0-32-generic. The FreeBSD machine ran FreeBSD 11.1-RELEASE r321309.
The problem also occurs when trying to connect from FreeBSD to FreeBSD.
I ran socat
at both sides to check whether SCTP works, but it doesn't. It seems like the problem is with the SCTP implementation and not with NEAT.
socat
client running in FreeBSD:
2018/02/01 00:30:45 socat[901] I socat by Gerhard Rieger and contributors - see www.dest-unreach.org 2018/02/01 00:30:45 socat[901] I This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit. (http://www.openssl.org/) 2018/02/01 00:30:45 socat[901] I This product includes software written by Tim Hudson ([email protected]) 2018/02/01 00:30:45 socat[901] N reading from and writing to stdio 2018/02/01 00:30:45 socat[901] N opening connection to LEN=16 AF=2 192.168.10.10:12327 2018/02/01 00:30:45 socat[901] I starting connect loop 2018/02/01 00:30:45 socat[901] I socket(2, 1, 132) -> 5 2018/02/01 00:30:45 socat[901] E connect(5, LEN=16 AF=2 192.168.10.10:12327, 16): Connection reset by peer 2018/02/01 00:30:45 socat[901] N exit(1) 2018/02/01 00:30:45 socat[901] I shutdown(5, 2) 2018/02/01 00:30:45 socat[901] I shutdown(5, 2): Socket is not connected
socat
server running in Linux:
2018/02/01 01:30:40 socat[2676] I socat by Gerhard Rieger - see www.dest-unreach.org 2018/02/01 01:30:40 socat[2676] I This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit. (http://www.openssl.org/) 2018/02/01 01:30:40 socat[2676] I This product includes software written by Tim Hudson ([email protected]) 2018/02/01 01:30:40 socat[2676] N reading from and writing to stdio 2018/02/01 01:30:40 socat[2676] I socket(2, 1, 132) -> 5 2018/02/01 01:30:40 socat[2676] I starting accept loop 2018/02/01 01:30:40 socat[2676] N listening on AF=2 0.0.0.0:12327 2018/02/01 01:30:45 socat[2676] I accept(5, {2, AF=2 192.168.20.40:28199}, 16) -> 6 2018/02/01 01:30:45 socat[2676] N accepting connection from AF=2 192.168.20.40:28199 on AF=2 192.168.10.10:12327 2018/02/01 01:30:45 socat[2676] I permitting connection from AF=2 192.168.20.40:28199 2018/02/01 01:30:45 socat[2676] I close(5) 2018/02/01 01:30:45 socat[2676] I resolved and opened all sock addresses 2018/02/01 01:30:45 socat[2676] N starting data transfer loop with FDs [0,1] and [6,6] 2018/02/01 01:30:49 socat[2676] W read(6, 0x23cc710, 8192): Connection reset by peer 2018/02/01 01:30:49 socat[2676] N socket 2 to socket 1 is in error 2018/02/01 01:30:49 socat[2676] N socket 2 (fd 6) is at EOF 2018/02/01 01:30:50 socat[2676] I poll timed out (no data within 0.500000 seconds) 2018/02/01 01:30:50 socat[2676] I shutdown(6, 2) 2018/02/01 01:30:50 socat[2676] N exiting with status 0
Can you capture the traffic on the FreeBSD box on all interfaces and provide the capture file? The connect()
call returns with an error indicating "Connection reset by peer". So I guess we'll see an ABORT chunk. Not sure which box is sending it and why. Most likely suspect: a middlebox...
Are those real machines or VMs? If VMs: which adapter type do you use?
The machines are VMs. Both machines has a NAT adapter and an Internal Network adapter.
Data captured on the FreeBSD internal network interface (em1):
23:12:42.430110 IP6 fe80::a00:27ff:fe3b:b9c9 > ip6-allrouters: ICMP6, router solicitation, length 16 23:12:57.275057 IP 192.168.20.40.38101 > 192.168.10.10.12327: sctp (1) [INIT] [init tag: 2225606087] [rwnd: 1864135] [OS: 10] [MIS: 2048] [init TSN: 3768621358] 23:12:57.376455 IP 192.168.10.10.12327 > 192.168.20.40.38101: sctp (1) [INIT ACK] [init tag: 41739531] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 4268204033] 23:12:57.376516 IP 192.168.20.40.38101 > 192.168.10.10.12327: sctp (1) [COOKIE ECHO] 23:12:57.479004 IP 192.168.10.10.12327 > 192.168.20.40.38101: sctp (1) [COOKIE ACK] 23:13:02.398287 ARP, Request who-has 192.168.20.40 tell 192.168.20.20, length 46 23:13:02.398327 ARP, Reply 192.168.20.40 is-at 08:00:27:53:f7:fb (oui Unknown), length 28
Data captured on the FreeBSD loopback interface (lo0):
23:12:57.479029 IP 192.168.20.40.38101 > 10.0.2.15.12327: sctp (1) [HB REQ] 23:12:57.479071 IP 10.0.2.15.12327 > 192.168.20.40.38101: sctp (1) [ABORT]
The ABORT is captured on the loopback interface sent from the NAT service. When disabling the NAT adapter, SCTP works properly. I don't understand why the connection should fail even though the 4-way handshake has completed between the endpoints though... Or how to get SCTP to work when the NAT interface is enabled. I tried to specify only the internal network IP with the local_ips
property, but still the same problem.
If you specify the local IP addresses on both sides, it should work. There should be not address parameters in the INIT or INIT-ACK chunk. We can chat in Oslo about why it doesn't work with NAT boxes being involved...
Ah, that makes sense. I noticed that there are address parameters in the INIT and INIT-ACK chunks, so I attempted to connect the socat
client to a NEAT server with the NEAT_TAG_LOCAL_NAME
optional argument set to the local IP address and it works! However, when I attempt to connect a NEAT client to the same server, the SCTP association is established, but on_connected is not called for the client.
Packet capture: em1.txt
21:18:01.335896 IP 192.168.20.40.39895 > 192.168.10.10.12327: sctp (1) [INIT] [init tag: 3472062275] [rwnd: 1864135] [OS: 123] [MIS: 123] [init TSN: 3735537264] 21:18:01.437294 IP 192.168.10.10.12327 > 192.168.20.40.39895: sctp (1) [INIT ACK] [init tag: 2814136528] [rwnd: 106496] [OS: 123] [MIS: 123] [init TSN: 2458143342] 21:18:01.437326 IP 192.168.20.40.39895 > 192.168.10.10.12327: sctp (1) [COOKIE ECHO] 21:18:01.538415 IP 192.168.10.10.12327 > 192.168.20.40.39895: sctp (1) [COOKIE ACK] 21:18:06.561159 ARP, Request who-has 192.168.20.40 tell 192.168.20.20, length 46 21:18:06.561181 ARP, Reply 192.168.20.40 is-at 08:00:27:53:f7:fb (oui Unknown), length 28 21:18:32.539597 IP 192.168.20.40.39895 > 192.168.10.10.12327: sctp (1) [HB REQ] 21:18:32.640349 IP 192.168.10.10.12327 > 192.168.20.40.39895: sctp (1) [HB ACK] 21:18:36.671618 IP 192.168.10.10.12327 > 192.168.20.40.39895: sctp (1) [HB REQ] 21:18:36.671649 IP 192.168.20.40.39895 > 192.168.10.10.12327: sctp (1) [HB ACK] 21:18:41.631740 ARP, Request who-has 192.168.20.40 tell 192.168.20.20, length 46 21:18:41.631756 ARP, Reply 192.168.20.40 is-at 08:00:27:53:f7:fb (oui Unknown), length 28 21:19:03.205763 IP 192.168.20.40.39895 > 192.168.10.10.12327: sctp (1) [HB REQ] 21:19:03.307677 IP 192.168.10.10.12327 > 192.168.20.40.39895: sctp (1) [HB ACK] 21:19:08.947087 IP 192.168.20.40.39895 > 192.168.10.10.12327: sctp (1) [SHUTDOWN] 21:19:09.049638 IP 192.168.10.10.12327 > 192.168.20.40.39895: sctp (1) [SHUTDOWN ACK] 21:19:09.049667 IP 192.168.20.40.39895 > 192.168.10.10.12327: sctp (1) [SHUTDOWN COMPLETE]
NEAT client output:
Connecting to: 192.168.10.10:12327 Receive buffer size: 10240 bytes Rate of sending requests: 0 Number of requests to send: 1 NEAT log level: 4 User log level: 2 Statistics log rate: 0 Multihoming: disabled Send HTTP requests for data objects of size: 10 bytes Use the following properties: { "transport": { "value": ["SCTP"], "precedence": 1 }, "local_ips": [ { "value": "192.168.20.40", "precedence": 1 } ] }
Initializing NEAT context... [ 0.000000][INF] nt_log_init - opening logfile ... [ 0.000000][DBG] neat_init_ctx [ 0.000103][INF] Available src-addresses: [ 0.000111][INF] IPv6: fe80::a00:27ff:fe63:5b51/0 pref 4294967295 valid 4294967295 [ 0.000116][DBG] nt_run_event_cb [ 0.000122][INF] Available src-addresses: [ 0.000127][INF] IPv4: 10.0.2.15/0 [ 0.000132][INF] IPv6: fe80::a00:27ff:fe63:5b51/0 pref 4294967295 valid 4294967295 [ 0.000137][DBG] nt_run_event_cb [ 0.000142][INF] Available src-addresses: [ 0.000146][INF] IPv4: 192.168.20.40/0 [ 0.000151][INF] IPv4: 10.0.2.15/0 [ 0.000156][INF] IPv6: fe80::a00:27ff:fe63:5b51/0 pref 4294967295 valid 4294967295 [ 0.000160][DBG] nt_run_event_cb [ 0.000166][INF] Available src-addresses: [ 0.000170][INF] IPv6: ::1/0 pref 4294967295 valid 4294967295 [ 0.000175][INF] IPv4: 192.168.20.40/0 [ 0.000180][INF] IPv4: 10.0.2.15/0 [ 0.000185][INF] IPv6: fe80::a00:27ff:fe63:5b51/0 pref 4294967295 valid 4294967295 [ 0.000189][DBG] nt_run_event_cb [ 0.000195][INF] Available src-addresses: [ 0.000200][INF] IPv6: fe80::1/0 pref 4294967295 valid 4294967295 [ 0.000204][INF] IPv6: ::1/0 pref 4294967295 valid 4294967295 [ 0.000209][INF] IPv4: 192.168.20.40/0 [ 0.000214][INF] IPv4: 10.0.2.15/0 [ 0.000219][INF] IPv6: fe80::a00:27ff:fe63:5b51/0 pref 4294967295 valid 4294967295 [ 0.000223][DBG] nt_run_event_cb [ 0.000228][INF] Available src-addresses: [ 0.000232][INF] IPv4: 127.0.0.1/0 [ 0.000237][INF] IPv6: fe80::1/0 pref 4294967295 valid 4294967295 [ 0.000242][INF] IPv6: ::1/0 pref 4294967295 valid 4294967295 [ 0.000246][INF] IPv4: 192.168.20.40/0 [ 0.000251][INF] IPv4: 10.0.2.15/0 [ 0.000256][INF] IPv6: fe80::a00:27ff:fe63:5b51/0 pref 4294967295 valid 4294967295 [ 0.000260][DBG] nt_run_event_cb Creating new flow (1)... [ 0.000268][DBG] neat_new_flow [ 0.000323][INF] neat_new_flow - new flow created: 0x802659000 [ 0.000331][DBG] neat_set_property [ 0.000335][DBG] neat_set_property - { "transport": { "value": ["SCTP"], "precedence": 1 }, "local_ips": [ { "value": "192.168.20.40", "precedence": 1 } ] } [ 0.000359][DBG] neat_set_operations [ 0.000364][DBG] nt_update_poll_handle [ 0.000369][DBG] nt_update_poll_handle - loop is NULL or handle is closing - skipping Opening connection to 192.168.10.10 (1)... [ 0.000375][DBG] neat_open [ 0.000382][DBG] nt_add_event_cb [ 0.000386][INF] nt_add_event_cb - Added new callback for event type 0 [ 0.000391][DBG] nt_add_event_cb [ 0.000395][INF] nt_add_event_cb - Added new callback for event type 2 [ 0.000430][DBG] nt_add_event_cb [ 0.000435][INF] nt_add_event_cb - Added new callback for event type 0 [ 0.000439][DBG] send_properties_to_pm [ 0.000470][DBG] fe80::a00:27ff:fe63:5b51%em0 is a link-local address, skipping [ 0.000481][DBG] send_properties_to_pm: no match [ 0.000489][DBG] Added endpoint "192.168.20.40@em1" to PM request [ 0.000495][DBG] send_properties_to_pm: no match [ 0.000508][DBG] fe80::1%lo0 is a link-local address, skipping [ 0.000513][DBG] send_properties_to_pm: no match [ 0.000522][DBG] nt_json_send_once [ 0.000568][DBG] neat_get_event_loop Starting event loop... [ 0.000575][DBG] neat_start_event_loop [ 0.000582][DBG] on_pm_error [ 0.000586][DBG] on_pm_error [ 0.000590][DBG] ===== Unable to communicate with PM, using fallback =====, error code = 1 [ 0.001754][DBG] open_resolve_cb [ 0.001830][DBG] Address found [ 0.001849][DBG] nt_he_open [ 0.001854][DBG] HE Candidate 0: em1 [ 2] SCTP/IPv4 <saddr 192.168.20.40> <dstaddr 192.168.10.10> port 12327 priority 0 [ 0.001874][DBG] nt_find_multistream_socket [ 0.001878][DBG] nt_find_multistream_socket - 0x802659000 : skipping - self... [ 0.001883][DBG] nt_wait_for_multistream_socket [ 0.001887][DBG] nt_wait_for_multistream_socket - 0x802659000 : skipping - self... [ 0.001892][DBG] HE will now commence [ 0.001898][DBG] on_he_connect_req [ 0.001903][DBG] nt_connect [ 0.001927][INF] nt_connect: Bind fd 14 to 192.168.20.40 [ 0.001947][DBG] SCTP stream negotiation - offering : 123 in / 123 out [ 0.002489][DBG] on_he_connect_req: Connect successful for fd 14, ret = 0 [ 0.204932][DBG] he_connected_cb [ 0.204960][DBG] Invokation count: 1 - flow: 0x802659000 [ 0.204965][DBG] HE Candidate connected: em1 [ 2] SCTP/IPv4 <saddr 192.168.20.40> <dstaddr 192.168.10.10> port 12327 priority 0 [ 0.205015][DBG] he_connected_cb - Connection status: 0 - No error: 0 [ 0.205021][DBG] First successful connect (flow->hefirstConnect) [ 0.205026][DBG] send_result_connection_attempt_to_pm [ 0.205042][INF] Sending HE result to PM for caching [ 0.205047][DBG] nt_json_send_once_no_reply [ 0.205080][DBG] uvpollable_cb [ 0.205085][DBG] uvpollable_cb - awaiting notifications, socket not readable yet, skipping... [ 0.205089][DBG] nt_update_poll_handle [ 0.205094][DBG] nt_update_poll_handle - events - starting poll - readable : 1 - writable : 0 [ 0.205099][DBG] uvpollable_cb - finished [ 0.205122][DBG] on_pm_error [ 0.205142][DBG] on_pm_he_error [ 0.205146][DBG] Unable to communicate with PM, error code = 1 [ 0.205161][DBG] uvpollable_cb [ 0.205166][DBG] uvpollable_cb - awaiting notifications [ 0.205170][DBG] io_readable [ 0.205194][DBG] io_readable - got SCTP_RCVINFO [ 0.205198][DBG] io_readable - Received 22 bytes on SCTP stream 0 [ 0.205203][INF] SCTP event notification [ 0.205208][DBG] handle_sctp_event [ 0.205213][DBG] handle_sctp_assoc_change [ 0.205217][DBG] handle_sctp_assoc_change - state : SCTP_COMM_UP [ 0.205221][DBG] handle_sctp_assoc_change - supported features [ 0.205226][DBG] - PR [ 0.205244][DBG] - MULTIBUF [ 0.205249][DBG] nt_update_poll_handle [ 0.205253][DBG] nt_update_poll_handle - events - starting poll - readable : 1 - writable : 0 [ 0.205271][DBG] uvpollable_cb - finished Received signal Signal received: SIGINT Stop event loop in signal handler... [ 67.607843][DBG] neat_stop_event_loop [ 67.613428][DBG] neat_free_ctx [ 67.613454][DBG] nt_remove_event_cb [ 67.613459][INF] nt_remove_event_cb - Removed callback for type 0 [ 67.613463][DBG] nt_remove_event_cb [ 67.613467][INF] nt_remove_event_cb - Removed callback for type 2 [ 67.613484][DBG] nt_close_socket [ 67.613553][DBG] nt_free_flow [ 67.613562][INF] nt_free_flow - removing 0x802659000 [ 67.613568][DBG] nt_free_candidates [ 67.613574][DBG] nt_free_candidate [ 67.613579][DBG] nt_free_candidate: Handle used by flow, flow should release it [ 67.613584][DBG] nt_free_flow - closing handle and waiting for socket_handle_free_cb [ 67.613591][DBG] nt_walk_cb [ 67.613615][DBG] nt_walk_cb - closing handle [ 67.613621][DBG] nt_walk_cb [ 67.613626][DBG] nt_walk_cb - closing handle [ 67.613637][DBG] nt_walk_cb [ 67.613642][DBG] nt_walk_cb - handle->type == UV_IDLE - skipping [ 67.613646][DBG] nt_walk_cb [ 67.613651][DBG] nt_walk_cb [ 67.613660][DBG] synchronous_free [ 67.613666][DBG] synchronous_free - neat_resolver_free_results [ 67.613770][INF] nt_log_close - closing logfile ... Left the event loop... ERROR: Client 1 failed to connect Start time: Sun Feb 4 20:18:01 2018 End time: Sun Feb 4 20:19:08 2018 Elapsed time: 67.615185 s Average throughput: 0.000000 Mbit/s
As can be seen from the client log, the only SCTP event notification received is the SCTP_ASSOC_CHANGE event. No SCTP_ADAPTATION_INDICATION event is received, etc.
When running tneat as a server on Linux as follows: .... The Linux machine ran Ubuntu 16.04.1 LTS 4.13.0-32-generic. ... As can be seen from the client log, the only SCTP event notification received is the SCTP_ASSOC_CHANGE event. No SCTP_ADAPTATION_INDICATION event is received, etc.
You're at risk of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7906b00f5cd1cd484fced7fcda892176e3202c8a v4.13.3 or newer should contain this fix.