grpc-swift

Network Switching Wifi <-> LTE

Open decanus opened this issue 5 years ago • 14 comments

I seem to be having issues with this library when using bidirectional streams: peers seem to disconnect when the network switches. Is this a known issue?

decanus avatar Nov 18 '20 19:11 decanus

gRPC doesn't do anything to facilitate a network switch. Using Network.framework via NIO Transport Services (see here) should help, however.

glbrntt avatar Nov 19 '20 08:11 glbrntt

But it seems to say that that is the default?

decanus avatar Nov 19 '20 09:11 decanus

It's the default if supported where you're running and you use PlatformSupport.makeEventLoopGroup, yes.
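
A minimal sketch of what that looks like with the grpc-swift 1.x client API. The host name and port are placeholders, and the insecure (plaintext) builder is used only to keep the example short:

```swift
import GRPC
import NIO

// Returns a Network.framework-backed group (NIO Transport Services) where the
// platform supports it, and a POSIX-socket group otherwise.
let group = PlatformSupport.makeEventLoopGroup(loopCount: 1)
print("event loop group type: \(type(of: group))")

// "example.com" and 50051 are placeholder values, not taken from this thread.
let connection = ClientConnection
    .insecure(group: group)
    .connect(host: "example.com", port: 50051)

// ... create generated clients on `connection` and make calls here ...

// Tear everything down when finished.
try connection.close().wait()
try group.syncShutdownGracefully()
```

Printing the group's type is a quick way to check which transport was actually selected on a given device.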

glbrntt avatar Nov 19 '20 09:11 glbrntt

Yup, that is the case, but the connection still seems to fail when switching between networks.

decanus avatar Nov 19 '20 09:11 decanus

What are you connecting to?

Lukasa avatar Nov 19 '20 10:11 Lukasa

A Go gRPC server

decanus avatar Nov 19 '20 10:11 decanus

What network is it on? Is it on your local LAN, or is it a server accessed over the internet?

Lukasa avatar Nov 19 '20 10:11 Lukasa

Over the internet.

decanus avatar Nov 19 '20 10:11 decanus

OK, so this is expected behaviour with basically all networking libraries. When your networking environment changes, it is not necessarily possible to preserve a TCP connection. Consider the two directions we're talking about here with wifi and LTE. There are two cases: moving from wifi to cellular, and moving back from cellular to wifi.

Generally one moves from wifi to cellular because the wifi network is no longer available. If that has happened, your TCP connection cannot continue: the packets are being routed to your phone's address on the wifi network, but your phone isn't there anymore. This connection will be lost; that is unavoidable.

When moving from cellular to wifi, in principle it is possible to keep existing connections alive. After all, just because you have a wifi network doesn't mean the cellular connection was lost! In this case it's up to the OS to decide whether it wants to disconnect from the cellular radio and route data over the wifi. If it chooses to do that, we will again lose the connection. This is not guaranteed to happen: the OS is free to decide what it wants to do here, and it is possible our connections will stay up for some period of time.

There are some ways in principle to work around this. The best available solution for gRPC-over-HTTP/2 is to enable Multipath TCP (MPTCP). Multipath TCP allows the phone and server to collaborate to associate the same connection with multiple network paths. When used in this way it is possible to "fail over" a connection to the best available interface. The NIOTSEventLoopGroup can be used to configure MPTCP if the server is appropriately configured to support it as well, but this requires active configuration effort from the server operator.
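
For illustration only, this is roughly how Multipath TCP is requested when using Network.framework directly. It is not wired into grpc-swift; it only shows the underlying `NWParameters.multipathServiceType` setting. The host and port are placeholders, the server must also support MPTCP, and on iOS the app will typically need the Multipath capability enabled as well:

```swift
import Network

// Start from the default TLS-over-TCP parameters.
let parameters = NWParameters.tls

// Handover mode: stay on the preferred interface and move the connection to
// another interface only when the preferred one degrades or disappears.
parameters.multipathServiceType = .handover

// "example.com" and 443 are placeholder values.
let connection = NWConnection(host: "example.com", port: 443, using: parameters)

connection.stateUpdateHandler = { state in
    print("connection state: \(state)")
}
connection.start(queue: .main)
```

The other service types (`.interactive` and `.aggregate`) trade battery and cellular data usage for latency or throughput, so `.handover` is usually the conservative choice for surviving a wifi-to-cellular switch.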

Future networking libraries may make this easier. The IETF is constantly working to solve the problem of so-called "connection mobility". QUIC has some interesting technologies that may address this problem.

Ultimately, however, your application needs to be resilient to the possibility of the networking environment changing underneath it. Connections can fail for all kinds of reasons, and your application will need to re-establish those connections and understand how to manage the possibility of data loss events.
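
As a rough sketch of the reconnection side of that advice, the helper below computes exponentially growing, jittered delays between reconnect attempts. The function names are hypothetical and the default values are chosen to resemble gRPC's documented connection backoff parameters; this is not grpc-swift's own implementation:

```swift
import Foundation

/// Base delay before reconnect attempt `attempt` (0-based):
/// grows exponentially from `initial` and is capped at `maximum`.
func backoffDelay(
    attempt: Int,
    initial: TimeInterval = 1.0,
    multiplier: Double = 1.6,
    maximum: TimeInterval = 120.0
) -> TimeInterval {
    let raw = initial * pow(multiplier, Double(attempt))
    return min(raw, maximum)
}

/// Randomly scales `delay` by up to +/- `jitter` (a fraction such as 0.2)
/// so that many clients do not all reconnect at the same instant.
func jittered(_ delay: TimeInterval, jitter: Double = 0.2) -> TimeInterval {
    let factor = Double.random(in: (1 - jitter)...(1 + jitter))
    return delay * factor
}

for attempt in 0..<5 {
    let delay = jittered(backoffDelay(attempt: attempt))
    print("attempt \(attempt): wait about \(delay) seconds before reconnecting")
}
```

After a successful reconnect, any bidirectional stream that was in flight has to be started again by the application, and messages that were not acknowledged at the application level may need to be resent.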

Lukasa avatar Nov 19 '20 10:11 Lukasa

Hi,

Now that Network.framework supports QUIC (on iOS 15), is there any plan to support gRPC over QUIC?

Antonito avatar Jul 13 '21 20:07 Antonito

In my view, there is minimal utility in doing this until it’s interoperable with a widespread specification.

Lukasa avatar Jul 13 '21 20:07 Lukasa

If I'm not mistaken, QUIC was standardized a few months ago?

I read some discussions about gRPC over QUIC and gRPC over HTTP3 – I get that HTTP3 isn't standardized yet.

Yet some apps (I'm thinking Google Duo for instance) already use gRPC over QUIC, but it's complex to implement as of today without relying on the C++ implementations. I think the Swift community could benefit from this support, especially if it can be leveraged by Network.framework.

I understand your position as a library maintainer though.

I'm building a communication app which could heavily benefit from QUIC support (I'm currently using gRPC over HTTP2 and WebRTC, and would like to ultimately transition all my networking over to QUIC) – what would be the steps required to play around with this? A 'simple' fork of swift-nio-transport-services to add QUIC, plus a fork of this repository to switch the transport from HTTP2 to QUIC, provided my servers are QUIC-ready?

Also, could you explain in a bit more detail how to enable MPTCP with the current implementation? Unless I'm mistaken, neither PlatformSupport.makeEventLoopGroup nor ClientConnection.Builder provides an option to specify a value for NWParameters.multipathServiceType. (Maybe this could be explained more clearly in a docs/ file?)

Thanks a lot for your time & help!

Antonito avatar Jul 14 '21 13:07 Antonito

I'm not saying that QUIC hasn't been specced, I'm saying that gRPC over QUIC hasn't been specced yet. I don't think the C++ gRPC implementation has any support for QUIC (see grpc/grpc#19126).

If you wanted to play around with this you would need to fork both swift-nio-transport-services and grpc-swift and then perform a fairly substantial rewrite, to avoid using swift-nio-http2.

Adding MPTCP should be a separate issue, because I think it's best solved by letting you pass an NWProtocolOptions.TCP. We've added this recently for NWProtocolOptions.TLS, so I think an equivalent API would be reasonable.

Lukasa avatar Jul 14 '21 13:07 Lukasa

Alright, thanks for the details – I'll give it a try whenever I have some free days.

I thought this issue would be a good fit for MPTCP, as it was mentioned earlier and is one of the solutions to the original problem – but I'll open another issue, sure :) I also agree with your proposed solution: it's consistent with existing behavior and straightforward to use.

Antonito avatar Jul 14 '21 13:07 Antonito