Skywiremob does not react well to connectivity changes
Describe the bug: If Skywiremob is used to create a VPN connection and the device cannot reach the backend visor for some time, the transport is disconnected and the routes are removed. After that, Skywiremob has no way to recover the connection to the backend visor, even once the problem that was blocking the connection is solved.
Environment information:
- OS: Android
Steps to Reproduce: There are multiple ways to trigger the problem, and the behavior differs slightly in each case. The first step is always to create a working VPN connection with Skywiremob. The connection can then be interrupted in several ways:
- With a device connected to the Internet only via the mobile network (an emulator can be used for this), disable the mobile network connection, which leaves the device without Internet connectivity. If you reactivate the mobile network connection immediately, the local visor recovers the connection with the backend and the VPN continues working. If you wait about 30 seconds, the console shows errors (like `yamux: keepalive failed: i/o deadline reached`), and from that point on the local visor is not able to connect to the backend visor again.
- With a device connected only via Wi-Fi the situation is similar, but there appears to be no need to wait for the connection to fail. In fact, disabling the Wi-Fi connection and immediately enabling it again is already enough to make the connection to the server unrecoverable.
- With both Wi-Fi and a mobile connection available, the behavior is different. The device tends to prefer the Wi-Fi connection, but disabling it makes the visor start using the mobile network immediately, so the connection with the backend visor is restored.
- If you leave the Android connection unchanged but close the backend visor, the log shows `level=error msg="Error resending traffic from VPN server to mobile app UDP conn" error=EOF` and the VPN does not work again after the backend visor is restarted. An interesting difference is that in this case the transport appears to be recovered.
Actual behavior: After the problem appears, the VPN connection stays unusable and the visor has to be restarted for it to work again. When this happens, the local visor keeps running and does not send any signal to the Android app, so the app cannot detect the failure in order to show an indication to the user, restart the VPN service or stop it.
Expected behavior: The connection should be recovered as soon as possible, or at least there should be a way for the Android app to know about the problem.
@Senyoret1 hey there. this is actually how skywire works. we don't have any simple way to tell that the remote node or some intermediary one failed and exclude it from the network. to handle this we have keep-alives: once no keep-alive is received during some time interval, we consider the route spoiled and close the connections, since they're not usable anyway. well, it should've at least been our own keep-alive error message, not sure what yamux has to do with this, probably missed that guy somewhere, but anyway the result is the same. I'll have to add some method to redial the connection once it's dead. also we'll need a func to notify the mobile app about the result of the process. it'll take some time, I'm on it, will let you know
I checked this problem again and saw that stopping the remote visor causes something similar: the VPN stops working and the app does not know about the error.
Today I uploaded some changes and now the Android app is able to restart the whole service without removing the protection (the network stays blocked while the service restarts). If making Skywiremob restart the connection automatically turns out to be more problematic than it should be, an alternative would be to make Skywiremob inform the app about the problem, and the app would then simply restart the whole visor.