ai/live: Terminate stream on ICE disconnect.
While the disconnected state is not necessarily terminal (media may start flowing again before the ICE timeout [1]), recovery happens rarely enough [2] that we just kill the peer connection to avoid other timeouts later in the process, e.g. segment copy.
[1] Easy way to test: plug in an Ethernet cable, disable WiFi, unplug the cable, wait a little more than 5 seconds, re-plug. Things should be fine in this case.
[2] Three occurrences of the state sequence "disconnected -> connected" in prod and one on staging, all within the past 30 days.
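A minimal sketch of the policy choice described above, not the actual change in this PR. The `ICEState` type, its constants, and `shouldTerminate` are hypothetical names made up for illustration; a real server would react to whatever state values its WebRTC library reports.

```go
package main

// ICEState is a hypothetical stand-in for the ICE connection states
// reported by a WebRTC library.
type ICEState string

const (
	ICEConnected    ICEState = "connected"
	ICEDisconnected ICEState = "disconnected"
	ICEFailed       ICEState = "failed"
)

// shouldTerminate reports whether the server should tear down the peer
// connection when it enters the given state.
//
// eager=true models the behavior proposed here: treat "disconnected" as
// terminal. eager=false models waiting for the ICE timeout, so only
// "failed" is terminal and a "disconnected -> connected" recovery is
// still possible.
func shouldTerminate(state ICEState, eager bool) bool {
	switch state {
	case ICEFailed:
		return true
	case ICEDisconnected:
		return eager
	default:
		return false
	}
}
```

With `eager` set, the rare recoveries from [2] are cut short in exchange for failing fast before later steps such as segment copy hit their own timeouts.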
So what will happen now? The frontend broadcast component will retry it, right?
If the app is still there, yes, it should, but that is rare (see the dashboards that I shared in Discord; typically less than 1% of user connections exhibit this behavior). More often, the app just goes away, which is why we just terminate the stream now.
Testing with the Daydream app does show that it reconnects on a disconnect so that seems okay.
On a mechanical level: the app enters the disconnected state because there is no connectivity between the app and the server. While the server will tear down the connection with this PR, the DTLS close_notify usually won't make it to the client because there is no connectivity. My tiny client app times out into a failed state a few seconds after disconnect.
I am not sure exactly which conditions make Daydream retry, but that seems to be OK.
Ok, LGTM
I have been re-reviewing the data - the rate of non-terminal disconnects is a bit higher when excluding e2e tests so I'm now leaning more towards https://github.com/livepeer/go-livepeer/pull/3642
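For reference, a rough sketch of how non-terminal disconnects could be counted from a per-connection list of ICE states. The function name and the input shape (an ordered slice of state names) are assumptions for illustration; this is not the query behind the dashboards.

```go
package main

// countRecoveries counts how many times "disconnected" is immediately
// followed by "connected" in an ordered list of ICE state names, i.e.
// how many disconnects turned out to be non-terminal.
func countRecoveries(states []string) int {
	n := 0
	for i := 1; i < len(states); i++ {
		if states[i-1] == "disconnected" && states[i] == "connected" {
			n++
		}
	}
	return n
}
```

A connection that goes "connected", "disconnected", "connected", "disconnected", "failed" would count as one recovery, since only the first disconnect was followed by a reconnect.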
https://github.com/livepeer/go-livepeer/pull/3642 has been working well, so closing this in favor of that.