
Only switch to failed after all candidates are exhausted?

Open jech opened this issue 5 years ago • 7 comments

As pointed out in https://github.com/pion/webrtc/issues/1212#issuecomment-686932358, Pion's handling of the failed state is somewhat naive: Pion will switch to failed if connectivity cannot be established even if it hasn't received an end-of-candidates indication.

There's a tradeoff here: the current behaviour works well enough in practice, and it gracefully handles the case when the peer never sends an end-of-candidates indication. On the other hand, it means that if the peer is performing expensive gathering (e.g. because a TURN server is overloaded), we might spuriously switch into failed.

jech avatar Sep 05 '20 12:09 jech

@Sean-Der just clarifying: it should only fail if end-of-candidates is detected? If so, how do we handle it gracefully if the peer never sends an end-of-candidates indication?

Because it would break this test (and potentially the implementation): https://github.com/pion/ice/blob/62d1c40d60d65116202d743cb47bda9b1a78cd9f/agent_test.go#L1307

scorpionknifes avatar Nov 30 '20 03:11 scorpionknifes

Hey @scorpionknifes

Sorry for the confusion, but I don't think we can move forward on this. Behavior between browsers is inconsistent. I am afraid to implement something and then have browsers diverge. I am sorry I didn't do more research before recommending this issue :(

  • https://stackoverflow.com/a/51641822/5472819
  • https://github.com/meetecho/janus-gateway/issues/1670

Sean-Der avatar Nov 30 '20 05:11 Sean-Der

We would love your PR though, and maybe engage with webrtc-pc on what the proper behavior is?

Sean-Der avatar Nov 30 '20 05:11 Sean-Der

Small timeout when eoc is received, larger timeout if it isn't?

jech avatar Dec 06 '20 22:12 jech

@jech cool idea. idk if this is a feature people want?

scorpionknifes avatar Dec 07 '20 10:12 scorpionknifes

There's a tradeoff here: on the one hand, we don't want to give up too early; on the other hand, we want to switch to failed as early as possible, both to give the user timely feedback and possibly to trigger an ICE restart. Question: would using the end-of-candidates indication allow us to fail faster without breaking connectivity in cases where it currently works? I'm hoping somebody will do the necessary experiments, and I sure hope that won't be me ;-)

jech avatar Dec 07 '20 13:12 jech

There is some related discussion at https://tools.ietf.org/html/draft-ietf-ice-trickle-21#section-14.

jech avatar Dec 10 '20 01:12 jech