jicofo

Conference is not moved to another jvb when JVB's XMPP connection is broken

Open. paweldomas opened this issue 1 year ago • 1 comment

Description

When Octo is disabled and a JVB becomes unresponsive or its XMPP connection is broken (we've seen this when a bridge is nearly out of memory and struggling with garbage collection), any XMPP request to that JVB will time out. If there's a conference on such a bridge and a new user joins, the user will not be added to the conference correctly.

Current behavior

If a JVB becomes unresponsive, Jicofo times out on the Colibri v2 allocate request and then fails to add new participants with the error "A new bridge was selected, but Octo is disabled". The conference is not moved to another JVB. This is the Octo-disabled case.

With Octo enabled, the existing participants would probably stay isolated from the new one. If the JVB is really not working, though, each of them should get an ICE failure and attempt to establish a new session, and could eventually be moved to the same JVB as the new participant.

Expected Behavior

If a JVB stays unresponsive for a prolonged period, Jicofo should let the new user connect by moving the conference to another bridge.

Possible Solution

Move the conference to another JVB when a bridge is timing out on all requests.
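
A minimal sketch of what such a policy could look like, in Python for brevity. This is not Jicofo code: `BridgeTimeoutTracker`, `choose_bridge`, and the threshold of 3 consecutive timeouts are all hypothetical names and values, chosen only to illustrate the idea of treating a bridge as dead once every recent request to it has timed out.

```python
from collections import defaultdict


class BridgeTimeoutTracker:
    """Counts consecutive request timeouts per bridge (hypothetical helper)."""

    def __init__(self, max_consecutive_timeouts=3):
        # Assumed threshold; a real value would need tuning.
        self.max_consecutive_timeouts = max_consecutive_timeouts
        self._timeouts = defaultdict(int)

    def record_timeout(self, bridge_id):
        self._timeouts[bridge_id] += 1

    def record_success(self, bridge_id):
        # Any successful response means the bridge is reachable again.
        self._timeouts[bridge_id] = 0

    def is_unresponsive(self, bridge_id):
        return self._timeouts[bridge_id] >= self.max_consecutive_timeouts


def choose_bridge(current, candidates, tracker):
    """Return the bridge the conference should use.

    Keeps the current bridge while it is responsive. Otherwise returns the
    first responsive candidate, or None if no candidate is responsive.
    """
    if not tracker.is_unresponsive(current):
        return current
    for bridge in candidates:
        if bridge != current and not tracker.is_unresponsive(bridge):
            return bridge
    return None


tracker = BridgeTimeoutTracker()
for _ in range(3):
    tracker.record_timeout("jvb1")  # e.g. allocate requests and health checks
print(choose_bridge("jvb1", ["jvb1", "jvb2"], tracker))
```

Resetting the counter on any success keeps a single slow response from triggering a move. Only a bridge that fails every request in a row is abandoned.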

Steps to reproduce

  1. Start a conference with 2 participants on JVB 1.
  2. Cut off the XMPP connection using iptables on the JVB 1 machine:

iptables -A OUTPUT -p tcp --dport 5222 -j DROP

  3. Wait around 30 seconds until Jicofo starts reporting health check timeouts for JVB 1.
  4. Try joining the meeting with a 3rd participant.

paweldomas, Mar 09 '23 16:03

As a workaround it may be possible to enable Octo in Jicofo, but keep it disabled on the JVB.

paweldomas, Mar 09 '23 19:03