jitsi-videobridge
Health check does not detect broken MUC connection
Description
The videobridge health check (especially the /about/health endpoint) does not detect when the videobridge is unable to connect to a configured MUC (https://github.com/jitsi/jitsi-videobridge/blob/master/doc/muc.md).
Current behavior
The log file shows the connection errors, but the health check endpoint returns 200, even after waiting for a few minutes.
Expected Behavior
The health check endpoint should return 500.
Possible Solution
The code seems to only check that the XMPP component connections are alive, not the MUC connections.
https://github.com/jitsi/jitsi-videobridge/blob/06cfe3eeb94f68621ca75aec5cbe9205f55cea5d/src/main/java/org/jitsi/videobridge/health/Health.java#L116-L135
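A minimal sketch of what such an extended check could look like, assuming each MUC client connection can report both whether its XMPP stream is connected and whether it has actually joined its MUC. The names MucConnectionStatus, StaticStatus and MucHealthCheck are hypothetical and are not part of the jitsi-videobridge or Smack APIs; the real Health.java may be structured quite differently.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical view of one MUC client connection (not the real MucClient API). */
interface MucConnectionStatus {
    String id();
    boolean isConnected(); // XMPP stream is up and authenticated
    boolean isInMuc();     // the configured MUC has actually been joined
}

/** Fixed-value implementation, only used to exercise the check in this sketch. */
final class StaticStatus implements MucConnectionStatus {
    private final String id;
    private final boolean connected;
    private final boolean inMuc;

    StaticStatus(String id, boolean connected, boolean inMuc) {
        this.id = id;
        this.connected = connected;
        this.inMuc = inMuc;
    }

    @Override public String id() { return id; }
    @Override public boolean isConnected() { return connected; }
    @Override public boolean isInMuc() { return inMuc; }
}

final class MucHealthCheck {
    private MucHealthCheck() {}

    /** Returns the first problem found, or an empty Optional if every connection is usable. */
    static Optional<String> firstFailure(List<? extends MucConnectionStatus> connections) {
        for (MucConnectionStatus c : connections) {
            if (!c.isConnected()) {
                return Optional.of("XMPP connection " + c.id() + " is not connected");
            }
            // A connection that re-authenticated but failed to rejoin its MUC is still broken.
            if (!c.isInMuc()) {
                return Optional.of("XMPP connection " + c.id() + " has not joined its MUC");
            }
        }
        return Optional.empty();
    }

    /** Maps the check result to the status code the health endpoint would return. */
    static int httpStatus(List<? extends MucConnectionStatus> connections) {
        return firstFailure(connections).isPresent() ? 500 : 200;
    }
}
```

Under these assumptions, a connection that is connected but not in its MUC makes httpStatus return 500, which is the behavior requested under Expected Behavior.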
Steps to reproduce
1. Stop the prosody server.
2. Run curl -v localhost:8080/about/health and note that it returns 200 instead of 500.
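The same probe can be scripted when reproducing this repeatedly. The sketch below is one possible way to do it with the JDK's HttpURLConnection. The class name HealthProbe and the 2-second timeouts are assumptions for illustration; only the URL comes from this report.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class HealthProbe {
    private HealthProbe() {}

    /** Sends a GET request to the given URL and returns the HTTP status code. */
    static int statusOf(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(2000); // assumed timeout, in milliseconds
        conn.setReadTimeout(2000);    // assumed timeout, in milliseconds
        try {
            // getResponseCode() returns error statuses such as 500 instead of throwing.
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /** The endpoint is treated as healthy only when it answers 200. */
    static boolean isHealthy(int status) {
        return status == 200;
    }
}
```

Calling HealthProbe.statusOf("http://localhost:8080/about/health") after stopping prosody would be expected to return 500 once the bug is fixed. With the behavior described here it returns 200.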
Environment details
jitsi-videobridge2 2.1-183-gdbddd169-1
JVB is started with --apis=rest (note the missing XMPP API), because the XMPP component is not configured in prosody; only the MUC is used.
Had this happen again when Prosody was restarted. First the connection fails, then it reconnects successfully but fails to join the MUC because the MUC hasn't been created yet. Even though JVB is not in the MUC, the health check still returns success. The only way to recover from this is to manually restart JVB, which is very annoying.
Dec 13, 2020 8:52:52 PM org.jivesoftware.smack.AbstractXMPPConnection callConnectionClosedOnErrorListener
WARNING: Connection XMPPTCPConnection[[email protected].[redacted]/1Lumvoze] (0) closed with error
org.jivesoftware.smack.XMPPException$StreamErrorException: system-shutdown You can read more about the meaning of this stream error at http://xmpp.org/rfcs/rfc6120.html#streams-error-conditions
<stream:error><system-shutdown xmlns='urn:ietf:params:xml:ns:xmpp-streams'/><text>Received SIGTERM</text></stream:error>
at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.parsePackets(XMPPTCPConnection.java:1064)
at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.access$300(XMPPTCPConnection.java:1000)
at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader$1.run(XMPPTCPConnection.java:1016)
at java.lang.Thread.run(Thread.java:748)
Dec 13, 2020 8:52:52 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Closed on error:
org.jivesoftware.smack.XMPPException$StreamErrorException: system-shutdown You can read more about the meaning of this stream error at http://xmpp.org/rfcs/rfc6120.html#streams-error-conditions
<stream:error><system-shutdown xmlns='urn:ietf:params:xml:ns:xmpp-streams'/><text>Received SIGTERM</text></stream:error>
at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.parsePackets(XMPPTCPConnection.java:1064)
at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.access$300(XMPPTCPConnection.java:1000)
at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader$1.run(XMPPTCPConnection.java:1016)
at java.lang.Thread.run(Thread.java:748)
Dec 13, 2020 8:52:57 PM org.jitsi.utils.logging2.LoggerImpl log
INFO: Performed a successful health check in PT0S. Sticky failure: false
Dec 13, 2020 8:53:05 PM org.jitsi.utils.logging2.LoggerImpl log
WARNING: Reconnection failed:
org.jivesoftware.smack.SmackException$ConnectionException: The following addresses failed: 'jitsi-prosody:5222' failed because: jitsi-prosody/192.168.69.35 exception: java.net.ConnectException: Connection refused (Connection refused)
at org.jivesoftware.smack.SmackException$ConnectionException.from(SmackException.java:278)
at org.jivesoftware.smack.tcp.XMPPTCPConnection.connectUsingConfiguration(XMPPTCPConnection.java:619)
at org.jivesoftware.smack.tcp.XMPPTCPConnection.connectInternal(XMPPTCPConnection.java:902)
at org.jivesoftware.smack.AbstractXMPPConnection.connect(AbstractXMPPConnection.java:383)
at org.jivesoftware.smack.ReconnectionManager$2.run(ReconnectionManager.java:289)
at java.lang.Thread.run(Thread.java:748)
Dec 13, 2020 8:53:07 PM org.jitsi.utils.logging2.LoggerImpl log
INFO: Performed a successful health check in PT0S. Sticky failure: false
Dec 13, 2020 8:53:07 PM org.jitsi.utils.logging2.LoggerImpl log
INFO: Running expire()
Dec 13, 2020 8:53:17 PM org.jitsi.utils.logging2.LoggerImpl log
INFO: Performed a successful health check in PT0S. Sticky failure: false
Dec 13, 2020 8:53:17 PM org.jitsi.utils.logging2.LoggerImpl log
INFO: Connected.
Dec 13, 2020 8:53:17 PM org.jitsi.utils.logging2.LoggerImpl log
INFO: Leaving a MUC we already occupy.
Dec 13, 2020 8:53:17 PM org.jivesoftware.smack.AbstractXMPPConnection callConnectionAuthenticatedListener
SEVERE: Exception in authenticated listener
java.lang.RuntimeException: org.jivesoftware.smack.XMPPException$XMPPErrorException: XMPP error reply received from [email protected].[redacted]/jitsi-jvb-86cb8c4558-nfl2b: XMPPError: item-not-found - cancel
at org.jitsi.xmpp.mucclient.MucClient$1.authenticated(MucClient.java:287)
at org.jivesoftware.smack.AbstractXMPPConnection.callConnectionAuthenticatedListener(AbstractXMPPConnection.java:1297)
at org.jivesoftware.smack.AbstractXMPPConnection.afterSuccessfulLogin(AbstractXMPPConnection.java:572)
at org.jivesoftware.smack.tcp.XMPPTCPConnection.afterSuccessfulLogin(XMPPTCPConnection.java:379)
at org.jivesoftware.smack.tcp.XMPPTCPConnection.loginInternal(XMPPTCPConnection.java:444)
at org.jivesoftware.smack.AbstractXMPPConnection.login(AbstractXMPPConnection.java:491)
at org.jivesoftware.smack.AbstractXMPPConnection.login(AbstractXMPPConnection.java:448)
at org.jivesoftware.smack.ReconnectionManager$2.run(ReconnectionManager.java:294)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.jivesoftware.smack.XMPPException$XMPPErrorException: XMPP error reply received from [email protected].[redacted]/jitsi-jvb-86cb8c4558-nfl2b: XMPPError: item-not-found - cancel
at org.jivesoftware.smack.XMPPException$XMPPErrorException.ifHasErrorThenThrow(XMPPException.java:132)
at org.jivesoftware.smack.StanzaCollector.nextResultOrThrow(StanzaCollector.java:263)
at org.jivesoftware.smackx.muc.MultiUserChat.enter(MultiUserChat.java:355)
at org.jivesoftware.smackx.muc.MultiUserChat.createOrJoin(MultiUserChat.java:498)
at org.jivesoftware.smackx.muc.MultiUserChat.createOrJoin(MultiUserChat.java:444)
at org.jitsi.xmpp.mucclient.MucClient$MucWrapper.join(MucClient.java:769)
Dec 13, 2020 8:53:27 PM org.jitsi.utils.logging2.LoggerImpl log
INFO: Performed a successful health check in PT0S. Sticky failure: false