loxone-java
App fails to recover a websocket connection after reconnecting
Hello, I am using the websocket connectivity to receive statistics, and once in a while (every ~2-3 days) the websocket connection gets closed by the miniserver. The library tries to reconnect a few times, but each attempt fails with "Connection or authentication failed".
After a few attempts it gives up, leaving the connection broken. Only a manual restart of the Docker container helps: the app then logs in correctly and everything works again. This suggests that the credentials are correct.
More detailed logs:
Any idea what can be wrong?
Hello, thanks for reporting. First, please set the logging level to DEBUG or even TRACE to gather more info.
I enabled trace logs for cz.smarteon and will provide them once the case happens again.
All right, here it is.
Note that after I started the app again, it connected and all works.
Is there anything I can provide to help with the investigation?
Well, after some investigation it looks like there is a race condition on the first reauthentication attempt after a disconnect: the session key is created twice, which is probably the reason for the subsequent failures. We need more time to investigate, as the code is quite complex in terms of thread safety.
In the meantime you could register your own LoxoneWebSocketListener and, when the connection is closed remotely, throw away the instance of the Loxone class (do not forget to call stop first) and create a new one.
It would also be nice to find out why your miniserver is disconnecting. For instance, is the KEEP ALIVE sending working correctly? Have you tried manually restarting the miniserver while the app is running to see whether this issue occurs?
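A rough sketch of that workaround, in case it helps. Note that Client, RemoteCloseListener and ReconnectSupervisor are hypothetical stand-ins written for this example, not the library's API; in a real app the factory would build and configure a new Loxone instance, and onRemoteClosed would be called from your LoxoneWebSocketListener implementation.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the library's client; only start/stop are modelled. */
interface Client {
    void start();
    void stop();
}

/** Hypothetical callback, fired when the remote side closes the websocket. */
interface RemoteCloseListener {
    void onRemoteClosed();
}

/** Discards the current client on remote close and replaces it with a fresh one. */
final class ReconnectSupervisor implements RemoteCloseListener {
    private final Supplier<Client> factory; // builds a fully configured new client
    private Client current;
    private int rebuilds;

    ReconnectSupervisor(Supplier<Client> factory) {
        this.factory = factory;
    }

    /** Creates and starts the first client. */
    synchronized void start() {
        current = factory.get();
        current.start();
    }

    @Override
    public synchronized void onRemoteClosed() {
        Client old = current;
        if (old != null) {
            try {
                old.stop(); // stop first, as advised above
            } catch (RuntimeException e) {
                // The old instance is being discarded anyway, so ignore stop failures.
            }
        }
        current = factory.get(); // fresh instance, no stale session state
        current.start();
        rebuilds++;
    }

    synchronized Client current() {
        return current;
    }

    synchronized int rebuilds() {
        return rebuilds;
    }
}
```

If the library invokes the listener on its own websocket thread, it may be safer to hand the stop-and-recreate step off to a separate executor rather than run it inside the callback; that is an assumption about threading that would need checking against the actual code.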
I improved the unit test in #215, which now probably catches the issue.
The app reconnects normally if I deploy a new config and also if I restart the miniserver through the Loxone app. Good question, why does the miniserver keep disconnecting... I do not know.
As a temporary solution I set up a cron job that restarts the container every night. Not ideal, but it works.
How do I check the keep-alive?
At the debug level you should see "Sending websocket message: keepalive". It should occur every 4th minute.
One more thing comes to mind: maybe it's token expiration. Do you use the token autorefresh function? If not, check the LoxoneAuth class API.
I see the keepalive messages, so that is OK. Token refresh was disabled, so I enabled it and deployed the change. Let's see. I will provide some feedback later on.
Have you gathered any more information since then?
It stopped happening after I enabled the token refresh.