libqatemcontrol
Connection handling for dropout/bad network
Hi, I'm wondering if it would be a good feature for the library to internally and automatically try to reestablish a connection to the switcher in the event of network failure, unplugged cabling or a power cut, etc.
I have built my own timer-based watchdog around this library for my application, but perhaps it would be better for this to be self-contained within the library? I can't imagine many users of these switchers not wanting a solid, reliable connection.
Please let me know if I have missed something in the source code, but I could not see any mechanisms like this already in place.
I noticed that I was receiving the following in my connection watchdog code and found it was coming internally from the library: QNativeSocketEngine::bind() was not called in QAbstractSocket::UnconnectedState
The call to bind() is in QAtemConnection::connectToSwitcher.
I have changed this to the following:
m_socket->bind(m_port, QUdpSocket::ShareAddress|QUdpSocket::ReuseAddressHint);
This seems to have fixed the warning, and also my reconnection issues when connectToSwitcher is called multiple times.
I'm not sure if this is the correct strategy but I now have a persistent connection to the ATEM regardless of poor network quality or hardware failure - just keeping you in the loop in case others have this issue.
Hi, sorry for the very late reply. :( There is already a built-in timer, but it does not try to reconnect on its own; instead it emits a disconnected signal. So your app only needs to connect to that signal instead of implementing its own timer. I'll add support for changing the timeout, but currently it's set to 1 second.
Thanks for the response. I could only find m_connectionTimer, which calls handleConnectionTimeout if no data has been received for 1 second, and that in turn emits disconnected. Within the library I can't see anything which connects disconnected back to any reconnection logic. When handleError or handleConnectionTimeout is called, m_socket is closed. I can't see anything that re-opens it or calls connectToSwitcher again. If I am mistaken, please point out where and how this is happening.
Does my application need to connect disconnected back to connectToSwitcher? I am unable to re-establish a connection to the switcher if I don't do this in my application. I also can't connect disconnected directly to connectToSwitcher; reconnecting only seems to work after a 2-second delay using my own timer.
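The delayed reconnect described above can be sketched as follows. This is a minimal model in plain standard C++, not code from libqatemcontrol and not using Qt: the class name DelayedReconnector, its methods, and the callback are all hypothetical. In a real application the callback would presumably call connectToSwitcher, onDisconnected would be invoked from a slot connected to the disconnected signal, and poll would be driven by a timer.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper: schedules one reconnect attempt a fixed delay after
// each disconnect, instead of reconnecting immediately.
class DelayedReconnector {
public:
    using Clock = std::chrono::steady_clock;

    DelayedReconnector(std::chrono::milliseconds delay,
                       std::function<void()> reconnect)
        : m_delay(delay), m_reconnect(std::move(reconnect)) {
        assert(static_cast<bool>(m_reconnect));
    }

    // Call when the connection reports that it has been lost.
    void onDisconnected(Clock::time_point now) {
        m_due = now + m_delay;
        m_pending = true;
    }

    // Call when the connection is up again; cancels any pending attempt.
    void onConnected() { m_pending = false; }

    // Call periodically. Runs the reconnect callback once the delay has
    // elapsed and returns true if an attempt was made.
    bool poll(Clock::time_point now) {
        if (!m_pending || now < m_due) {
            return false;
        }
        m_pending = false;
        m_reconnect();
        return true;
    }

private:
    std::chrono::milliseconds m_delay;
    std::function<void()> m_reconnect;
    Clock::time_point m_due{};
    bool m_pending = false;
};
```

If an attempt fails and the connection reports a disconnect again, onDisconnected simply schedules the next attempt, so the retry interval is the delay plus whatever the connection timeout is.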
I also found a related issue which masks some of this connection behaviour. When the API is called repeatedly while disconnected, a large backlog of messages seems to build up waiting to be sent down the socket. In my application code, I check the connection state and discard the message if currently disconnected. In my testing this seems to allow for a faster reconnection, as there isn't a "backlog" of messages trying to swamp the switcher. I am running on a weak Raspberry Pi system, so this may not even be an issue for most users!
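The "discard while disconnected" workaround might look roughly like this. Again this is a standalone sketch in standard C++ under stated assumptions: CommandGate, its methods, and the string-based commands are hypothetical and only illustrate the idea of checking the connection state before handing a message to the sender.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical wrapper: forwards commands only while connected, so nothing
// queues up during an outage.
class CommandGate {
public:
    explicit CommandGate(std::function<void(const std::string&)> sink)
        : m_sink(std::move(sink)) {
        assert(static_cast<bool>(m_sink));
    }

    // Update from the application's connected/disconnected handlers.
    void setConnected(bool connected) { m_connected = connected; }

    // Returns true if the command was forwarded, false if it was discarded.
    bool send(const std::string& command) {
        if (!m_connected) {
            ++m_dropped;  // discard instead of queueing
            return false;
        }
        m_sink(command);
        return true;
    }

    std::size_t dropped() const { return m_dropped; }

private:
    std::function<void(const std::string&)> m_sink;
    bool m_connected = false;
    std::size_t m_dropped = 0;
};
```

Dropping is only appropriate for commands that are safe to lose; an application that must not lose a command would need to keep the latest desired state and resend it after reconnecting.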
Thanks again!
Hi, yes, you would have to reconnect on your own when the disconnected signal is emitted. I'll see if I can find a solution for your backlog problem.
I committed a possible fix for the backlog issue you had. Let me know if it fixes your problem.
If I read the code correctly, the way it is written, I cannot be connected to the switcher for more than 1 second at a time? This seems like kind of a silly limitation. Isn't it bad to be opening and closing connections that often?
No, the connection will stay up as the switcher will ping the client which will refresh the timer.
From what I've gathered with these switchers, it seems they continually ping all clients in a ping/pong protocol, regardless of whether any control signals are being sent to the switcher. I think this is why they have a hard limit on the number of connected clients: Skaarhoj has some figures of around 7 clients for a TVS and around 10 for the Production Studio models. It took me a while at first to wrap my head around this library, but when QAtemConnection receives any socket data from a switcher, its internal connection timer is reinitialised and started again. If the timer eventually runs out after X seconds, i.e. it hasn't received a ping for X seconds, the disconnected signal is emitted. What the library doesn't do is try to reconnect automatically; it is left up to you to connect the disconnected signal back to connectToSwitcher.
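The timer behaviour described above can be modelled in a few lines. This is a sketch in standard C++ rather than the library's actual Qt code, and ConnectionWatchdog and its methods are hypothetical names. It shows why a 1 second timeout does not limit a connection to 1 second: every received packet, pings included, pushes the deadline forward.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical model of a receive-refreshed connection timeout.
class ConnectionWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    ConnectionWatchdog(std::chrono::milliseconds timeout, Clock::time_point now)
        : m_timeout(timeout), m_lastData(now) {
        assert(timeout.count() > 0);
    }

    // Call for every datagram received from the switcher; restarts the timer.
    void dataReceived(Clock::time_point now) {
        m_lastData = now;
        m_connected = true;
    }

    // Call periodically. Returns true exactly once when the timeout elapses,
    // which is the point at which a disconnected signal would be emitted.
    bool poll(Clock::time_point now) {
        if (m_connected && now - m_lastData >= m_timeout) {
            m_connected = false;
            return true;
        }
        return false;
    }

    bool isConnected() const { return m_connected; }

private:
    std::chrono::milliseconds m_timeout;
    Clock::time_point m_lastData;
    bool m_connected = true;  // models a connection that was just established
};
```

With a 1000 ms timeout, data arriving at 900 ms moves the deadline to 1900 ms, so the watchdog only fires if the switcher stays silent for a full timeout period.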
My question then is why was the library not detecting any response from my switcher and disconnecting? When I last tried using this library I could get it to connect and adjust the AUX channels like I needed to, but it would always time out after 1 second. I'm running the original 1 M/E
Could you create a pcap in Wireshark or similar of a failed connection attempt?