SignalR-Client-Cpp icon indicating copy to clipboard operation
SignalR-Client-Cpp copied to clipboard

hub_connection::stop doesn't work reliably at end of process-life

Open ChristophAlbert opened this issue 8 years ago • 2 comments

I have an application which, at the end of it's lifetime, calls hub_connection::stop(), waits for the returned task to finish, and then exits the process.

It turned out that this doesn't work reliably - sometimes, the process exits without having called abort on the server. For an unsecured SignalR connection, abort is called most of the time (but not always); for secure SignalR connections, abort is usually not called (but sometimes).

I looked a little into the issue and I think it is because connection_impl::shutdown starts two tasks: request_sender::abort and m_transport::disconnect. Only the latter task is returned from the method and ultimately waited upon, while request_sender::abort is fire-and-forget.

This seems to explain why the issue is more prevalent with TLS connections: In order to call abort, a new TLS handshake is performed, which isn't finished within the process' lifetime.

As a fix, I'd suggest wrapping both tasks in a task_group and returning that. In this way, the client can still fire-and-forget-stop the connection, but also wait for it synchronously and reliably:

        auto abort_task = request_sender::abort(*m_web_request_factory, m_base_url, m_transport->get_transport_type(), m_connection_token,
            m_connection_data, m_query_string, m_signalr_client_config)
            .then([](pplx::task<utility::string_t> abort_task)
            {
                try
                {
                    abort_task.get();
                }
                catch (...)
                {
                    // We don't care about the result and even if the request failed there is not much we can do. We do
                    // need to observe the exception though to prevent from a crash due to unobserved exception exception.
                }
            });

        auto disconnect_task = m_transport->disconnect();

        return pplx::task<void>([abort_task, disconnect_task]()
        {
          pplx::task_group task_group;
          task_group.run([abort_task](){abort_task.wait(); });
          task_group.run([disconnect_task](){disconnect_task.wait(); });
          task_group.wait();
        });

ChristophAlbert avatar Feb 07 '17 10:02 ChristophAlbert

Interesting. Seems like stop() should wait for abort to finish/fail (and ignore the result) but dtor should be fire, forget and hope it gets through. The server will timeout inactive connections but it would be best not to rely on this behavior if not necessary - it's mostly a fallback for clients that suddenly disappear.

moozzyk avatar Feb 07 '17 17:02 moozzyk

Yes, the server will notice the connection is gone when the heartbeat is missing, but that takes about 50s (iirc), which is slightly inconvenient for automated tests. The behavior should be as you said.

ChristophAlbert avatar Feb 09 '17 08:02 ChristophAlbert