
How to stop server correctly?

Open Rowandjj opened this issue 6 years ago • 7 comments

WebSocket++ reports "Address already in use" when I try to recreate the server.

I use the following code to stop the server:

    m_endpoint.stop_listening();
    m_endpoint.stop();

Is this right?

Rowandjj avatar Mar 27 '19 07:03 Rowandjj

https://github.com/zaphoyd/websocketpp/issues/704

same problem here...

Rowandjj avatar Mar 27 '19 07:03 Rowandjj

> #704
>
> same problem here...

I don't know for sure (I've never needed to shut down the endpoint), but the FAQ seems to answer your question:

> For both, run websocketpp::endpoint::close or websocketpp::connection::close on all currently outstanding connections.

So the sequence for you seems to be:

    m_endpoint.stop_listening();
    m_endpoint.close();

ceztko avatar Mar 31 '19 20:03 ceztko

As @ceztko noted, there are detailed instructions on how to properly shut down a server in the FAQ (https://docs.websocketpp.org/faq.html) under "How do I cleanly exit an Asio transport based program".

In short: call stop_listening() on your endpoint to stop new connections from being accepted, then either wait until all existing connections close or manually close them with connection::close(). Note that WebSocket++ endpoints do not keep track of all outstanding connections, so there is no single command you can run to close all connections. You will need to keep track of the list of outstanding connections yourself and iterate over that list calling connection::close.

If you want a clean close, it is important that you do not call endpoint::stop() or the stop() method on the underlying Asio io_service. See the warning note in the FAQ entry for more details.

zaphoyd avatar Mar 31 '19 20:03 zaphoyd
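For concreteness, here is a minimal sketch of the shutdown sequence described above. The class name, handler wiring, and port are illustrative (not from this thread), and it assumes a single-threaded Asio server where shutdown() is invoked from a handler running on the io_service thread:

    #include <websocketpp/config/asio_no_tls.hpp>
    #include <websocketpp/server.hpp>
    #include <cstdint>
    #include <set>

    typedef websocketpp::server<websocketpp::config::asio> server;

    class tracked_server {
    public:
        tracked_server() {
            m_endpoint.init_asio();
            // The endpoint keeps no registry of connections, so maintain one here.
            m_endpoint.set_open_handler([this](websocketpp::connection_hdl hdl) {
                m_connections.insert(hdl);
            });
            m_endpoint.set_close_handler([this](websocketpp::connection_hdl hdl) {
                m_connections.erase(hdl);
            });
        }

        void run(uint16_t port) {
            m_endpoint.listen(port);
            m_endpoint.start_accept();
            m_endpoint.run(); // returns once the listener and all connections are gone
        }

        void shutdown() {
            websocketpp::lib::error_code ec;
            // 1) Stop accepting new connections.
            m_endpoint.stop_listening(ec);
            // 2) Ask every outstanding connection to close. As each close
            //    handshake completes, the io_service runs out of work and
            //    run() returns on its own; endpoint::stop() is never called.
            for (auto hdl : m_connections) {
                m_endpoint.close(hdl, websocketpp::close::status::going_away,
                                 "server shutdown", ec);
            }
        }

    private:
        server m_endpoint;
        std::set<websocketpp::connection_hdl,
                 std::owner_less<websocketpp::connection_hdl>> m_connections;
    };

Iterating m_connections here is safe because close() only queues the closing handshake; the close handler that erases from the set runs later, on a subsequent io_service iteration.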

> You will need to keep track of the list of outstanding connections yourself and iterate over that list calling connection::close.

This is something I was actually doing myself to handle CTRL+C and daemon shutdown.

> Note that WebSocket++ endpoints do not keep track of all outstanding connections

Having said that keeping track of your own connections and calling close on them voluntarily is the best design, how is it even possible that endpoints don't track them? Each connection will of course have a state machine that the endpoint must keep track of.

ceztko avatar Mar 31 '19 21:03 ceztko

WebSocket++ core was explicitly designed to work without a central registry of connections that could act as a bottleneck to scaling. Endpoints bootstrap and launch connections, after which they handle their own state machine, including cleaning up after themselves, without the endpoint ever communicating with them again. Connections are created, operated, and cleaned up in constant time no matter how many connections are outstanding.

Transports and memory allocation policies may introduce resources that can be shared across all connections like message pools or the Asio io_service. If those policies are in use then you trade the scaling benefits for the benefits of sharing. Pay for what you use and such.

zaphoyd avatar Mar 31 '19 21:03 zaphoyd

    m_endpoint.stop_perpetual();
    m_endpoint.stop();
    m_endpoint->join();

dq5070410 avatar Apr 28 '21 09:04 dq5070410
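That snippet looks like the perpetual-mode pattern typically used for client endpoints whose Asio loop runs on a background thread; note that join() belongs to the thread, not the endpoint. A sketch of that pattern under those assumptions (m_thread is an illustrative name):

    #include <websocketpp/config/asio_no_tls_client.hpp>
    #include <websocketpp/client.hpp>

    typedef websocketpp::client<websocketpp::config::asio_client> client;

    client m_endpoint;
    websocketpp::lib::shared_ptr<websocketpp::lib::thread> m_thread;

    void start() {
        m_endpoint.init_asio();
        // Keep run() from returning even while no connection is active.
        m_endpoint.start_perpetual();
        m_thread = websocketpp::lib::make_shared<websocketpp::lib::thread>(
            &client::run, &m_endpoint);
    }

    void stop() {
        // Let run() return once outstanding work drains...
        m_endpoint.stop_perpetual();
        // ...then join the I/O thread (not the endpoint).
        m_thread->join();
    }

As above, any still-open connections also need to be closed before run() will actually return.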

By default, you have to wait 100s (Linux) or 240s (Windows/BSD) before you can start listening again on the same socket. This is the by-design behavior of TCP/IP! (But wait for it...)

Correctly closed TCP/IP sockets enter the TIME_WAIT state in the TCP/IP state diagram, which lasts for 100 seconds (Linux), or 240 seconds on some platforms (BSD, Windows).

However, while a closed or closing socket lingers in TIME_WAIT or any of the CLOSING states, you cannot open a new listening port at the receiving address of the lingering socket.

The "address in use" error comes from the fact that the end-point binding of your listening socket is :80/:, and the end-point binding of the TIME_WAIT socket will be a bound version of that. e.g. 192.168.0.11:80/192.168.0.25:52312. Because the end-point bindings overlap, the "address is in use" until the lingering socket finally closes.

Ironically, abnormally terminated connections linger for half that long.

So by default, you cannot re-bind to the same port for 100 to 240s following the exit of the previous instance of your process.
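The failure is visible at the Asio layer, independent of WebSocket++. A minimal, self-contained reproduction sketch (the port number is arbitrary):

    #include <boost/asio.hpp>
    #include <iostream>

    int main() {
        namespace asio = boost::asio;
        asio::io_service ios;
        asio::ip::tcp::acceptor acceptor(ios);
        acceptor.open(asio::ip::tcp::v4());

        boost::system::error_code ec;
        acceptor.bind(asio::ip::tcp::endpoint(asio::ip::tcp::v4(), 9002), ec);
        if (ec == asio::error::address_in_use) {
            // Either a live listener or a lingering TIME_WAIT socket
            // bound to this port still owns the address.
            std::cout << "Address already in use\n";
        }
    }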

The Linux workaround is to set the SO_REUSEADDR socket option on the new accepting socket. In Asio, that would be:

    boost::asio::socket_base::reuse_address option(true);
    acceptor_.set_option(option);

This produces the behavior you would naively expect. You STILL get an "address in use" error if there is a socket actively listening on the same port, but you don't get an error if there are lingering client connections. The risk: there is a once-in-the-lifetime-of-the-universe sort of risk that a client or server may end up confusing a packet meant for an old connection with one meant for a new connection (or vice versa). Specifically, some retried packet from the old connection has to spend 50 (or 120) seconds wandering around the Internet before arriving while a connection gets opened on the new socket, AND the sequence numbers and client port number have to match. Not going to happen.
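For WebSocket++ specifically, you don't need to touch the acceptor yourself: the Asio transport endpoint exposes set_reuse_addr(), which (as far as I can tell) applies exactly this option when the acceptor is set up. It has to be called before listen():

    // Sketch: enable SO_REUSEADDR on a WebSocket++ Asio server endpoint.
    m_endpoint.set_reuse_addr(true); // must precede listen()
    m_endpoint.listen(9002);
    m_endpoint.start_accept();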

Unfortunately, this fix introduces a security risk on Windows platforms, where the reuse_address option does what you would naively expect it to do (and probably what BSD/Linux used to do before somebody decided to plug a gaping security issue): it allows other processes to open a listening socket on the SAME address while yours is open! Which process actually gets a particular connection request is arbitrary. (Presumably, both sockets get the request, and the client probably accepts whichever ACK arrives first!)

To get the desired behaviour, Windows has another socket option, SO_EXCLUSIVEADDRUSE, which specifies that when SO_REUSEADDR is also specified, other processes cannot open a second listening socket. You have to specify two flags to replicate the Linux behavior, one of them only available on Windows. I'm not sure how one specifies the platform-specific flag to do this in boost::asio (or whether boost::asio pre-emptively fixes the issue, since there is no conceivable case in which you would not want the SO_EXCLUSIVEADDRUSE behavior on a TCP connection).
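Asio has no named wrapper for SO_EXCLUSIVEADDRUSE, but one way, sketched here without having verified it across Boost versions (older Asio called the accessor native() rather than native_handle()), is to go through the acceptor's native handle:

    #ifdef _WIN32
    #include <winsock2.h>
    #include <boost/asio.hpp>

    // Sketch: apply SO_EXCLUSIVEADDRUSE (Windows only) via the raw socket.
    // Per the Windows SDK, it must be set before bind() to take effect.
    void set_exclusive_addr_use(boost::asio::ip::tcp::acceptor& acceptor) {
        BOOL exclusive = TRUE;
        ::setsockopt(acceptor.native_handle(), SOL_SOCKET, SO_EXCLUSIVEADDRUSE,
                     reinterpret_cast<const char*>(&exclusive), sizeof(exclusive));
    }
    #endif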

Quoting the Windows SDK documentation:

> Before the SO_EXCLUSIVEADDRUSE socket option was introduced, there was very little a network application developer could do to prevent a malicious program from binding to the port on which the network application had its own sockets bound. In order to address this security issue, Windows Sockets introduced the SO_EXCLUSIVEADDRUSE socket option, which became available on Windows NT 4.0 with Service Pack 4 (SP4) and later.

One assumes that, somewhere around the NT 4.0 timeframe, the very little network application developers could do was to convince Linux kernel developers to fix the shared-listening-port problem, while NT got stuck emulating the pre-fix UNIX behavior. (And AT&T Unix users just had to live with the fact that their webservers had to retry the listen call for 240 seconds on restart.) Remember, this is back in the dark ages of the Internet, when dragons still roamed the wild.

Go ahead and use the fix on Linux. It's safe. But if you must use it on Windows, you will need to check that a second listening socket on the same address cannot be used, and decide whether the actual security risk is any worse than some evil program just leaping on the listening address while your service is restarting. Which is pretty bad too.

rerdavies avatar Sep 04 '21 10:09 rerdavies