python-memcached

Added quit() method.

Open userrl opened this issue 12 years ago • 3 comments

Simply closing a connected socket leaves the local socket in TIME_WAIT while the OS waits to ensure that the remote end isn't going to transmit anything further. Memcached supports the 'quit' command, which tells the remote server to initiate the connection close instead. This pull request adds a quit() method to _Host that sends a quit command, waits for the remote end to close the connection, then closes the local socket (leaving behind nothing for the OS to clean up). The quit method on Client simply calls quit() on each of the connected servers (wasn't sure whether to name it quit() or quit_all(), so went with the simpler of the two).
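For anyone who wants to try the idea outside of python-memcached, here is a minimal sketch of the same handshake on a plain socket. It is not the code from this pull request: `quit_and_close` and `_fake_server` are hypothetical names, the fake server is only a stand-in for memcached, and the loop around recv() is an assumption on my part to discard any bytes that arrive before the close.

```python
import socket
import threading


def quit_and_close(sock):
    """Send 'quit', wait for the peer to close first, then close our end."""
    sock.sendall(b"quit\r\n")
    # recv() blocks until data or end-of-stream; b"" means the peer closed.
    while sock.recv(1024):
        pass  # discard anything sent before the close
    sock.close()


def _fake_server(listener):
    """Hypothetical stand-in for memcached: closes first when it sees 'quit'."""
    conn, _ = listener.accept()
    data = b""
    while not data.endswith(b"quit\r\n"):
        chunk = conn.recv(1024)
        if not chunk:
            break
        data += chunk
    conn.close()


# Listen on an ephemeral loopback port so no real memcached is needed.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server = threading.Thread(target=_fake_server, args=(listener,))
server.start()

client = socket.create_connection(listener.getsockname())
quit_and_close(client)
server.join()
listener.close()
```

Because the server side closes first, any TIME_WAIT entry should end up on the server's end of the connection rather than the client's.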

userrl avatar Aug 29 '13 21:08 userrl

I called self.send_cmd() to send the 'quit' command instead of calling self.socket.sendall(). The send_cmd() function automatically adds the '\r\n'.

The self.socket.recv() is intentional, due to how things work behind the scenes. Whichever end closes the connection first sends a FIN and eventually ends up in TIME_WAIT, since it might still receive delayed data that the remote end sent before it received the FIN. The other end receives the FIN, sends an ACK, and the OS puts the socket in CLOSE_WAIT because it knows it won't get anything after the FIN (the socket remains in CLOSE_WAIT until the program that opened it acknowledges the closure by closing the file descriptor, at which point the OS sends its own FIN).

If we send a 'quit' command and immediately close the socket (as your echo | nc example does), then both ends may think they're the first to close the socket and both can end up in TIME_WAIT. Instead, we have to send the 'quit' command, wait for the remote server to process it, and then close the connection. We don't disable socket blocking, so socket.recv() blocks until it gets data; when the OS receives a FIN and closes the connection, the socket library stops blocking and returns an empty string. The number 1 is meaningless - we could wait for 1 byte or 1024 bytes of data; since we're not expecting anything, what this line really means is "wait until the socket closes from the other end". Once that happens, we know that we're in CLOSE_WAIT, so when we close the socket in the next line, the OS should truly and completely remove the socket descriptor.
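To illustrate the point about the buffer size, here is a small sketch using a loopback TCP connection instead of memcached. The `connected_pair` helper is a hypothetical name introduced only for this example. Note that on Python 3 the end-of-stream value is the empty bytes object b"" rather than the empty str that Python 2 returns.

```python
import socket


def connected_pair():
    """Return (client, server_side) sockets joined over loopback TCP."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    listener.close()
    return client, server_side


results = []
for bufsize in (1, 1024):
    client, server_side = connected_pair()
    server_side.close()                    # the remote end closes first
    results.append(client.recv(bufsize))   # blocks until the FIN arrives
    client.close()

print(results)  # [b'', b'']
```

Both calls return the same empty value, so the argument to recv() only caps how much data could be returned; it does not change how long the call waits.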

Here's a step-by-step to show that the local socket really does close:

Set up connection

$ memcached &
$ ipython2
In [1]: import memcache

In [2]: mc = memcache.Client(['localhost'], 1)

Request something so that we actually connect; the result doesn't matter. netstat then shows that both the server and the client ends of the connection are active.

In [3]: mc.get_stats() == None
Out[3]: False

# netstat -ntp | grep 11211
tcp        0      0 127.0.0.1:11211         127.0.0.1:47722         ESTABLISHED 27199/memcached
tcp        0      0 127.0.0.1:47722         127.0.0.1:11211         ESTABLISHED 27778/python2

Send a quit command. I can't switch consoles and re-run the netstat command in the few milliseconds it takes for the command to be processed, so the states after steps 4 and 5 look identical. Note that the "server" (which closed the connection first) is sitting in FIN_WAIT2: it has sent a FIN and received an ACK, but hasn't received the FIN from the "client" yet.

In [4]: mc.servers[0].send_cmd('quit')

# netstat -ntp | grep 11211
tcp        0      0 127.0.0.1:11211         127.0.0.1:47722         FIN_WAIT2   -
tcp        1      0 127.0.0.1:47722         127.0.0.1:11211         CLOSE_WAIT  27778/python2

Proof that recv() returns an empty string once the remote end has closed the connection.

In [5]: mc.servers[0].socket.recv(1)
Out[5]: ''

# netstat -ntp | grep 11211
tcp        0      0 127.0.0.1:11211         127.0.0.1:47722         FIN_WAIT2   -
tcp        0      0 127.0.0.1:47722         127.0.0.1:11211         CLOSE_WAIT  27778/python2

Once the socket is closed, the "client" (ipython2) sends its FIN, the memcached server replies with an ACK, and only the server-side TIME_WAIT remains.

In [6]: mc.servers[0].close_socket()

# netstat -ntp | grep 11211
tcp        0      0 127.0.0.1:11211         127.0.0.1:47722         TIME_WAIT   -

userrl avatar Sep 03 '13 23:09 userrl

If this is Standard Operating Procedure for closing a socket, why isn't this in the socket library? I've never seen this recipe before.

linsomniac avatar Sep 04 '13 04:09 linsomniac

I don't know that it's Standard Operating Procedure; it's just the only operating procedure that I could find. Google "detect closed socket python" and you'll find quite a few discussions about how to detect when the remote end has closed a socket connection. There's quite a varied list of suggestions, but in the end they all seem to boil down to "listen on the socket and detect a zero-byte reply". I'm by no means an expert on socket communications. I hacked together this solution with a lot of googling and trial and error, so if you find a better way, please let me know.
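One variation on the "zero-byte reply" idea, offered only as a sketch and not something this pull request does: check for a closed peer without blocking and without consuming data, using select() plus a MSG_PEEK read. The name `peer_has_closed` is hypothetical, and the loopback sockets below stand in for a real memcached connection.

```python
import select
import socket


def peer_has_closed(sock, timeout=0.0):
    """Return True if the remote end has closed, without consuming any data."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False  # nothing pending, so the connection still looks open
    try:
        # MSG_PEEK leaves real data in the buffer; b"" means end-of-stream.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionResetError:
        return True


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

before = peer_has_closed(client)               # peer is still connected
server_side.close()
after = peer_has_closed(client, timeout=1.0)   # wait up to 1s for the FIN

client.close()
listener.close()
```

The blocking recv() in quit() is simpler when the only goal is to wait for the server to close; a check like this would mainly be useful for code that wants to test a pooled connection before reusing it.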

userrl avatar Sep 06 '13 20:09 userrl

Thank you for the PR, I have merged it.

linsomniac avatar Apr 16 '23 16:04 linsomniac