gunicorn
shutdown() socket before close()
This should prevent problems where writing a bunch of data to the socket before closing it causes data to be lost, because the data was not actually written before the socket and the TCP connection go away.
Fixes #840
For more information, see https://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
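The idea can be sketched as follows. This is an illustrative helper, not gunicorn's actual code; the function name `graceful_close` is made up for this example:

```python
import socket

def graceful_close(sock):
    """Signal end-of-stream before closing so buffered data is flushed.

    shutdown() sends a FIN after any queued outgoing bytes, whereas a
    bare close() may let the kernel discard data that was written but
    not yet transmitted.
    """
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # the peer may already be gone
    finally:
        sock.close()
```

With a local socketpair, data written before `graceful_close()` is still delivered to the peer, followed by a clean EOF.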
We already do this : https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/sync.py#L199
Same in the async worker. Which worker are you using? I would not put it there anyway, as we don't always want it; only apply the option where it's needed.
Thanks!
I've very recently tracked down something similar in gevent. Some platforms are extremely sensitive to this issue, others much less so. Adding OpenSSL on top of it makes things even worse.
If there's a chance that sock could be an ssl.SSLSocket, it's important to call sock.unwrap() before calling sock.shutdown(), because there's a layer of buffering there too that needs to be shut down; this is especially true on Windows with Python 3.7+, for whatever reason:
```python
import socket

def close(sock):
    if hasattr(sock, 'unwrap'):
        try:
            sock = sock.unwrap()
        except socket.error as e:
            if e.args[0] != 0:
                # Some platforms sometimes produce an exception with
                # errno = 0 here. Or maybe that's just a gevent thing.
                raise
    sock.shutdown(socket.SHUT_RDWR)
```
We already do this :
That's only in the event of an exception, though. In the regular handle() loop, once we've exhausted all the requests, we seem to just call util.close(sock). https://github.com/benoitc/gunicorn/blob/4bed9e7b192ae211b7d1f6c14065c373c9e1c0aa/gunicorn/workers/sync.py#L159
And there are places where we don't even do that, we just directly close it. https://github.com/benoitc/gunicorn/blob/4bed9e7b192ae211b7d1f6c14065c373c9e1c0aa/gunicorn/workers/sync.py#L141-L144
Yes, because in theory you shouldn't need to shut down the socket yourself, since we read it until the end... We used to do that (~2010), but for some reason it was removed.
I need to find the related commit to refresh myself about that.
@jamadden the behavior in the SSL snippet is expected: the socket is already closed there, and you only want to destroy the fd (at least in theory ;)
More generally speaking, I would prefer to handle the shutdown explicitly in the places where we are sure it's needed. I would accept a patch that does that.
Note: it's odd anyway that the error only happens on Heroku. I would be interested to know what internals there could trigger it.
One reason not to call shutdown (speaking from memory, I may have this completely wrong): By actively sending the FIN packet, it starts timers in intermediate network stacks and can result in a slow client not actually being able to read all the data that's theoretically been sent.
it's odd anyway that the error only happens on Heroku. I would be interested to know what internals there could trigger it.
Yeah, me too. When I was dealing with this (just today!) I could reproduce it 100% of the time in a Fedora 33 RawHide docker container, but not at all natively on macOS or in the manylinux2010 docker container; I've seen it intermittently on Ubuntu Xenial on Travis and possibly even on AppVeyor/Windows 10. It seems this has to do with quite low-level kernel networking parameters (buffer sizes?) — but that's on localhost, which behaves very differently than addresses that actually go across the network.
OK, I found the place where this shutdown was removed (a long time ago): https://github.com/benoitc/gunicorn/commit/615c6a692703d3f6376bcffca2c0cff147196a0c#diff-0b90f794c3e9742c45bf484505e3db8d
The commit log doesn't reflect the whole change, unfortunately. But if I remember correctly, shutdown is blocking and can take some time, which may have been an issue at that time.
Anyway, like I said, let's be conservative there. @erjiang can you possibly patch gunicorn by adding this shutdown in the place it's needed, instead of in this close function? It would be useful. (I would also add the change from @jamadden.)
The commit log doesn't reflect the whole change, unfortunately. But if I remember correctly, shutdown is blocking and can take some time, which may have been an issue at that time.
It makes sense that shutdown is blocking in order to wait for the rest of the data to be written. From my understanding, without shutdown(), there's no guarantee that the kernel won't simply close the socket and drop any remaining data. It may work 99% of the time, or in "most" configurations, but without it we are effectively saying, "close the socket and don't worry about sending out the rest of the data".
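This flush-then-close pattern is sometimes called a "lingering close": half-close the write side, then read until the peer signals EOF before dropping the socket. A minimal sketch under that assumption (the name `lingering_close` is illustrative, not a gunicorn API):

```python
import socket

def lingering_close(sock, timeout=2.0):
    """Half-close the write side, then drain the peer until EOF.

    SHUT_WR queues our FIN after all pending outgoing data, and reading
    until b'' gives the peer time to consume everything before we close.
    The timeout bounds how long a worker can be held up by a slow peer.
    """
    sock.settimeout(timeout)
    try:
        sock.shutdown(socket.SHUT_WR)
        while sock.recv(4096):   # discard anything the peer still sends
            pass
    except OSError:
        pass  # timeout or peer reset: give up and close anyway
    finally:
        sock.close()
```

The timeout is the important design choice here: it addresses the "shutdown is blocking" concern above by capping how long the worker waits.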
Anyways, I will investigate a bit more to see if there's a better place to call shutdown(), but it's not clear to me in which situations we would not want it. Maybe only error cases don't need shutdown()?
Anyways, I will investigate a bit more to see if there's a better place to call shutdown(), but it's not clear to me in which situations we would not want it. Maybe only error cases don't need shutdown()?
Yes it normally should be enough. I will investigate it tomorrow also. (it's late there)
One potential trigger of the issue is not fully reading the socket before the socket is closed. That would explain why it's so easy to trigger the issue in the sample code I posted - handling a POST in Flask without reading the POST body seems to trigger this problem. The relevant RFCs state that if there's data remaining in the recv buffer when close() is called, then RST should be sent instead of FIN.
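If unread input triggering an RST is indeed the cause, one mitigation is to drain the receive buffer (up to a bound) before closing. A hedged sketch, not gunicorn code; `drain_then_close` and its limit are made up for illustration:

```python
import socket

def drain_then_close(sock, limit=64 * 1024):
    """Read any unconsumed input before closing.

    If data remains in the receive buffer when close() is called, the
    stack may send RST rather than FIN, which can destroy response data
    still in flight to the peer. Draining first (up to a bounded limit,
    so a malicious client can't stall us) avoids that reset.
    """
    sock.setblocking(False)
    try:
        while limit > 0:
            chunk = sock.recv(4096)
            if not chunk:
                break
            limit -= len(chunk)
    except OSError:
        pass  # nothing more to read right now
    finally:
        sock.close()
```

This would cover the unread-POST-body case: the leftover body is consumed and discarded, so close() can send a normal FIN.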
Anyways, it seems like shutdown() is always needed. The only scenario that I can think of to not shutdown() is if there are multiple users of the socket, which I think you acknowledge in your old commit as "(trick from twisted)". But it doesn't look like the current code does that in sync.py, base_async.py, or gthread.py.
I updated my PR to include the sock.unwrap call, but left the code in util.py.
Hi! What are the next steps for this PR? :-)
There are some places where we call shutdown and close in the worker code. Should we change these to call this utility?
Never mind. I think we already do. I see a cleanup, potentially, but I'll investigate that after the merge.
Never mind. I think we already do. I see a cleanup, potentially, but I'll investigate that after the merge.
Care to elaborate? In its current state I wouldn't approve this change, since it doesn't explicitly shut down the socket only where it's needed. We don't want to block a worker. From my notes from that old time, the reasoning was that the application should take care of making sure the response is fully read before releasing the request (or we finish the iteration).
The PEP 3333 says:
When writes are done during the execution of an application object, the application can ensure that resources are released using a try/finally block. But, if the application returns an iterable, any resources used will not be released until the iterable is garbage collected. The close() idiom allows an application to release critical resources at the end of a request [..]
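The close() idiom from PEP 3333 looks like this in practice. A minimal sketch; the class and app names are invented for illustration:

```python
class IterableResponse:
    """PEP 3333 close() idiom: the server (e.g. gunicorn) must call
    close() on the returned iterable once the response is finished, so
    the application can release resources deterministically instead of
    waiting for garbage collection."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._chunks)

    def close(self):
        # Release request-scoped resources (file handles, DB cursors, ...)
        self.closed = True


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return IterableResponse([b"hello, ", b"world"])
```

A conforming server iterates the result, sends it, and then calls close() whether or not the iteration completed.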
In any case, make the closing asynchronous.
Revisiting this old PR. It looks like we currently do shutdown only when an exception gets raised. I'm guessing that's to ensure that headers get sent.
What's the expectation we want for applications? Do they need to somehow flush their own writes or should we be calling shutdown for them on a clean request exit?
No activity in a while. Closing; feel free to create a new ticket if needed.