
Support with-statement on response objects to improve protection against connection leaking

Open ml31415 opened this issue 9 years ago • 13 comments

As already briefly discussed in #805, there is still a scenario for leaking connections from the connection pool. It happens when a thread opens a response but then dies or finishes before reading from it, e.g. while doing some header parsing:

def timed_read(cp, i, timeout, host, port, release=None, preload=True):
    resp = cp.urlopen('GET', 'http://%s:%s' % (host, port),
                      release_conn=release,
                      preload_content=preload)
    print "%s opened" % i
    resp.headers['asdfasdf']  # raises KeyError, so the line below never runs
    data = resp.data          # never read, never released

And again there is the deadlock when opening connection no. 2:

----------  timeout=5 preload=False release=None ----------
1 opened
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "Schreibtisch/demo3.py", line 47, in timed_read
    resp.headers['asdfasdf']
  File "/usr/local/lib/python2.7/dist-packages/urllib3/_collections.py", line 151, in __getitem__
    val = self._container[key.lower()]
KeyError: 'asdfasdf'
 failed with KeyError

0 opened
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "Schreibtisch/demo3.py", line 47, in timed_read
    resp.headers['asdfasdf']
  File "/usr/local/lib/python2.7/dist-packages/urllib3/_collections.py", line 151, in __getitem__
    val = self._container[key.lower()]
KeyError: 'asdfasdf'
 failed with KeyError

Supporting the with-statement on response objects may help the user to avoid such situations.
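What such with-statement support could buy the user can be sketched with a stand-in response class (FakeResponse is purely illustrative, not urllib3 API): __exit__ guarantees the connection goes back to the pool even when user code raises mid-block, exactly the scenario above.

```python
# Minimal sketch of the proposed behaviour, assuming a hypothetical
# response class: release happens in __exit__, so a KeyError in the
# block can no longer leak the pooled connection.

class FakeResponse(object):
    def __init__(self):
        self.released = False

    def release_conn(self):
        self.released = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.release_conn()  # always give the connection back
        return False         # never swallow the user's exception

resp = FakeResponse()
try:
    with resp:
        raise KeyError('asdfasdf')  # e.g. the header lookup above
except KeyError:
    pass
print(resp.released)  # True: released despite the error
```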

ml31415 avatar Mar 01 '16 08:03 ml31415

So there are some complexities here. Firstly, if preload_content=True then you should not need to call resp.data to get the connection released. It should be released in urlopen, because preload_content=True forces us to read all the data from the network before we return the response to you.

So the issue only occurs with preload_content=False. The API does not clearly document this, but if preload_content=False and release_conn=None, or if just release_conn=False, then the user must call release_conn on the response when they're done with it.
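The discipline that rule implies can be sketched as follows: with preload_content=False, pair every urlopen with release_conn in a try/finally, so a failure between opening and reading cannot leak the slot. (careful_fetch is an illustrative helper, not part of urllib3.)

```python
# Hedged sketch of the manual pattern the current API requires when
# preload_content=False: release_conn runs on success and error alike.

def careful_fetch(pool, url):
    resp = pool.urlopen('GET', url, preload_content=False)
    try:
        return resp.read()   # may raise before the body is consumed
    finally:
        resp.release_conn()  # give the connection back regardless
```

With a real pool this would be used as e.g. `careful_fetch(urllib3.HTTPConnectionPool('example.com', maxsize=1, block=True), '/')`.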

Finally, note that to actually exhaust the connection pool you need to have set the connection pool block parameter to True: otherwise, it'll just manufacture new connections as needed and replace them. That's why leaking connections from urllib3 is generally not disastrous: it just creates temporary connections when more than 50 are needed.
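The block=True behaviour boils down to a bounded queue of connection slots, which the stdlib queue module illustrates well (this is a sketch of the semantics, not urllib3's internals):

```python
# With block=True the pool behaves like a bounded queue: once every
# slot is checked out and leaked, the next checkout hangs (or times
# out) instead of manufacturing a fresh connection.
import queue

slots = queue.Queue(maxsize=1)
slots.put('conn-1')           # pool starts with one free slot

conn = slots.get(block=True)  # first checkout succeeds...
try:
    # ...but the slot is never returned, so the next one blocks
    slots.get(block=True, timeout=0.1)
except queue.Empty:
    print('pool exhausted')   # this is the deadlock from the report
```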

The net of all that makes things somewhat tricky for us, because release_conn doesn't attempt to do much error handling. That means just calling release_conn doesn't help us much because we need to validate that the connection is in a suitable state to release.

The safe thing to do might be to have the context manager check whether it's being terminated because of an exception and, if it is, to also close the response (just to be safe). That way, we know that we've consumed all the data.

The other option is to add a flag on the response that tracks whether we believe all the data was read (e.g. by looking for the termination conditions in HTTPResponse.read()). That way we know whether it's safe to return the connection object to the pool.
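That second option could look roughly like this (TrackedResponse and its attributes are illustrative names, not urllib3 code): read() flags complete consumption, and __exit__ decides between releasing the connection and defensively closing it.

```python
# Sketch of option two: track whether the body was fully consumed, then
# release on clean exit and close when the connection state is unknown.

class TrackedResponse(object):
    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.fully_read = False
        self.outcome = None

    def read(self):
        data = b''.join(self._chunks)
        self._chunks = []
        self.fully_read = True         # termination condition observed
        return data

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.fully_read:
            self.outcome = 'released'  # connection is safe to reuse
        else:
            self.outcome = 'closed'    # defensively throw it away
        return False

with TrackedResponse([b'a', b'b']) as r:
    r.read()
print(r.outcome)   # released

try:
    with TrackedResponse([b'a']) as r2:
        raise KeyError('boom')  # died before reading, like the report
except KeyError:
    pass
print(r2.outcome)  # closed
```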

@shazow, @sigmavirus24, do you have opinions on which of those we do?

Lukasa avatar Mar 01 '16 10:03 Lukasa

Well, I'm exactly one of those (few) unlucky users who, as you said, use a blocking pool to limit the number of parallel connections to remote hosts, and who also don't always preload the data. Worse, all this runs in a long-running service, which had started to hang regularly for unclear reasons once I replaced my own throttling with a maxsize-limited connection pool in order to reduce some overhead...

Your two suggestions don't seem mutually exclusive to me. Why not return the socket to the queue only if it was completely read (and marked so by .read()) and no error has occurred so far?

ml31415 avatar Mar 01 '16 11:03 ml31415

Why not return the socket to the queue only if it was completely read (and marked so by .read()) and no error has occurred so far?

They don't have to be exclusive, but it's often best to think of them that way. For example, if we're already tracking whether all the data has been read, we know that the connection is in a good state and can be re-used, so we should return it unclosed to the pool when the block is exited, even if an error was encountered in user code. That user code almost certainly didn't make the connection invalid, so we should return it.

On the other hand, if we aren't tracking whether users completely read the data, then it's much harder to tell whether the socket is safe to return. It's reasonable to assume that if an exception forced us to exit the block, the socket is now invalid, because that exception may well have happened before the data was read from the socket. However, if no exception was hit, we could assume that the user obeyed the API and consumed all the data from the connection.

I also prefer the second: it's more obviously "right". However, even it has edge cases if the user works hard enough.

Lukasa avatar Mar 01 '16 11:03 Lukasa

Hmm, I guess it's unavoidable to track whether the data was read completely. As you said, it's hard to safely determine the state otherwise. And if an error occurred in user code, it usually happened before the data was read, as there is no point in staying in the block after the data has been read. So we can mostly assume that if an error happened in user code, the data was not read. But again, that's only an assumption, so the safe way would be to track the read progress.

ml31415 avatar Mar 01 '16 11:03 ml31415

Related reading: https://github.com/shazow/urllib3/issues/436

The other option is to add a flag on the response that tracks whether we believe all the data was read

I'm skeptical about how reliable we can be here. I'm getting the feeling we either need to exhaust-read the socket before returning it, or do so when we acquire it, just before making a new request?

We've wanted a context manager for this scenario for a while, so I'm definitely +1 to exploring this if this is a good excuse.

shazow avatar Mar 02 '16 01:03 shazow

Exhaust-reading on acquire may not be a good idea. Imagine a huge download that was interrupted, where no one cares about the remaining unread data anymore. Better to save the time and bandwidth and close the socket.

ml31415 avatar Mar 02 '16 02:03 ml31415

Would reading a small amount and closing if data is still coming in work?

shazow avatar Mar 02 '16 04:03 shazow

I'd have to play around a bit to see how that behaves, but the real issue is just that exhausting a socket is altogether too time consuming. I'm inclined to be conservative: if urllib3 believes the response has been exhausted then use it, otherwise we should defensively throw the connection away.

FWIW, we can prototype this and then see what happens in standard use cases by raising an INFO/WARNING-level log message when it happens.

Lukasa avatar Mar 02 '16 07:03 Lukasa

Yea, I agree that a full read wouldn't make sense, but a small read -> close if something is returned sounds like a reasonable thing to prototype.
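The probe could be sketched like this (probe and StubConn are illustrative stand-ins, not urllib3 code): attempt a tiny read before returning the connection; any leftover bytes mean the response wasn't exhausted, so the connection gets dropped.

```python
# Sketch of the "small read -> close" idea: peek for leftover body
# bytes; if any arrive, the connection state is dirty, so close it.

def probe(conn, probe_size=1):
    leftover = conn.recv(probe_size)  # tiny read, ideally non-blocking
    if leftover:
        conn.close()                  # unread data -> not reusable
        return 'closed'
    return 'reusable'                 # stream was exhausted

class StubConn(object):
    def __init__(self, pending):
        self.pending = pending
        self.closed = False
    def recv(self, n):
        return self.pending[:n]
    def close(self):
        self.closed = True

print(probe(StubConn(b'')))          # reusable
print(probe(StubConn(b'leftover')))  # closed
```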

shazow avatar Mar 02 '16 07:03 shazow

@shazow You volunteering to be guinea pig? ;)

Lukasa avatar Mar 02 '16 16:03 Lukasa

In terms of volunteering to build this? Happy to if there's a bounty/funding, otherwise I'm restricting my self-funded urllib3 time to just project management/code reviews/minor fixes for the near future. :)

shazow avatar Mar 02 '16 19:03 shazow

That's totally fair. In that case, I'll give it a shot tomorrow.

Lukasa avatar Mar 02 '16 19:03 Lukasa

🌸

shazow avatar Jun 09 '16 17:06 shazow