pyrax
Getting error: Connection pool is full, discarding connection
I just upgraded from python-cloudfiles to pyrax to get rid of the deprecation messages, but I ended up with another error. I googled it and found this: https://community.rackspace.com/developers/f/7/t/3412 But I can't see how that helps me, since the "error" seems to be in the swiftclient library (which in turn uses the requests library, where the error is raised).
The error occurs when trying to upload files using container.upload_file() or container.store_object()
filename: 'connectionpool.py'
lineno: 248
message: 'Connection pool is full, discarding connection: storage101.lon3.clouddrive.com'
pathname: '/redacted/env/local/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py'
Hmmm... are you uploading many, many files before this happens? Otherwise I don't see why you should be getting that error. Can you tell me more about what your script is doing when you run into this error? Also, can you post the output from pip freeze so I can see the various versions of modules that are installed?
Actually, I'm getting this too. It is not an exception, though; it's just swiftclient writing that message to the log. My script uploads a sequence of files, one after another. The warning appears with the very first upload and then again for each subsequent one. It doesn't disrupt the upload process, however.
Like @bazzilic says, it's a warning. But the warning floods the logs when doing a lot of uploads. We're uploading around 2,500 images per day, sometimes 4 in parallel (across 3 machines).
@EdLeafe code and pip freeze below:
import os
import pyrax

def upload_screenshot(filename, data, container, mime='image/png'):
    # `container` added as a parameter here; it was undefined in the snippet as posted
    pyrax.set_setting("identity_type", "rackspace")
    pyrax.set_credential_file(os.path.expanduser("~/.credentials_file"))
    conn = pyrax.cloudfiles
    cont = conn.get_container(container)
    obj = cont.store_object(obj_name=filename, data=data, content_type=mime)
    return obj
# pip freeze:
Babel==1.3
BeautifulSoup==3.2.1
Django==1.5
Fabric==1.4.3
PyXML==0.8.4
adspygoogle==1.1.8
adspygoogle.adwords==15.9.1
amqp==1.4.4
amqplib==1.0.2
anyjson==0.3.3
argparse==1.2.1
backports.ssl-match-hostname==3.4.0.2
billiard==3.3.0.16
celery==3.1.10
distribute==0.6.34
django-bootstrap-pagination==0.1.10
django-celery==3.1.10
django-extensions==1.0.2
django-model-utils==1.2.0
django-reversion==1.6.4
django-tastypie==0.9.14
django-tastypie-swagger==0.0.3
epydoc==3.0.1
flower==0.6.0
fpconst==0.7.2
gevent==1.0
google-api-python-client==1.2
greenlet==0.4.2
gunicorn==0.17.2
httplib2==0.8
ipython==0.13.1
iso8601==0.1.10
keyring==3.8
kombu==3.0.14
mimeparse==0.1.3
mock==1.0.1
os-diskconfig-python-novaclient-ext==0.1.2
os-networksv2-python-novaclient-ext==0.21
os-virtual-interfacesv2-python-novaclient-ext==0.15
pbr==0.8.0
prettytable==0.7.2
psycopg2==2.4.5
pycrypto==2.6
pygeoip==0.2.5
pyrax==1.8.1
python-cloudfiles==1.7.10
python-dateutil==2.1
python-memcached==1.48
python-novaclient==2.17.0
python-swiftclient==2.0.3
pytz==2014.2
rackspace-auth-openstack==1.3
rackspace-novaclient==1.4
raven==4.2.3
rax-default-network-flags-python-novaclient-ext==0.2.4
rax-scheduled-images-python-novaclient-ext==0.2.1
redis==2.6.2
requests==2.3.0
selenium==2.41.0
simplejson==3.4.1
six==1.6.1
ssh==1.7.14
suds==0.3.8
tornado==3.2
wsgiref==0.1.2
Hi, my office is the one that wrote the thread on the Rackspace developer forums. We are still having the same issue when uploading any file of any size to Rackspace. I wrote a test script which does nothing but take an 8 KB text file, get its checksum, and then call upload_file on that test file. The file uploads correctly (as do all of the uploads we perform that throw warnings), but it still throws the warning. As mentioned by others, it's more of a nuisance than anything, but it does flood our logs daily. Again, this happens even when uploading just a single file.
Has anyone made any progress on this? I've checked and the pool_connections and pool_maxsize are both set to 10 when the upload is occurring.
Do you also create a new connection with each upload? Unless you're going for parallelism, you should auth and get a reference to the pyrax.cloudfiles client at the beginning of a script, and then use that client for all of your uploads.
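Something along these lines, where the container and file names are just placeholders:

import pyrax

# Authenticate once at startup...
pyrax.set_setting("identity_type", "rackspace")
pyrax.set_credentials(username="redacted", api_key="redacted")
cf = pyrax.cloudfiles  # ...and keep this one client around

# ...then reuse the same client for every upload.
cont = cf.get_container("Testing")
for path in ("/tmp/one.png", "/tmp/two.png"):
    cf.upload_file(cont, path)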
Here's the entirety of the management command I wrote to reproduce the error. The text file that is uploaded is just an 8KB file of alphanumeric characters.
import pyrax
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    def handle(self, *args, **options):
        username = "redacted"
        api_key = "redacted"
        pyrax.set_setting("identity_type", "rackspace")
        pyrax.set_credentials(username=username, api_key=api_key)
        cf = pyrax.cloudfiles
        path1 = "/Users/redacted/test/test1.txt"
        chksum1 = pyrax.utils.get_checksum(path1)
        cf.upload_file("Testing", path1, etag=chksum1)
This single upload causes the warning to be thrown. Maybe I am misunderstanding the question, but all this is doing is authenticating once and then attempting a single upload.
@meferguson84 OK, I thought you were going through a bunch of files and creating a client for each file.
I've googled around, and it seems to be something in urllib3's connection pooling. You can see it here in the _put_conn() method as the connection is returned to the pool.
I don't understand how a single file upload would cause that to happen, though - unless the pool size was zero!
To be clear, we normally do upload a lot of files at random times. However, this particular script doesn't do that. I went into debug mode, stepped through to the actual upload point, and checked the HTTP connection. As I mentioned previously, _pool_connections and _pool_maxsize are both set to 10 at the time of upload. We are likewise perplexed as to how this could throw warnings when it's a single connection with a pool maxsize of 10.
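For reference, in plain requests those pool values come from the HTTPAdapter mounted on the Session; I don't see an obvious way to swap in a bigger one through pyrax/swiftclient, so this is only to illustrate where the numbers live:

import requests

session = requests.Session()
# pool_connections: number of per-host pools kept; pool_maxsize: idle
# connections retained per pool. A larger maxsize means returned
# connections are kept instead of being discarded with that warning.
adapter = requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=50)
session.mount('https://', adapter)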
Yep, I am having exactly the same issue. The environment is Ubuntu Server 12.04 LTS on a Rackspace cloud server.
OK, I dug a little deeper. The reason I'm not seeing these messages is that the logging in the requests module on my system doesn't have any handlers configured for it. I'm running a virtualenv with requests installed there, and the file in question is "~/.virtualenvs/pyrax/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py". I added some debug output to the _put_conn() method at lines 238-239, so that it reads as follows:
224 def _put_conn(self, conn):
225 """
226 Put a connection back into the pool.
227
228 :param conn:
229 Connection object for the current host and port as returned by
230 :meth:`._new_conn` or :meth:`._get_conn`.
231
232 If the pool is already full, the connection is closed and discarded
233 because we exceeded maxsize. If connections are discarded frequently,
234 then maxsize should be increased.
235
236 If the pool is closed, then the connection will be closed and discarded.
237 """
238 print "Handlers", log.handlers
239 print "Level", log.level
240 try:
241 self.pool.put(conn, block=False)
242 return # Everything is dandy, done.
243 except AttributeError:
244 # self.pool is None.
245 pass
246 except Full:
247 # This should never happen if self.block == True
248 log.warning(
249 "Connection pool is full, discarding connection: %s" %
250 self.host)
251
252 # Connection never got put back into the pool, close it.
253 if conn:
254 conn.close()
My thinking is that something in your systems is configuring logging for requests, and if so, you might be able to disable it by setting the level to logging.CRITICAL or something similar.
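For example, something like this should silence just that one logger (assuming the copy of urllib3 vendored inside requests, per the pathname above):

import logging

# Raise the threshold on the connectionpool logger so its WARNING-level
# "pool is full" messages are dropped; other loggers are unaffected.
logging.getLogger("requests.packages.urllib3.connectionpool").setLevel(logging.CRITICAL)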
@EdLeafe I added the same code to test it. The level returned was 0, which I assume means no level is set, but the handlers attribute returned an empty list, so it appears we have no handlers configured either. Our code is straight from pip (we have made no modifications to those site-packages). Do you have any suggestions for how we could configure it to disregard warnings for this particular circumstance only?
@meferguson84 I don't see how you could be getting warnings in your logs if the log object doesn't have any handlers. At most all you should see is the error message No handlers could be found for logger "connectionpool".
@EdLeafe Here's a screenshot of the global 'log' object at the time the warning is being written to the console:
The global 'logging' module has handlers defined, but not the global 'log' object.
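That would explain it: a logger with no handlers of its own still propagates its records up to the root logger, so anything that gives the root logger handlers (Django, raven, etc.) will emit the warning. A quick sketch of the effect:

import logging

logging.basicConfig(level=logging.WARNING)  # root logger now has a handler

log = logging.getLogger("requests.packages.urllib3.connectionpool")
print(log.handlers)  # [] ... the logger itself has no handlers
log.warning("Connection pool is full, discarding connection: example.com")
# ...yet the message still reaches the console, via propagation to root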
@EdLeafe So I did some more digging, and I wanted to check with you to see if this is normal behavior. I stepped into the code to the point where it mounts the adapter, just to play with the pool size and see if it made a difference. What I found was that the queue is full of NoneType objects, with the actual upload connection being the last item in the list. The list is 10 items long (which makes sense). What doesn't make sense is that the unfinished_tasks counter for the pool is 11. How can that be when the queue itself only holds 10 items? Also, is it normal for the queue to be full of NoneType objects, with the connection we are using being the last item on the list? Screenshot below for reference.

@meferguson84 Ah, that's very interesting. I've spent a ton of time fighting the swiftclient behavior regarding connections, which sometimes leaves them open, and other times closes them unexpectedly. I'm wondering if their recent change to using requests has improved that. Can you try modifying the _close_swiftclient_conn() method of pyrax/cf_wrapper/client.py to simply return without doing anything, and re-run your tests to see if all of those None objects are in the queue?
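In other words, roughly this (argument list elided):

# In pyrax/cf_wrapper/client.py, for this experiment only:
def _close_swiftclient_conn(self, *args, **kwargs):
    return  # skip closing the connection entirely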
@EdLeafe I commented out what was there and had the method return and the None objects were still in the queue.
@EdLeafe Do you believe this is a pyrax issue or a requests library issue? I can post on the requests GitHub repo if you think that might be helpful.
@meferguson84 OK, I pored through the requests code, and it looks like None objects are normal. If _get_conn() gets None from the pool, it simply creates a new connection. It seems odd, though, that it should start with all those None objects, and that _put_conn() isn't smart enough to replace None with the connection.
I've also noticed that when the swiftclient connection's put_object() code runs, requests first gets a connection from the pool, and when the response is received, puts it back by calling _put_conn(). All is well. But then put_object() calls resp.read(), and that method then releases the connection, which calls _put_conn() a second time. I'm not sure how to fix this; it seems that this is a bug in swiftclient and how it works with requests.
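You can actually reproduce the warning with a toy pool, without sending any request (the host name below is arbitrary):

import logging
from requests.packages.urllib3 import connectionpool

logging.basicConfig(level=logging.WARNING)

# maxsize=1, block=False mirrors a pool that can hold one idle connection.
pool = connectionpool.HTTPConnectionPool("example.com", maxsize=1, block=False)
conn = pool._get_conn()
pool._put_conn(conn)  # first return: the pool is full again
pool._put_conn(conn)  # second return of the same connection: Full is
                      # raised internally, and the pool logs
                      # "Connection pool is full, discarding connection"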
Did anyone manage to solve this problem? I am trying to retrieve data from different directories recorded in the db and sync them. See the code:
container = self.cf.get_container(app.container)
for source in Sources.select().where(Sources.status == 'ACTIVE'):
    local = source.path
    self.cf.sync_folder_to_container(local, container)
Is there any concern with calling sync_folder_to_container(local, container) multiple times?
@felixcheruiyot It seems to be a bug in swiftclient. I'm working on replacing the dependence on swiftclient, but that won't be ready for a few weeks, as it's a major change and it needs thorough testing.
@EdLeafe Thanks for checking this out. I am going to try to figure out when this was introduced to swiftclient, and pin it at that version.
@EdLeafe Thanks for working on a fix. It's much appreciated.
Thanks, guys, for moving on this.
Looks like there's some movement on the bug in python-swiftclient regarding this issue: https://bugs.launchpad.net/python-swiftclient/+bug/1295812
And now a patch, last updated September 4th:
https://review.openstack.org/#/c/116065/5