Authorization failure 31 hours into an upload_file call
Partway through uploading a 1.7 TB file using container.upload_file, it crashed with the following traceback:
Traceback (most recent call last):
  File "/global/homes/g/girving/pentago/web/to-rackspace", line 110, in <module>
    upload_all(paths)
  File "/global/homes/g/girving/pentago/web/to-rackspace", line 106, in upload_all
    upload(container,name,path)
  File "/global/homes/g/girving/pentago/web/to-rackspace", line 78, in upload
    obj = container.upload_file(path,name,etag=etag)
  File "/global/homes/g/girving/data/lib/python2.7/site-packages/pyrax-1.6.2-py2.7.egg/pyrax/cf_wrapper/container.py", line 168, in upload_file
    content_length=content_length)
  File "/global/homes/g/girving/data/lib/python2.7/site-packages/pyrax-1.6.2-py2.7.egg/pyrax/cf_wrapper/client.py", line 70, in _wrapped
    ret = fnc(self, *args, **kwargs)
  File "/global/homes/g/girving/data/lib/python2.7/site-packages/pyrax-1.6.2-py2.7.egg/pyrax/cf_wrapper/client.py", line 821, in upload_file
    upload(ff, content_type, etag, headers)
  File "/global/homes/g/girving/data/lib/python2.7/site-packages/pyrax-1.6.2-py2.7.egg/pyrax/cf_wrapper/client.py", line 787, in upload
    response_dict=extra_info)
  File "/global/homes/g/girving/data/lib/python2.7/site-packages/python_swiftclient-1.8.0-py2.7.egg/swiftclient/client.py", line 1233, in put_object
    response_dict=response_dict)
  File "/global/homes/g/girving/data/lib/python2.7/site-packages/python_swiftclient-1.8.0-py2.7.egg/swiftclient/client.py", line 1110, in _retry
    rv = func(self.url, self.token, *args, **kwargs)
  File "/global/homes/g/girving/data/lib/python2.7/site-packages/python_swiftclient-1.8.0-py2.7.egg/swiftclient/client.py", line 922, in put_object
    http_response_content=body)
swiftclient.exceptions.ClientException: Object PUT failed: https://storage101.iad3.clouddrive.com:443/v1/MossoCloudFS_d826dbcf-478e-4780-af9b-b4b987302246/pentago-edison-all/slice-18.pentago.285 401 Unauthorized [first 60 chars of response] <html><h1>Unauthorized</h1><p>This server could not verify t
By the time it crashed, it had uploaded 1.4 TB worth of chunks over 31 hours.
Unfortunately, I now have to write some extra code to upload only the missing chunks (and then upload the manifest manually), so I'm unlikely to try to reproduce this with the same basic pyrax function. Therefore, please feel free to close this immediately unless something obvious comes to mind that might help others in the future.
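In case it helps anyone else in the same bind, the repair code I have in mind looks roughly like this. It's an untested sketch: it assumes the name.N segment naming visible in the traceback above and 5 GB segments, and it reads each segment fully into memory, which a real script should avoid.

# Untested sketch: upload only the missing segments, then the manifest by hand.
import os
import pyrax

SEGMENT = 5 * 2**30  # 5 GB, the Cloud Files single-object size limit

pyrax.set_setting("identity_type", "rackspace")
pyrax.set_credential_file(os.path.expanduser("~/.pyrax.cfg"))
cf = pyrax.cloudfiles

container = cf.get_container("pentago-edison-all")
# full_listing asks for the complete listing rather than the first 10,000
# names; check that your pyrax version supports this keyword.
existing = set(container.get_object_names(full_listing=True))

path = "slice-18.pentago"
size = os.path.getsize(path)
with open(path, "rb") as f:
    for i in range((size + SEGMENT - 1) // SEGMENT):
        name = "%s.%d" % (path, i)  # assumes pyrax's name.N segment naming
        if name in existing:
            continue
        f.seek(i * SEGMENT)
        # Reads a whole 5 GB segment into memory; a real script should stream.
        cf.store_object(container, name, f.read(SEGMENT))

# The manifest is a zero-byte object whose X-Object-Manifest header names the
# segment prefix; Swift then serves the concatenation of the matching objects
# in lexicographic order (so beware unpadded segment numbers). This reaches
# into pyrax's underlying swiftclient connection, an implementation detail.
cf.connection.put_object("pentago-edison-all", path, contents="",
        headers={"X-Object-Manifest": "pentago-edison-all/%s." % path})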
Although this side-steps the issue, which still needs to be looked at and addressed: would it be feasible to split the file into smaller ones, say 4 × ~450 GB each?
I could do that, but I doubt that's the problem; 1.4 TB seems an unlikely limit. I expect that if I upload the chunks myself and then create the manifest myself, it'll work fine.
@girving Oh, I agree. I don't think file size is the issue. But the auth tokens expire after ~24 hours, and for some reason the token isn't being refreshed at some point. My suggestion above is a work-around, again, trying to side-step the issue and get you out of the bind.
The line in pyrax that failed is simply the one that uploads the manifest. In other words, the entire file has been uploaded. All you need to do is re-create the manifest and upload that.
I am concerned that it failed because of authentication - that's clearly not correct behavior.
No, the entire file hasn't been uploaded. It was supposed to be around 350 chunks, and only 284 were uploaded (and the last chunk is still exactly 5 GB).
Then there's an even bigger problem. The call to upload the manifest happens only after all the segments have been uploaded.
Let me know if you want me to run any further experiments to diagnose the problem, though I'm not sure if any are practical.
My full script is here, though in the end it's really just an overly large wrapper around the upload_file call:
https://github.com/girving/pentago/blob/31b369775d18174babc5e2153af28acbc04d0ee7/web/to-rackspace
My .pyrax.cfg file is:
[rackspace_cloud]
identity_type = rackspace
username = pentago
api_key = <hex>
How are you authenticating initially? I'm asking because the ~/.pyrax.cfg file is for configuration settings, not your credentials. If you want to store your credentials in a file, they belong in a separate file, named whatever you like, since they contain sensitive data.
If I had to guess, the token is expiring and pyrax isn't refreshing the auth token on these types of errors.
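For reference, the split looks something like this; the credentials file name below is just an example:

import os
import pyrax

# ~/.pyrax.cfg keeps only configuration settings (identity_type, region, ...).
# The credentials live in a separate file -- ~/.rackspace_credentials here is
# an arbitrary example name -- containing only:
#
#   [rackspace_cloud]
#   username = pentago
#   api_key = <hex>
#
pyrax.set_setting("identity_type", "rackspace")
pyrax.set_credential_file(os.path.expanduser("~/.rackspace_credentials"))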
@girving - Are the files usually pretty large? Would you be able to upload the segments concurrently? There's another tool named swiftly, which is unofficial, but it can upload the segments concurrently and create the manifest. It's written by an OpenStack Swift/CloudFiles developer.
From the command-line client, this is how I use it:
# test is the container here, path is the key/object/filename
# I'm splitting files at 1024 bytes just as a demo of chunking
swiftly -v put -s 1024 -i /tmp/bigfile /test/path
If you're interested, I could write some code to use swiftly to upload objects for you. It would require both swiftly and eventlet.
> If I had to guess, the token is expiring and pyrax isn't refreshing the auth token on these types of errors.
That shouldn't happen. All 401s are caught, and credentials are re-sent in order to get a new token. If it fails a second time, that means that the credentials aren't valid.
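That said, for a transfer that runs past the token lifetime it's cheap to add your own guard. An untested sketch of one way to do it:

import pyrax
from swiftclient.exceptions import ClientException

def upload_with_reauth(container, path, **kwargs):
    # Untested sketch: force re-authentication and retry once if a 401
    # somehow escapes pyrax's own handling.
    try:
        return container.upload_file(path, **kwargs)
    except ClientException as e:
        if e.http_status != 401:
            raise
        pyrax.identity.authenticate()  # request a fresh token
        # Caveats: whether the cloudfiles client picks up the new token may
        # depend on the pyrax version, and upload_file restarts from the
        # beginning, so this only helps for smaller files.
        return container.upload_file(path, **kwargs)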
I'm authenticating using the .pyrax.cfg file; the <hex> above stands in for the actual hexadecimal API key.
@rgbkrk: Thanks for the link. Unfortunately, it doesn't look like the swiftly client has an option to not upload chunks that already exist, so I can't use it in this case.
I'm concerned that you're confusing the configuration file with the credential file.
So you're passing your configuration file to the set_credential_file() call?
Yes, I am passing ~/.pyrax.cfg to set_credential_file; presumably that works because its [rackspace_cloud] section with username and api_key is exactly the format set_credential_file expects. The file has permissions 600, so I believe this is fine (though apparently not standard practice).