amazon-glacier-cmd-interface
Download Fails with IncompleteRead Error
Trying to download a fairly large (150 GB+) archive. It always fails, but at different points, with an error like the following. Halp! My data is stuck in the Glacier!
caleb@krige:~/amazon-glacier-cmd-interface$ glacier-cmd -c ~/glacier.conf download --overwrite --outfile foo.tar backups BIGHASHTHINGIE
Traceback (most recent call last):
File "/usr/local/bin/glacier-cmd", line 9, in <module>
load_entry_point('glacier==0.2dev', 'console_scripts', 'glacier-cmd')()
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 929, in main
args.func(args)
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 156, in wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/glacier.py", line 267, in download
out_file_name=args.outfile, overwrite=args.overwrite)
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
ret = fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 231, in glacier_connect_wrap
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
ret = fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 252, in sdb_connect_wrap
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 65, in wrapper
ret = fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/glacier-0.2dev-py2.7.egg/glacier/GlacierWrapper.py", line 1373, in download
data = response.read()
File "/usr/local/lib/python2.7/dist-packages/boto-2.8.0-py2.7.egg/boto/glacier/response.py", line 48, in read
return self.http_response.read(amt)
File "/usr/local/lib/python2.7/dist-packages/boto-2.8.0-py2.7.egg/boto/connection.py", line 409, in read
self._cached_response = httplib.HTTPResponse.read(self)
File "/usr/lib/python2.7/httplib.py", line 548, in read
s = self._safe_read(self.length)
File "/usr/lib/python2.7/httplib.py", line 649, in _safe_read
raise IncompleteRead(''.join(s), amt)
httplib.IncompleteRead: IncompleteRead(3145728 bytes read, 13631488 more expected)
This is an httplib error, which means it is most likely a communication problem between your system and Glacier. I have no idea what the cause could be; unfortunately the Python docs give no information whatsoever on this exception.
To get your data out, try resuming the download instead of downloading the whole thing all over again. That should eventually get everything out.
Hi there. Thanks for the speedy reply!
--resume (which is part of the parallel uploads branch in your fork) hasn't been merged into this repo yet. I tried that branch out anyway, hoping it would work, but it seems fairly unstable (I got an error about a symbol "MB" being undefined, for instance). Any idea when this feature will get merged into HEAD?
Also, it seems like a better solution would be for GlacierWrapper to catch the IncompleteRead exception and retry just the failed chunk. Or am I misunderstanding something?
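To illustrate what I mean, here is a rough sketch of the retry idea (not actual GlacierWrapper code, and written against Python 3's http.client for brevity): read the archive in ranged chunks and re-request any chunk that dies with IncompleteRead, instead of aborting the whole download. `fetch_range` is a hypothetical callable standing in for a ranged Glacier job-output GET; everything else is standard library.

```python
from http.client import IncompleteRead


def download_with_retries(fetch_range, total_size, chunk_size=1 << 20, max_retries=3):
    """Return the full payload, retrying each chunk on IncompleteRead.

    fetch_range(start, end) is assumed to return bytes for the inclusive
    byte range [start, end] -- e.g. a ranged GET against a Glacier job.
    """
    parts = []
    offset = 0
    while offset < total_size:
        end = min(offset + chunk_size, total_size) - 1
        for attempt in range(max_retries):
            try:
                parts.append(fetch_range(offset, end))
                break
            except IncompleteRead:
                # The simple approach: throw away the partial chunk and
                # re-request the whole range. (A smarter version could
                # keep the partial bytes and resume mid-chunk.)
                if attempt == max_retries - 1:
                    raise
        offset = end + 1
    return b"".join(parts)
```

With chunks of a few MB, a transient connection hiccup would then cost one chunk's worth of re-download rather than 150 GB.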