bigbang
BadStatusLine while collecting examples/urls.txt from Mailman
I ran this command from the bigbang environment:
(bigbang)~/bigbang dsg$ python bin/collect_mail.py -f examples/urls.txt
After successfully retrieving scores of archives, I received the following error message:
File "/Users/dsg/anaconda/envs/bigbang/lib/python2.7/httplib.py", line 373, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
Full Traceback: https://gist.github.com/danielsgriffin/8cdc1d151ded54b8a3bf#file-gistfile1-txt
Note that when the collect_mail script breaks like this, it does not uncompress the archive files it has already downloaded. This is not an optimal failure mode.
BadStatusLine indicates that the server did not send a valid HTTP status line in response to the request. Here the line is empty (''), so it seems the server closed the connection without sending anything. This might be triggered by the high number of requests being made in rapid succession, in which case throttling the request rate and retrying failed requests would probably raise the success rate. It's not really possible to ensure that it always works, so more graceful handling of failed downloads would still be appropriate IMO.
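To make the suggestion concrete, here is a rough sketch of throttling plus retrying, with failures collected instead of aborting the whole run. It is not the actual collect_mail.py code: `fetch_with_retry`, `collect_all`, and the `fetch` callable (whatever function downloads one archive URL) are hypothetical names, and the retry count and delays are guesses that would need tuning against the Mailman server.

```python
import time

try:
    from httplib import BadStatusLine  # Python 2, as in the traceback
except ImportError:
    from http.client import BadStatusLine  # Python 3


def fetch_with_retry(fetch, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on connection errors.

    `fetch` is a hypothetical stand-in for whatever downloads one archive.
    """
    for attempt in range(retries):
        try:
            return fetch(url)
        except (BadStatusLine, IOError):
            if attempt == retries - 1:
                raise  # out of attempts, let the caller decide what to do
            sleep(delay * 2 ** attempt)  # wait 1x, 2x, 4x ... the base delay


def collect_all(fetch, urls, pause=0.5, sleep=time.sleep, **retry_kwargs):
    """Fetch every URL, pausing between requests and recording failures."""
    results, failed = {}, []
    for url in urls:
        try:
            results[url] = fetch_with_retry(fetch, url, sleep=sleep,
                                            **retry_kwargs)
        except (BadStatusLine, IOError) as err:
            failed.append((url, err))  # remember the failure and keep going
        sleep(pause)  # throttle so requests are not sent in rapid succession
    return results, failed
```

With something like this, the script could still uncompress everything in `results` and print the `failed` list at the end, rather than stopping at the first bad response.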