ArchiveBot
Failing a job does not remove it from the queue
This just happened on pipeline:a519b67335426a5c4296e4df9049d7d5:
Starting CheckIP for Item
Checking IP address.
Finished CheckIP for Item
Starting GetItemFromQueue for Item
Received item 3jaycce28r7ws0rjznlaw15a0.
Starting StartHeartbeat for Item
Finished StartHeartbeat for Item
Starting SetFetchDepth for Item
Finished SetFetchDepth for Item
Starting PreparePaths for Item
Finished PreparePaths for Item
Starting WriteInfo for Item
Finished WriteInfo for Item
Starting DownloadUrlFile for Item
Starting WgetDownload for Item
Failed WgetDownload for Item
Traceback (most recent call last):
  File "/home/archivebot/.local/lib/python3.5/site-packages/seesaw/pipeline.py", line 61, in _enqueue_with_except
    task.enqueue(item)
  File "/home/archivebot/.local/lib/python3.5/site-packages/seesaw/externalprocess.py", line 189, in enqueue
    self.process(item)
  File "/home/archivebot/.local/lib/python3.5/site-packages/seesaw/externalprocess.py", line 197, in process
    args=realize(self.args, item),
  File "/home/archivebot/.local/lib/python3.5/site-packages/seesaw/config.py", line 27, in realize
    return v.realize(item)
  File "/home/archivebot/ArchiveBot/pipeline/archivebot/seesaw/wpull.py", line 123, in realize
    self.warc_max_size)
  File "/home/archivebot/ArchiveBot/pipeline/archivebot/seesaw/wpull.py", line 55, in make_args
    if item['url'].startswith("http://www.reddit.com/") or \
AttributeError: 'NoneType' object has no attribute 'startswith'
Waiting 10 seconds...
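3jaycce28r7ws0rjznlaw15a0 was an !ao < job that I had manually failed before the pipeline was running. The pipeline nevertheless received it from the queue and then crashed in WgetDownload because item['url'] was None. It looks like job.fail doesn't remove the job from the queue.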
More specifically, it looks like job.fail only removes the job from the standard queue but not from pending-ao, pending-large, or any of the --pipeline queues (pending:*).
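
To make the expected behaviour concrete, here is a minimal Python sketch of what "remove the job from every queue" could look like. It is not ArchiveBot's actual job.fail code. It assumes the queues are Redis lists, that the standard queue is named "pending" (the other names come from this report), and it uses a small in-memory stand-in, the hypothetical FakeRedis, that mimics the lrem and scan_iter calls of a redis-py client so the example runs without a server.

```python
import fnmatch

# "pending-ao", "pending-large" and "pending:*" come from the report above.
# "pending" as the name of the standard queue is an assumption.
FIXED_QUEUES = ("pending", "pending-ao", "pending-large")
PIPELINE_QUEUE_PATTERN = "pending:*"


class FakeRedis:
    """Hypothetical in-memory stand-in with redis-py-shaped lrem/scan_iter."""

    def __init__(self, lists):
        self.lists = {name: list(items) for name, items in lists.items()}

    def lrem(self, name, count, value):
        # Only the count == 0 case (remove every occurrence) is modelled here.
        items = self.lists.get(name, [])
        kept = [entry for entry in items if entry != value]
        self.lists[name] = kept
        return len(items) - len(kept)

    def scan_iter(self, match=None):
        # Yield key names matching a glob pattern, like SCAN with MATCH.
        for name in list(self.lists):
            if match is None or fnmatch.fnmatchcase(name, match):
                yield name


def remove_from_all_queues(client, ident):
    """Remove ident from the fixed queues and every per-pipeline queue.

    Returns the total number of queue entries removed.
    """
    removed = 0
    for queue in FIXED_QUEUES:
        removed += client.lrem(queue, 0, ident)
    for queue in client.scan_iter(match=PIPELINE_QUEUE_PATTERN):
        removed += client.lrem(queue, 0, ident)
    return removed
```

Calling remove_from_all_queues(client, "3jaycce28r7ws0rjznlaw15a0") when failing a job would keep a pipeline from later dequeuing it. Scanning for pending:* avoids having to know which pipelines currently have their own queue.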