oai-harvest

Harvester Timed Out

Open bwagerson opened this issue 5 years ago • 3 comments

The harvester timed out at about 677,989 of 1,500,000 items while trying to harvest all of arXiv.org. Is there a way to pick the harvest back up where it timed out, instead of starting from the beginning?

bwagerson avatar Mar 20 '19 15:03 bwagerson

Hi. See #22 for some conversation around this.

bloomonkey avatar Mar 21 '19 11:03 bloomonkey

Sorry, but is the definitive answer to provide the resumption token? How would we get that from oai-harvest?

ericywl avatar Apr 12 '19 12:04 ericywl

Yes, the resumptionToken is the only mechanism in OAI-PMH for resuming a previous harvesting run. I'm not sure how you'd get access to the necessary token though, as it's probably only used internally by oaiharvest to retrieve the next chunk 🤔. Maybe it could store the token in a local file called e.g. .resumptionToken and use this as a default for the -r option if no value is provided...

bloomonkey avatar Apr 12 '19 14:04 bloomonkey
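
For anyone who wants to experiment with the checkpoint idea suggested above, here is a rough standalone sketch in Python. It is not how oai-harvest works internally and does not use its code. The names `harvest`, `http_fetch`, and `TOKEN_FILE` are made up for illustration, and `oai_dc` is only an assumed default metadata prefix. It saves the most recent resumptionToken to a local file after each page and, on the next run, starts from that token if the file exists.

```python
import os
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
TOKEN_FILE = ".resumptionToken"  # file name taken from the suggestion above


def http_fetch(base_url, params):
    """Perform one OAI-PMH request and return the raw response body."""
    with urlopen(f"{base_url}?{urlencode(params)}") as response:
        return response.read()


def harvest(base_url, metadata_prefix="oai_dc", token_file=TOKEN_FILE,
            fetch=http_fetch):
    """Yield <record> elements, checkpointing the token after each page."""
    token = None
    if os.path.exists(token_file):
        # A previous run stopped early: continue from its last saved token.
        with open(token_file) as fh:
            token = fh.read().strip() or None

    while True:
        if token:
            # resumptionToken is an exclusive argument: send it with the verb only.
            params = {"verb": "ListRecords", "resumptionToken": token}
        else:
            params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}

        root = ET.fromstring(fetch(base_url, params))
        yield from root.iter(f"{OAI_NS}record")

        token_el = root.find(f".//{OAI_NS}resumptionToken")
        token = (token_el.text or "").strip() if token_el is not None else ""
        if not token:
            # A missing or empty token marks the last page: clear the checkpoint.
            if os.path.exists(token_file):
                os.remove(token_file)
            return

        # Save the token only after the whole page was consumed, so an
        # interrupted run re-fetches the page it was in the middle of.
        with open(token_file, "w") as fh:
            fh.write(token)
```

You would loop over `harvest("https://example.org/oai")` (a placeholder URL) and write each record out yourself. Two caveats: repositories are allowed to expire resumption tokens, so a saved token may be rejected if you wait too long before resuming, and this sketch does not check for OAI-PMH `<error>` responses such as `badResumptionToken`.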