
OAI toolchain: provide options for rate limiting

Open mjordan opened this issue 7 years ago • 3 comments

The OAI-PMH harvester MIK uses, https://github.com/caseyamcl/phpoaipmh, doesn't provide any built-in rate-limiting options such as pausing between requests, but its README points out that https://github.com/guzzle/retry-subscriber can be used for this sort of thing. We should add rate-limiting options so brittle OAI-PMH providers don't crash as often when we point MIK at them.

mjordan avatar Mar 07 '17 23:03 mjordan

@bondjimbond can we close this since we have #344?

mjordan avatar Mar 22 '17 16:03 mjordan

The fix for #344 is related but different - but perhaps good enough for now? Are there other use cases for rate limiting specifically that aren't covered in #344?

bondjimbond avatar Mar 22 '17 18:03 bondjimbond

Fair enough - but AFAIK implementing this will be non-trivial since we'll need to dig into the OAI-PMH harvester. Let's leave it open in case someone gets the time to take a closer look.

I can't think of a more specific use case for rate limiting on the OAI harvest other than not swamping the OAI provider. As I understand it, though, the whole reason they included resumption tokens was to allow for a built-in rate limit, or at least a limit on how many records are returned at one time. Fetching the files is outside the scope of OAI-PMH, so I don't think that was a consideration when they thought about resumption tokens.
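
For whoever picks this up, here is a rough sketch of what a pause between resumption-token requests could look like. It is in Python rather than PHP and does not use phpoaipmh or MIK's actual code: `harvest`, `fetch_page`, and `delay_seconds` are all hypothetical names, and `fetch_page` stands in for whatever issues the ListRecords request and returns the records plus the next resumption token.

```python
import time
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page fetcher (an assumption, not a phpoaipmh or MIK API):
# takes a resumption token (None for the first request) and returns
# (records, next_token). An empty or None next_token means the list is complete.
FetchPage = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]


def harvest(
    fetch_page: FetchPage,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield records page by page, pausing between successive requests."""
    token: Optional[str] = None
    while True:
        records, token = fetch_page(token)
        yield from records
        if not token:
            # No resumption token: the provider has no more pages.
            break
        # Wait before asking for the next page so the provider is not swamped.
        sleep(delay_seconds)
```

The `sleep` parameter is only there so the delay can be stubbed out in tests. A configurable `delay_seconds` would cover the "don't swamp the provider" case; retrying failed requests is a separate concern that this sketch does not address.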

mjordan avatar Mar 22 '17 19:03 mjordan