data.gov
Create Clear for Harvest2.0 Harvest source
User Story
In order to be able to reset/restart a harvest source, data.gov admins want a clear function/API route.
Acceptance Criteria
- [ ] GIVEN the `/harvest/{id}/clear` API route is created
WHEN I call the harvest API at `/harvest/{id}/clear`
THEN the datasets are removed from CKAN
AND the dataset records/errors/jobs are removed from the harvest DB
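For demoing the criteria above, a call to the route could be built roughly as follows. This is only a sketch: the issue does not specify the HTTP method or the auth scheme, so `POST`, the bearer token, the host name, and the helper name `build_clear_request` are all assumptions.

```python
import urllib.parse
import urllib.request


def build_clear_request(base_url: str, source_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request to the clear route.

    Assumptions, not from the issue: the route accepts POST and
    authenticates with a bearer token.
    """
    # Quote the id so unusual characters cannot alter the path.
    safe_id = urllib.parse.quote(source_id, safe="")
    url = f"{base_url.rstrip('/')}/harvest/{safe_id}/clear"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder host, id, and token, for illustration only.
req = build_clear_request("https://harvest.example.gov/", "1234", "example-token")
print(req.get_method(), req.full_url)
# prints "POST https://harvest.example.gov/harvest/1234/clear"
```

Sending it would be `urllib.request.urlopen(req)`; after a successful call, the THEN/AND lines can be checked against CKAN and the harvest DB.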
Background
Very helpful for testing, and occasionally useful for resetting a harvest source that has become corrupted or out of sync. Similar to CKAN clear functionality.
Security Considerations (required)
The route should require authentication; no additional security measures are needed.
Sketch
Eventually the CKAN removal piece may become so cumbersome (i.e. take longer than the restart time [15 minutes?]) that we'll want to implement it as a subtask. For now, just use the API normally: try to run a CKAN dataset purge, and also run the DB delete/clear commands. Ideally, if everything is synced correctly, you should be able to remove the harvest jobs and let the deletes cascade to the other objects that reference them (records, errors), but this might require config changes or workarounds.
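A minimal sketch of that flow, using an in-memory SQLite database as a stand-in for the harvest DB. The table and column names, `open_db`, `clear_harvest_source`, and the `purge_dataset` callable are all hypothetical; in the real service the callable would wrap CKAN's `dataset_purge` action. The `PRAGMA foreign_keys = ON` line is one example of the kind of config change mentioned above: SQLite does not enforce `ON DELETE CASCADE` without it.

```python
import sqlite3
from typing import Callable

# Hypothetical schema: records and errors reference jobs with ON DELETE CASCADE.
SCHEMA = """
CREATE TABLE harvest_source (id TEXT PRIMARY KEY);
CREATE TABLE harvest_job (
    id TEXT PRIMARY KEY,
    source_id TEXT NOT NULL REFERENCES harvest_source(id)
);
CREATE TABLE harvest_record (
    id TEXT PRIMARY KEY,
    job_id TEXT NOT NULL REFERENCES harvest_job(id) ON DELETE CASCADE,
    ckan_id TEXT
);
CREATE TABLE harvest_error (
    id TEXT PRIMARY KEY,
    job_id TEXT NOT NULL REFERENCES harvest_job(id) ON DELETE CASCADE
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys (and cascades) when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def clear_harvest_source(
    conn: sqlite3.Connection,
    source_id: str,
    purge_dataset: Callable[[str], None],
) -> int:
    """Purge the source's datasets from CKAN, then delete its harvest jobs.

    Returns the number of datasets handed to purge_dataset.
    """
    rows = conn.execute(
        "SELECT r.ckan_id FROM harvest_record r "
        "JOIN harvest_job j ON r.job_id = j.id "
        "WHERE j.source_id = ? AND r.ckan_id IS NOT NULL",
        (source_id,),
    ).fetchall()
    for (ckan_id,) in rows:
        purge_dataset(ckan_id)  # assumed to raise on failure
    with conn:  # commit on success, roll back on error
        # Deleting the jobs cascades to their records and errors.
        conn.execute("DELETE FROM harvest_job WHERE source_id = ?", (source_id,))
    return len(rows)


# Usage with a fake purge function that just records the ids it receives.
conn = open_db()
conn.execute("INSERT INTO harvest_source VALUES ('src-1')")
conn.execute("INSERT INTO harvest_job VALUES ('job-1', 'src-1')")
conn.execute("INSERT INTO harvest_record VALUES ('rec-1', 'job-1', 'ckan-abc')")
conn.execute("INSERT INTO harvest_error VALUES ('err-1', 'job-1')")
conn.commit()

purged: list[str] = []
count = clear_harvest_source(conn, "src-1", purged.append)
```

Purging from CKAN before touching the DB is a choice made for this sketch: if a purge fails partway, the harvest records are still there and the clear can be retried. The harvest source row itself is left in place so the source can be harvested again.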