ckanext-harvest

Option to clear only old history entries

Open torfsen opened this issue 7 years ago • 2 comments

Currently we have the clearsource_history paster command for deleting old jobs while keeping the sources and the datasets. This is nice, but it would be even better if we could use it to delete only the older jobs and keep the more recent ones, e.g. something like

paster clearsource_history --older-than 30d

to remove all jobs older than 30 days.
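To make the idea concrete, here is a rough sketch of how a value such as 30d could be turned into a cutoff date and used to select jobs. This is not code from ckanext-harvest: parse_older_than, jobs_to_delete, the supported unit letters and the dict-based job records with a "finished" key are all hypothetical names chosen for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical unit suffixes for the proposed --older-than option.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_older_than(value, now=None):
    """Turn a string like '30d' into the cutoff datetime (now minus 30 days)."""
    match = re.fullmatch(r"(\d+)([hdw])", value.strip())
    if match is None:
        raise ValueError("expected something like '30d', got %r" % value)
    amount, unit = match.groups()
    now = now or datetime.now(timezone.utc)
    return now - timedelta(**{_UNITS[unit]: int(amount)})


def jobs_to_delete(jobs, cutoff):
    """Select finished jobs older than the cutoff; unfinished jobs are kept."""
    return [
        job for job in jobs
        if job["finished"] is not None and job["finished"] < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2017, 6, 20, tzinfo=timezone.utc)
    cutoff = parse_older_than("30d", now)
    jobs = [
        {"id": "old", "finished": datetime(2017, 1, 1, tzinfo=timezone.utc)},
        {"id": "recent", "finished": datetime(2017, 6, 19, tzinfo=timezone.utc)},
        {"id": "running", "finished": None},
    ]
    print([job["id"] for job in jobs_to_delete(jobs, cutoff)])
```

A real implementation would filter in the database query rather than in Python, but the cutoff logic would be the same.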

torfsen avatar Jun 20 '17 06:06 torfsen

Is this planned?

I think this is more than a nice-to-have. I had to clear the history for a harvest source because the database had become bloated by the many harvest objects. But with all harvest objects gone, the harvester (CSW in my case) could no longer find its previously harvested objects, so it re-harvested the whole source and duplicated the data.

So I think clearsource_history should at least preserve the most recent/currently active harvest object.
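As a minimal sketch of what I mean, assuming each harvest object carries a boolean "current" flag marking the version that backs the live dataset: the clearing step could split objects into those to keep and those that are safe to delete. The partition_objects helper and the dict records are hypothetical and only illustrate the rule, not how the extension stores its data.

```python
def partition_objects(objects):
    """Split harvest objects into (keep, delete) based on the 'current' flag.

    Objects flagged as current are kept so the harvester can still match
    previously harvested records by guid on the next run.
    """
    keep, delete = [], []
    for obj in objects:
        if obj["current"]:
            keep.append(obj)
        else:
            delete.append(obj)
    return keep, delete


if __name__ == "__main__":
    objects = [
        {"guid": "record-1", "current": False},  # superseded version
        {"guid": "record-1", "current": True},   # version backing the dataset
        {"guid": "record-2", "current": True},
    ]
    keep, delete = partition_objects(objects)
    print(len(keep), "kept,", len(delete), "deleted")
```

With the current objects left in place, the next harvest run would see record-1 and record-2 as already known instead of creating duplicates.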

pduchesne avatar Jun 01 '19 08:06 pduchesne

Related PR: https://github.com/ckan/ckanext-harvest/pull/484

With the new option -k true (for the CKAN click command), the latest harvest jobs with the current harvest objects will be preserved.

seitenbau-govdata avatar May 18 '22 08:05 seitenbau-govdata