Automatically import from file upload/update
I'm testing out CKAN 2.2 with the DataPusher, instead of the old DataStorer.
The old DataStorer ran a cron job every X hours to check for updates to the DataStore.
The DataPusher does not seem to do this. Is there a way to have the DataPusher check resources for updates at a regular interval?
Basically, the problem for me is that when I use the new FileStore API to update a file, the DataStore does not get updated. It seems the DataPusher is only called on file creation and URL change.
I also have several WFS services as resources, and when they change, the DataPusher does not update the DataStore.
Can we get an update on this?
I've written a dirty Python script for this. It iterates over the resources I would like to update, changing each URL slightly and then changing it back (appending a '&' works in most cases), which effectively starts the DataPusher. It would be lovely to have built-in support for this instead.
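For anyone wanting to try the same workaround, here is a minimal sketch of the idea, not the original script. It assumes a hypothetical `call_action(action, data_dict)` callable that talks to the CKAN action API (for example `ckanapi.RemoteCKAN(...).call_action`, if you use ckanapi); `nudge_resource` and the `suffix` parameter are names made up for illustration.

```python
def nudge_resource(call_action, resource_id, suffix="&"):
    """Change a resource's URL slightly, then restore it.

    `call_action` is an assumed helper: call_action(action_name, data_dict)
    should call the CKAN action API and return the result dict.
    """
    resource = call_action("resource_show", {"id": resource_id})
    original_url = resource["url"]

    # First update: store the URL with the suffix appended.
    call_action("resource_update", dict(resource, url=original_url + suffix))

    # Second update: put the original URL back.
    call_action("resource_update", dict(resource, url=original_url))
    return original_url
```

Run it in a loop over the resource IDs you care about, from cron or similar. As noted above, appending '&' only works in most cases, so check that the restored URL still resolves for your resources.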
Maybe the paster datapusher can help with this?
paster datapusher
Perform commands in the datapusher
Usage:
    resubmit         - Resubmit all datastore resources to the datapusher,
                       ignoring if their files haven't changed.
    submit <pkgname> - Submits all resources from the package
                       identified by pkgname (either the short name or ID).
    submit_all       - Submit every package to the datastore.
                       This is useful if you're setting up datastore
                       for a ckan that already has datasets.
That, or call datapusher_submit with the resource ID from the script you are using to re-upload the file.
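A rough sketch of what calling datapusher_submit from an upload script could look like, using only the Python standard library. The site URL, API key and resource ID are placeholders, and the helper names `build_submit_request` and `submit_to_datapusher` are made up for this example; it assumes the standard action API path `/api/3/action/` and an API key sent in the `Authorization` header.

```python
import json
import urllib.request


def build_submit_request(ckan_url, api_key, resource_id):
    """Build (but do not send) a POST request for datapusher_submit."""
    payload = json.dumps({"resource_id": resource_id}).encode("utf-8")
    return urllib.request.Request(
        ckan_url.rstrip("/") + "/api/3/action/datapusher_submit",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": api_key,  # assumed: API key of a user allowed to edit the resource
        },
        method="POST",
    )


def submit_to_datapusher(ckan_url, api_key, resource_id):
    """Send the request and return the decoded JSON response."""
    request = build_submit_request(ckan_url, api_key, resource_id)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

You would call `submit_to_datapusher(...)` right after the step in your script that re-uploads the file, passing the ID of the resource you just updated.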
Okay nice one! Is this documented somewhere? I had a hard time locating this information.
@NicolaiMogensen I doubt it. It would be great if you could add it here: http://docs.ckan.org/en/latest/maintaining/datastore.html#datapusher-automatically-add-data-to-the-datastore and send a PR
I'll look into it once I've dived into the code. The code on GitHub is not the same as the DataPusher code that's packaged with CKAN if you do a package install, correct? What's the best way to go about that, documentation-wise?
I don't have submit_all. Is that in a newer version? I'm running 2.5.2.
This is what I have:
Perform commands in the datapusher
Usage:
    resubmit         - Resubmit all datastore resources to the datapusher,
                       ignoring if their files haven't changed.
    submit <pkgname> - Submits all resources from the package
                       identified by pkgname (either the short name or ID).
The code for these docs is on the main ckan repo: https://github.com/ckan/ckan/blob/master/doc/maintaining/datastore.rst
The DataPusher code shipped with the package install comes from this same repo (it might be an older version).
The submit_all command is on current CKAN master, so it will be available in 2.6. Alternatively, you can cherry-pick the changes, as they are quite trivial: https://github.com/ckan/ckan/pull/3024