datapusher
Datapusher does not update datastore using resource_update
When I update a CKAN dataset with resource_update, it does not trigger the datapusher. If I use resource_create, it does. I can't use datastore_update, since not all my users can convert their data to JSON and not all datasets have unique IDs. Is there a sane way to update a dataset?
I came across this problem too. I tweaked the resource_update method in controllers/package.py with the code below.
# Custom code to automatically submit a datapusher job when a change to the sample data file is detected.
# Assumes the surrounding module already provides cgi, datetime, logic, p (ckan.plugins), log and request.
sample_file_modified = False
if not old_data['url'] and data['url']:
    # Sample data has been added
    log.info("Sample Data has Been added")
    sample_file_modified = True
elif old_data['url'] and not data['url']:
    # Sample data has been removed
    log.info("Sample Data Has been removed")
    sample_file_modified = False
    # Update the datapusher task status to reflect the sample file deletion,
    # so the old file's data in the sample store is not displayed to the user.
    task = {
        'entity_id': resource_id,
        'entity_type': 'resource',
        'task_type': 'datapusher',
        'last_updated': str(datetime.datetime.utcnow()),
        'state': 'deleted',
        'key': 'datapusher',
        'value': '{}',
        'error': '{}',
    }
    try:
        # Reuse the existing task status record if there is one
        task_id = p.toolkit.get_action('task_status_show')(context, {
            'entity_id': resource_id,
            'task_type': 'datapusher',
            'key': 'datapusher',
        })['id']
        task['id'] = task_id
    except logic.NotFound:
        pass
    task_result = p.toolkit.get_action('task_status_update')(context, task)
elif old_data['url'].split('/')[-1] != data['url']:
    # Sample data has been updated (the URL changed)
    log.info("Sample Data Has been Updated")
    sample_file_modified = True
elif isinstance(request.POST.get('upload'), cgi.FieldStorage):
    # Sample data has been updated (a new file was uploaded)
    log.info("Sample Data Has been Updated")
    sample_file_modified = True

if sample_file_modified:
    log.info("$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$")
    p.toolkit.get_action('datapusher_submit')(None, {'resource_id': resource_id})
The problem with this approach is that it only covers updates made through the web UI. When we use ckanapi from scripts for updates, data manipulation, or bulk uploads of new sample files, this custom trigger is missed, because ckanapi taps straight into the actions under logic and skips the controllers in between. In that case the same problem persists.
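For scripted updates, one possible workaround is to call datapusher_submit yourself right after resource_update instead of relying on a controller hook. This is only a rough sketch: the helper name `update_resource_and_push` and its `changes` parameter are made up for illustration, and `call_action` is assumed to behave like ckanapi's `call_action(action_name, data_dict)`. It also assumes the datapusher extension is enabled and the calling user is allowed to submit jobs.

```python
def update_resource_and_push(call_action, resource_id, changes):
    """Update a resource, then explicitly queue a datapusher job for it.

    call_action is any callable shaped like ckanapi's
    call_action(action_name, data_dict); the name and signature of this
    helper are hypothetical.
    """
    # resource_update replaces the whole resource, so start from the
    # current metadata and overlay only the fields that changed.
    resource = call_action('resource_show', {'id': resource_id})
    resource.update(changes)
    updated = call_action('resource_update', resource)

    # Submit the datapusher job explicitly, the same action the
    # controller tweak above calls.
    call_action('datapusher_submit', {'resource_id': resource_id})
    return updated


# Possible usage with ckanapi (URL, key and id are placeholders):
#   from ckanapi import RemoteCKAN
#   ckan = RemoteCKAN('https://ckan.example.org', apikey='...')
#   update_resource_and_push(ckan.call_action, '<resource-id>',
#                            {'url': 'https://example.org/data.csv'})
```

Because the submit call lives in the script rather than in the controller, it also works for bulk jobs, at the cost of every script having to remember to call it.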