
Datapusher does not update datastore using resource_update

Open asarchami opened this issue 7 years ago • 2 comments

When I update a CKAN dataset with resource_update, it does not trigger the datapusher. If I use resource_create, it does. I can't use datastore_update, since not all my users can convert their data to JSON and not all datasets have unique ids. Is there a sane way to update a dataset?

asarchami avatar Oct 17 '17 20:10 asarchami

I came across this problem too. I tweaked the resource_update method in controllers/package.py with the code below.

Custom code to automatically submit a datapusher job when a change to the sample data file is detected:

        sample_file_modified = False
        if not old_data['url'] and data['url']:
            # Sample Data has Been added
            log.info("Sample Data has Been added")
            sample_file_modified = True
        elif old_data['url'] and not data['url']:
            # Sample Data has been removed
            log.info("Sample Data Has been removed")
            sample_file_modified = False
            # Update Datapusher task status to reflect sample file deletion so old files data in sample store is not displayed to user.
            task = {
                    'entity_id': resource_id,
                    'entity_type': 'resource',
                    'task_type': 'datapusher',
                    'last_updated': str(datetime.datetime.utcnow()),
                    'state': 'deleted',
                    'key': 'datapusher',
                    'value': '{}',
                    'error': '{}',
                   }
            try:
                task_id = p.toolkit.get_action('task_status_show')(context, { 'entity_id': resource_id,
                                                    'task_type': 'datapusher', 'key': 'datapusher' })['id']
                task['id'] = task_id
            except logic.NotFound:
                pass
            #
            task_result = p.toolkit.get_action('task_status_update')(context, task)
            #
        elif old_data['url'].split('/')[-1] != data['url']:
            # Sample Data Has been Updated
            log.info("Sample Data Has been Updated")
            sample_file_modified = True
        elif isinstance(request.POST.get('upload'), cgi.FieldStorage):
            # Sample Data Has been Updated
            log.info("Sample Data Has been Updated")
            sample_file_modified = True
        #
        # Submit the resource to the datapusher only when the file actually changed.
        if sample_file_modified:
            log.info("Submitting resource %s to the datapusher", resource_id)
            p.toolkit.get_action('datapusher_submit')(None, {'resource_id': resource_id})

rp21buzz avatar Apr 30 '18 11:04 rp21buzz
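The change-detection branches in the snippet above could also be pulled out into a plain function that knows nothing about the request or the controller, which makes the rule easier to read and to check in isolation. This is only a minimal sketch mirroring the posted logic: the name `resource_file_changed` and its parameters `old_url`, `new_url` and `has_upload` are hypothetical and not part of CKAN.

```python
def resource_file_changed(old_url, new_url, has_upload=False):
    """Return True when the resource file should be re-pushed.

    Hypothetical helper mirroring the branches of the controller tweak;
    it is not a CKAN API.
    """
    if not old_url and new_url:
        # A file was added where there was none before.
        return True
    if old_url and not new_url:
        # The file was removed, so there is nothing to push.
        return False
    if old_url and old_url.split('/')[-1] != new_url:
        # Same comparison as the posted code: the last path segment of the
        # stored URL is compared against the submitted value.
        return True
    # Otherwise only a fresh upload counts as a change.
    return bool(has_upload)
```

In the controller, `has_upload` would correspond to the `isinstance(request.POST.get('upload'), cgi.FieldStorage)` check.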

The problem with this approach is that when we use ckanapi for updates, data manipulation, or bulk uploads of new sample files from scripts, this custom trigger is missed. ckanapi calls the actions under logic directly, skipping the controllers in between, so the same problem persists.

rp21buzz avatar Apr 30 '18 11:04 rp21buzz
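For scripts that go through the action API, one possible workaround is to call `datapusher_submit` explicitly right after `resource_update`. The sketch below is an assumption about how that could look, not a confirmed fix from this thread: `update_and_push` is a hypothetical helper, and `ckan` is assumed to be any client exposing actions as `ckan.action.<name>(**kwargs)`, the way a ckanapi `RemoteCKAN` instance does.

```python
def update_and_push(ckan, resource_id, **fields):
    """Update a resource, then explicitly ask the datapusher to re-import it.

    Hypothetical helper: `ckan` is assumed to expose actions as
    ckan.action.<name>(**kwargs), e.g. a ckanapi client.
    """
    # Update the resource through the action API (no controller involved).
    resource = ckan.action.resource_update(id=resource_id, **fields)
    # resource_update alone did not trigger a push in the reported setup,
    # so submit the datapusher job by hand.
    ckan.action.datapusher_submit(resource_id=resource_id)
    return resource
```

A script would then call something like `update_and_push(client, resource_id, url=new_url)` in place of a bare `resource_update`. This costs one extra API call per update and requires that the calling user is allowed to run `datapusher_submit`.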