ckanext-datastorer
The same resource is ingested twice?
This behaviour is observed on an instance running:
- CKAN 2.2
- ckanext-archiver at current master
- ckanext-datastorer at current master
Consider the case when a new resource is uploaded.
I think that when archiver's download tries to update the metadata for the given resource (https://github.com/ckan/ckanext-archiver/blob/master/ckanext/archiver/tasks.py#L451), it causes a new IDomainObjectModification
event to be fired. As a result, datastorer is notified again (because of this else clause: https://github.com/ckan/ckanext-datastorer/blob/master/ckanext/datastorer/plugin.py#L34) and a second task is sent to the queue.
Since the arrival time of the second event is unpredictable (and the queue can of course run many workers in parallel), I suppose this can lead to undesirable races if two parallel tasks are sending groups of records to the same datastore table (?).
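
To make the suspected sequence concrete, here is a toy model of it in plain Python. It is only a sketch: `Queue`, `Datastorer`, `upload_resource`, the task name and the `dedupe` flag are all hypothetical names made up for illustration, not the real CKAN, ckanext-archiver or ckanext-datastorer APIs. It assumes the hypothesis above is right, i.e. that the archiver's metadata update fires a second modification event, and shows that an observer which enqueues on every event ends up with two ingest tasks for one upload. The `dedupe` switch illustrates one possible guard (skip the event if a task for that resource is already pending); it is not something either extension is known to implement.

```python
# Toy model only: none of these names are real CKAN, archiver or datastorer APIs.


class Queue:
    """Stand-in for the task queue; just records what was sent."""

    def __init__(self):
        self.tasks = []

    def send(self, name, resource_id):
        self.tasks.append((name, resource_id))


class Datastorer:
    """Hypothetical observer that enqueues an ingest task on modification events."""

    def __init__(self, queue, dedupe=False):
        self.queue = queue
        self.dedupe = dedupe
        # Resource ids that already have an ingest task queued.
        # (A real guard would also have to clear this when the task finishes.)
        self.pending = set()

    def notify(self, resource_id, operation):
        # As in the report: both the 'new' and the 'changed' event enqueue a task.
        if self.dedupe and resource_id in self.pending:
            return
        self.pending.add(resource_id)
        self.queue.send("datastorer.upload", resource_id)


def upload_resource(resource_id, observers):
    # Event 1: the resource is created.
    for obs in observers:
        obs.notify(resource_id, "new")
    # Event 2 (assumed): the archiver writes metadata back to the resource,
    # which fires another modification event.
    for obs in observers:
        obs.notify(resource_id, "changed")


def count_ingest_tasks(dedupe):
    queue = Queue()
    upload_resource("res-1", [Datastorer(queue, dedupe=dedupe)])
    return len(queue.tasks)


print(count_ingest_tasks(dedupe=False))  # 2: the same resource is queued twice
print(count_ingest_tasks(dedupe=True))   # 1: the second event is ignored
```

In the real setup the two tasks could be picked up by different workers, so the toy count of 2 corresponds to two ingests potentially writing to the same datastore table at the same time.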