celery-haystack
Limit indexing to a single task at a time
I would like to use haystack + celery-haystack + xapian-haystack. Unfortunately, there is this limitation: "Because Xapian does not support simultaneous WritableDatabase connections. If this occurs an DatabaseLockError exception will be raised by Xapian."
So I thought about limiting the number of tasks to one at a time. I didn't see a proper setting for that because, as far as I understand your code, COMMAND_WORKERS=1 is not the solution.
I am also having this problem. My Celery instance is used by numerous other tasks and therefore has concurrency > 1.
I have numerous other tasks that result in many Django models being created or updated, resulting in many CeleryHaystackSignalHandler tasks being queued, but only one succeeds. The rest get retried with a 300s timeout.
Am I missing something? It seems a major design oversight to assume celery with a concurrency of 1. I understand that there is a limitation of Xapian to only allow one writer to access the database at a time, so maybe either artificially limit the task to one instance at a time, or fundamentally change the way this works by batching up the writes into one or more transactions?
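For illustration, here is a minimal sketch of the "one instance at a time" idea, assuming all workers run on the same Unix host. It is not part of celery-haystack; `LOCK_PATH`, `single_writer` and `update_index` are hypothetical names, and `write` stands in for whatever actually touches the Xapian database. Each worker process waits for an exclusive advisory file lock before writing, so the writes are serialized instead of failing and being retried.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock file; any path reachable by every worker on the host would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "xapian-index-writer.lock")


@contextmanager
def single_writer(lock_path=LOCK_PATH):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        # Blocks while another process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def update_index(write, *args):
    """Run write(*args), a stand-in for the real index update, under the lock."""
    with single_writer():
        return write(*args)
```

The lock is advisory and local to one machine, so this would not help if workers on several hosts share the index.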
This will, at least partially, be mitigated in celery-haystack-ng 2.0. This new version really enqueues updates (in the haystack sense) until teardown of the signal processor. In combination with running in a transaction, this results in exactly one task being scheduled for all update operations when the transaction commits.
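The enqueue-then-dispatch-once behaviour described above can be sketched roughly as follows. This is not the celery-haystack-ng code; `UpdateBuffer`, `enqueue`, `flush` and the `dispatch` callable are hypothetical names chosen for illustration. Updates are collected in memory, duplicates are dropped, and a single call hands the whole batch to one task.

```python
class UpdateBuffer:
    """Collect index updates and hand them over as one batch."""

    def __init__(self, dispatch):
        # dispatch is any callable taking a list of (action, identifier) pairs,
        # for example a function that schedules a single Celery task.
        self._dispatch = dispatch
        self._pending = []

    def enqueue(self, action, identifier):
        """Remember an update; repeated (action, identifier) pairs are kept once."""
        item = (action, identifier)
        if item not in self._pending:
            self._pending.append(item)

    def flush(self):
        """Dispatch all pending updates as one batch; do nothing if empty."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._dispatch(batch)
```

In a Django project, `flush` could plausibly be registered with `transaction.on_commit` so that it runs once when the transaction commits; that wiring is an assumption and is left out here.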