django-elasticsearch-dsl icon indicating copy to clipboard operation
django-elasticsearch-dsl copied to clipboard

Implement the Celery signal processor

Open CorrosiveKid opened this issue 6 years ago • 18 comments

The CelerySignalProcessor allows automatic updates on the index as delayed background tasks using Celery.

NB: We cannot process deletes as background tasks. By the time the Celery worker would pick up the delete job, the model instance would already deleted. We can get around this by setting Celery to use pickle and sending the object to the worker, but using pickle opens the application up to security concerns.

#34

CorrosiveKid avatar Mar 08 '18 03:03 CorrosiveKid

This is still work in progress. Need help writing the tests around the new signal processor.

CorrosiveKid avatar Mar 08 '18 03:03 CorrosiveKid

@CorrosiveKid do you still need help with tests?

andreyrusanov avatar May 07 '18 11:05 andreyrusanov

@andreyrusanov Yes please, if possible. I still haven't had the time to look at it after my initial commit.

CorrosiveKid avatar May 07 '18 23:05 CorrosiveKid

@CorrosiveKid could you give me write permissions to you repository? So I will update the PR

andreyrusanov avatar May 08 '18 05:05 andreyrusanov

@andreyrusanov Done!

CorrosiveKid avatar May 08 '18 05:05 CorrosiveKid

@ur001 @CorrosiveKid @sabricot

we have an issue with tests here. Signals could not be properly attached to models, used for tests. They connected, but don't triggered. What I've tried:

  • I used override_settings decorator (with it old signal were triggered anyway, unless I removed it explicitly)
  • tried to run DEDConfig().ready() again (with signal_processor field set to None)
  • tried to explicitly disable old signals and connect new one. I also tried to do lots of minor changes/things, but nothing works.

New signals were connected without issue, but when afterwards something happened with model Signal dispatcher doesn't find any receiver for it.. If anybody can take a look or have an idea how to fix it - please do!

I commited workaround I've made. It is really hacky in some point and I have really big doubts it will be approved. If so - I or @CorrosiveKid can just revert it back. If not - I will resolve conflicts/add extra docs.

andreyrusanov avatar May 08 '18 18:05 andreyrusanov

any update on this branch?

gkapatia avatar Jan 28 '19 13:01 gkapatia

@CorrosiveKid I see there is one issue with the current implementation. It calls a task for every save, while it could possibly call a task for multiple objects in one go using the elasticsearch_dls.helpers.bulk method. It is not simple though to implement this behaviour as it will need to then have a queue which stores objects of same types in different queues and then passed at certain interval to bulk update.

divick avatar Feb 20 '19 13:02 divick

@CorrosiveKid I suggest to use transaction.on_commit to prevent race condition.

transaction.on_commit(lambda: self.registry_update_task.delay(pk, app_label, model_name)) transaction.on_commit(lambda: self.registry_update_related_task.delay(pk, app_label, model_name))

emilio-rst avatar Apr 23 '19 19:04 emilio-rst

Is this branch still in development? Do you need any help?

vonmaster avatar Jul 08 '19 12:07 vonmaster

@vonmaster Any help is greatly appreciated, I haven't had the time to look at it yet.

CorrosiveKid avatar Jul 10 '19 03:07 CorrosiveKid

@CorrosiveKid any update on this branch?

ijharulislam avatar Nov 02 '19 06:11 ijharulislam

@CorrosiveKid any update on this branch?

It's quite stale at this point since I haven't had the time to pursue this :(

CorrosiveKid avatar Nov 13 '19 22:11 CorrosiveKid

@CorrosiveKid if you do ever end up finishing this you could handle deletes by getting the model class and creating a new instance of the model, but not saving it like get_model(app_label, model_name)(pk=pk) then padding that to the registry.delete

patrickml avatar May 01 '20 04:05 patrickml

Quick check @safwanrahman: does this still work on the latest package version? Are there any breaking changes? Not sure how this PR is being tested, but I think it'd be invaluable to re-run a Travis test by force pushing master and resolving conflicts.

Andrew-Chen-Wang avatar Jul 17 '20 19:07 Andrew-Chen-Wang

@Andrew-Chen-Wang This PR is actually out of date. Need to rebase against master and port it according to new code!

safwanrahman avatar Jul 17 '20 21:07 safwanrahman

Would love to see this picked up and finished!

cr0mbly avatar Mar 25 '21 19:03 cr0mbly

Would love to see this picked up and finished!

Please take a look. https://github.com/django-es/django-elasticsearch-dsl/pull/414

I rebased it against master and implement the celery delete and delete_related tasks.

Bidaya0 avatar Jul 21 '22 04:07 Bidaya0

#414 is better position currently.

safwanrahman avatar May 17 '23 22:05 safwanrahman