django-celery-results

Extending TaskResult model

Open afshinm opened this issue 6 years ago • 16 comments

I'm trying to find a way to extend the TaskResult model and add another field to it. I want to store the request.user object alongside the task result in the database so that I can query it later in my app.

Is subclassing the correct approach? I ask because subclassing creates another table in the database, and I would then need to manually sync the status of the new model with what django-celery-results creates.

afshinm avatar Feb 09 '18 17:02 afshinm

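A better way would be to create a new model that links to TaskResult with a OneToOneField, like this:

from django.db import models
from django_celery_results.models import TaskResult

class ExtraTaskInfo(models.Model):
    # One extra-info row per stored task result.
    task_result = models.OneToOneField(
        TaskResult,
        on_delete=models.CASCADE,
    )
    # AUserModel is a placeholder for your project's user model.
    task_creator = models.ForeignKey(
        AUserModel,
        on_delete=models.CASCADE,
    )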

You can then use a Django post_save signal receiver to create this new model when a TaskResult is created. To pass the user along, store the user ID in the cache keyed by the task ID, then read it back in the signal receiver. Alternatively, you could first save an instance of the new model without the task_result field, give it some other field such as task_id, and use that field to look it up in the signal receiver.

This is a bit of a hack, but it works well. If anyone has a better way, let us know.

harrywhite4 avatar Feb 16 '18 04:02 harrywhite4

That is sound advice for sure @harrywhite4, but I am running into an issue where the signal is not received upon initial creation. This is due to the fact that get_or_create is used, which is wrapped in an atomic transaction. As such, pre_save and post_save signals do not fire. Can you offer any alternatives or do I actually need to use a string field of the task ID to link the objects instead of proper foreign keys? Thanks!

EDIT: It seems that the issue here might relate more to the way that the objects are being created in the app. If I create an instance of the TaskResult model on the REPL (manage.py shell) I receive the signal. However, when an instance is created by the app (when I create a celery task) I do not receive the signal. Any guidance would be much appreciated, thanks!

mrname avatar Apr 02 '18 22:04 mrname

Not sure why that would be happening. For me a post_save signal is fired when creating a celery task. Are you sure that a TaskResult is actually being saved?

Also, if you want the result to be saved when the task is created rather than when it completes, you have to add an extra setting, as described here: https://github.com/celery/django-celery-results/issues/39

harrywhite4 avatar Apr 03 '18 05:04 harrywhite4

Thanks for the quick response @harrywhite4. Yes, I have confirmed that in these situations the object is indeed created, despite the signal not being received. As a workaround, I will probably go the route that you mentioned, forcing the result to be saved as soon as the task is created and using the resulting object as opposed to trying to use the signals.

The fact that you are unable to reproduce this leads me to believe that some other library or middleware I am using is interfering with the signals for this specific model, as little sense as that makes.

I see that a similar issue was filed here:

https://github.com/celery/django-celery-results/issues/41

If you wish, I can provide some details about my environment on that issue. Thanks again!

mrname avatar Apr 03 '18 15:04 mrname

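I had a similar issue. Mine was related to Django's autocommit mode. See the Django docs.

from celery import shared_task
from django.db import transaction
from django.db.models.signals import post_save
from django.dispatch import receiver
from django_celery_results.models import TaskResult

@shared_task
def some_celery_task(arg1):
    pass

@receiver(post_save, sender=TaskResult)
def update_reports(sender, instance=None, created=None, **kwargs):
    if created:
        # Queue the follow-up task only once the current transaction commits.
        transaction.on_commit(lambda: some_celery_task.delay('arg1'))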

atkawa7 avatar Dec 26 '18 12:12 atkawa7

Edit: The signal is indeed sent, but you have to restart Celery every time you change the Python code of your signal receivers; otherwise it will not pick up the changes.

I still don't know how to store the user and task ID in the cache as Harry suggested.

task = make_thumbnails.delay(file_path, thumbnails=[(128, 128)])
cache.set(task.id, request.user.username, 300)

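but the signal receiver never finds the key in the cache:

from django.core.cache import cache
from django.db.models.signals import post_save
from django.dispatch import receiver
from django_celery_results.models import TaskResult

@receiver(post_save, sender=TaskResult)
def update_reports(sender, instance=None, created=None, **kwargs):
    if created:
        retrieve = cache.get(instance.task_id, 'not found')
        print(retrieve)

This always prints "not found".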

falc410 avatar Oct 27 '19 12:10 falc410

For anyone having the same trouble as @falc410: the solution (at least in my case) was to change Django's cache backend. Local-memory caching does not seem to work in this scenario, presumably because each process keeps its own private cache, so the Celery worker cannot see keys set by the web process. When I switched to the filesystem cache, it worked, and I believe any shared cache backend should work. Details: https://docs.djangoproject.com/en/3.1/topics/cache/#filesystem-caching
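As a rough sketch of that change, a settings.py fragment selecting Django's file-based cache backend might look like the following. The LOCATION path is a hypothetical example; it would need to be a directory that both the web process and the Celery worker can read and write.

```python
# settings.py fragment (sketch): use a cache backend that is shared
# between processes instead of the per-process local-memory cache.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        # Hypothetical path; pick a directory accessible to web and worker.
        "LOCATION": "/var/tmp/django_cache",
    }
}
```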

sseyren avatar Sep 16 '20 11:09 sseyren

A better way would be to add a new field to TaskResult that links to the user model with a ForeignKey, like this:

from django.db import models
from django_celery_results.models import TaskResult
TaskResult.add_to_class('task_creator', models.ForeignKey(AUserModel, on_delete=models.CASCADE))

together with this migration:

from django.conf import settings
from django.db import migrations, models
import django.db.models.deletion


class Migration(migrations.Migration):
    replaces = [('django_celery_results', '<migrations file name>'),]

    def __init__(self, name, app_label):
        super(Migration, self).__init__(name, app_label)
        self.app_label = 'django_celery_results'

    dependencies = [
        ('django_celery_results', '0001_initial')
    ]

    operations = [
        migrations.AddField(
            model_name='taskresult',
            name='task_creator',
            field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to=settings.AUTH_USER_MODEL),
        ),
    ]

alirezasafi avatar Dec 03 '20 13:12 alirezasafi

I just stumbled upon a very similar requirement to extend the TaskResult model.

@alirezasafi could you explain how you would populate the new field once task_creator has been added?

gu33mis avatar Jan 01 '21 11:01 gu33mis

We could add @alirezasafi's add_to_class suggestion to the docs.

auvipy avatar Jan 20 '21 19:01 auvipy

Also, in the future, converting the models to abstract base models would make them more extensible.

auvipy avatar Feb 10 '21 13:02 auvipy

@alirezasafi How do you populate the field after "task_creator" has been added?

FengCn avatar Apr 24 '22 01:04 FengCn

By caching the user ID and task ID. After creating the task, use a post_save signal to update the field.

alirezasafi avatar Apr 24 '22 06:04 alirezasafi

@alirezasafi Thanks for your prompt response. The ExtraTaskInfo model is now created, and I receive the task_result instance in the signal function. My question is: which method is better, the ExtraTaskInfo model or the new task_creator field on TaskResult?

FengCn avatar Apr 24 '22 14:04 FengCn

I think the first option is more appropriate. Cache isn't reliable and data may be lost.

alirezasafi avatar Apr 24 '22 15:04 alirezasafi

This would be helpful in the docs.

lggwettmann avatar Dec 16 '22 12:12 lggwettmann