django-celery-results doesn't update state correctly with task.update_state()

Open keeper opened this issue 7 years ago • 8 comments

When using the sample code from the Celery docs:

@app.task(bind=True)
def upload_files(self, filenames):
    for i, file in enumerate(filenames):
        if not self.request.called_directly:
            self.update_state(state='PROGRESS',
                meta={'current': i, 'total': len(filenames)})

The task's state doesn't change, and the metadata is actually written to the result field.

This only happens when using CELERY_RESULT_BACKEND = 'django-db'. When I use SQLAlchemy and point the backend directly at the database, the state is updated to PROGRESS correctly.

I'm currently using:

celery (4.1.0)
Django (1.11.4)
django-celery-beat (1.0.1)

My broker is RabbitMQ, and the database is PostgreSQL.

keeper • Nov 06 '17 08:11

Exactly the same problem here. I was writing something like:

self.update_state(
    state=states.FAILURE,
    meta='Some error message here',
)

# ignore the task so no other state is recorded
raise Ignore()

I'm using:

Django(1.11.7)
Celery(4.1.0)
django-celery-results(1.0.1)
RabbitMQ(3.6.14)
Sqlite3

The problem is that the task's state doesn't change when using django-db as the backend (CELERY_RESULT_BACKEND = 'django-db'). More specifically, the task state is always shown as STARTED in Flower, but the Celery console tells us that the state has been updated and the Ignore exception has been raised:

[2018-01-07 18:53:54,452: INFO/MainProcess] Received task: webapp.upload.importEventFile[e8ab9bfd-be49-43d2-97da-5d6be2e1a238]
[2018-01-07 18:53:57,768: INFO/ForkPoolWorker-6] Task webapp.upload.importEventFile[e8ab9bfd-be49-43d2-97da-5d6be2e1a238] ignored

senyuuri • Jan 08 '18 05:01

I have the same problem, too. After calling update_state with a custom meta object, I can't see the update in the database.

I'm using MySQL as the result backend.

afshinm • Feb 13 '18 12:02

I think it's working as designed, but the documentation probably needs some clarification, especially since store_result and meta are used in different ways in different contexts. Note: I am using the version on master, labeled as release 1.1.1, not the version on PyPI, labeled as 1.0.1. I highly recommend it, since the task name has been added to TaskResult, which is very helpful in identifying which result belongs to which task.

While a task is executing, the TaskResult really serves as the current task state, holding information temporarily until the task completes.

So, when Celery's update_state is called from within a task, the TaskResult is created if it does not exist yet, the meta information is placed in its result field, and its status is updated. Subsequent calls to update_state update the same TaskResult, overwriting what was there previously.

Upon completion of the task, the results of the task are stored in the same TaskResult, overwriting the previous state of the task: the function's return value goes into result, and status is set to 'SUCCESS' (or 'FAILURE').
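
To make the overwrite behaviour concrete, here is a minimal pure-Python model of it. This is only a sketch: `FakeResultStore` and its `store_result` method are hypothetical stand-ins, not django-celery-results code, and the task id is made up. The one assumption carried over from the description above is that there is a single row per task id holding a status and a result.

```python
class FakeResultStore:
    """Hypothetical stand-in for the TaskResult table: one row per task id."""

    def __init__(self):
        self.rows = {}

    def store_result(self, task_id, result, status):
        # Create the row on first use, otherwise overwrite it in place.
        self.rows[task_id] = {"status": status, "result": result}
        return self.rows[task_id]


store = FakeResultStore()
task_id = "example-task-id"  # made-up id, for illustration only

# In this model, update_state(state='PROGRESS', meta={...}) writes the
# meta payload into the result column and sets the custom status.
store.store_result(task_id, {"current": 1, "total": 3}, "PROGRESS")
during = dict(store.rows[task_id])

# When the task function returns, the same row is overwritten again.
store.store_result(task_id, "done", "SUCCESS")
after = dict(store.rows[task_id])

print(during)
print(after)
```

In this model the progress payload is only visible while the task is still running; once the final result is stored, the intermediate state is gone.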

The TaskResult's meta field is used for a different purpose (I believe to capture the results of child tasks?)

TL;DR

celery/app/task.py calls the store_result method on the backend, not on the manager. It is implemented by the superclass of django_celery_results.backends.DatabaseBackend and gets passed off to _store_result here: https://github.com/celery/django-celery-results/blob/master/django_celery_results/backends/database.py#L16. Within that method, the backend in turn calls store_result on the manager with the proper signature.

http://docs.celeryproject.org/en/latest/userguide/tasks.html#custom-states

@senyuuri as soon as you raise the Ignore exception, the task completes (with a status of 'FAILURE'), sets the traceback and clears the result; unfortunately, any previous state information is lost. You can test this by putting a time.sleep(5*60) between the update_state call and the raise Ignore(), then examining the console or database in between.

amirskysmartasset • Apr 28 '18 20:04

Is there a way to manage a task's completed and total counts ourselves? I'm looking at a use case where I don't want parallelism: all the work is done in one single task, and I want control over the task's completed/total values in order to drive a progress bar in the frontend.

Is this possible in current Celery?
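
For reference, the kind of thing I have in mind on the reading side is sketched below. It assumes the task reports progress with the update_state(state='PROGRESS', meta={'current': ..., 'total': ...}) pattern from the top of this thread, and that a view reads the pair back through AsyncResult(task_id).state and AsyncResult(task_id).info. The helper name `progress_percent` is hypothetical, and whether the state actually reads back as PROGRESS with the django-db backend is exactly what this issue is about.

```python
def progress_percent(state, info):
    """Map a task's (state, info) pair to a 0-100 value for a progress bar."""
    if state == "SUCCESS":
        return 100
    if state == "PROGRESS" and isinstance(info, dict):
        total = info.get("total") or 0
        current = info.get("current", 0)
        if total > 0:
            # Clamp so a stale or off-by-one counter never exceeds 100.
            return min(100, int(100 * current / total))
    # PENDING, STARTED, FAILURE, or an unexpected payload: show no progress.
    return 0


quarter_done = progress_percent("PROGRESS", {"current": 1, "total": 4})
finished = progress_percent("SUCCESS", "done")
not_started = progress_percent("PENDING", None)
```

Keeping the mapping in a plain function means the frontend only ever sees a number, regardless of which backend stored the state.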

rivernews • Oct 15 '19 18:10

Can you please submit a documentation update request?

auvipy • Oct 12 '21 04:10

Is there any change to this functionality in the latest version of django-celery-results? I am facing the same issue as @keeper: update_state() is not working for me either.

prakharrathi25 • Jun 23 '22 10:06

same issue

amirhoseinbidar • Aug 10 '22 16:08

Bumped here. I might soon submit a PR with changes that permit injecting arbitrary data into the meta field of TaskResult.

Is it desired?

rodrigondec • Feb 08 '23 20:02

I'm open to any relevant contribution.

auvipy • Feb 09 '23 05:02

PR Submitted :grin:

rodrigondec • Feb 24 '23 14:02