django-celery-beat icon indicating copy to clipboard operation
django-celery-beat copied to clipboard

Enabling a cron schedule task will run it immediately if it should have ran while it was disabled

Open mitom opened this issue 2 years ago • 8 comments

Summary:

This is a duplicate of #383 with an extended scope. The issues was closed with inactivity without resolution.

If the beat is stopped or a cron scheduled task is disabled, when the action is undone (i.e. beat started again, or task re-enabled) the task will be scheduled to run immediately AND then scheduled to run when the next execution is due too.

  • Celery Version: 4.4.7
  • Celery-Beat Version: 2.2.0

Exact steps to reproduce the issue:

  1. Create a crontab based task that runs every hour (the frequency doesn't really matter)
  2. Stop the beat between the 1st and 2nd execution
  3. Start the beat a few minutes before the 3rd execution
  4. The task will run twice in a few minutes (once when the beat is started, once when the 3rd execution is scheduled)

Detailed information

If I interpret https://github.com/celery/django-celery-beat/blob/master/django_celery_beat/tzcrontab.py#L45-L50, if we have a cron schedule like 0 * * * * (run at minute 0 of every hour), if we stop the beat at 00:55 (or at any point between 00:00 and 01:00) and then start it again at 01:55, the behaviour will be (sorry for this being so wordy, I'm writing it out to see if there is something I missed):

  • check remaining estimate since last run (last run will be 00:00, it should have ran at 01:00, so the delta is -00:55)
  • check if the delta is in the past (max of -00:55 and 0, is 0, then 0 == 0)
  • if the delta was in the past, re-calculate when the next execution should be (in our case it'd be 02:00)
  • schedule the task as due now with the next execution in 5 minutes

This would mean the task is scheduled at 01:55 and also at 02:00. I tried to see if disabling and enabling the task behaves differently but it seems to be the same. https://github.com/celery/django-celery-beat/blob/master/django_celery_beat/schedulers.py#L109 Includes some logic around the start date, but even if we set the start date to the time when the next execution should be, it looks like it'd still schedule the task immediately on start with a delay until the start date.

What I'm looking for is a way to have scheduled tasks which simply skip executions if the beat is down (i.e. in the above example if the beat is stopped at 02:00, the 2nd execution is simply skipped, rather than scheduled at a different time when the beat comes back).

mitom avatar Apr 14 '22 11:04 mitom

You can try it.

https://github.com/celery/django-celery-beat/pull/660

BaiJiangJie avatar Jul 11 '23 02:07 BaiJiangJie

You can try it.

#660

I hope this fixes the issue. let us know if not

auvipy avatar Jul 25 '23 08:07 auvipy

This does not fix it and is unrelated to this PR. The reason this occurs is that tasks are evaluated wether they should be run using the last_run_at field. The current time is not taken into consideration in determining wether a task 'is_due' or not. A long term solution to this would be to use the current time when initialising the ModelEntry class and only use last_run_at as a logging field. If you want to replicate this functionality immediately, you can extend the DatabaseSchedulerclass and rewrite the all_as_schedule method, setting last_run_at to the current time. This whole thing is more of a design change than a bug and should probably be offered as a feature flag.

mark-at-kluster avatar Aug 11 '23 08:08 mark-at-kluster

You may not understand the core reason. The fundamental problem is that the last_run_at field is temporarily set. When all_as_schedule is called, the old field value is pulled from the database again, overwriting the temporarily set field value, causing the subsequent logical judgment to be inaccurate. , thus executing repeatedly. @mark-at-kluster

BaiJiangJie avatar Sep 13 '23 03:09 BaiJiangJie

This may be the case for the issue you fixed in your PR, however I don't believe this is the case for this issue. This is because you say all_as_schedule overwrites the old field value, but all_as_schedule only loops over enabled tasks. In this case, the task was previously disabled so wouldn't be included in the previous all_as_schedule and thus the schedule. This means it couldn't have been 'overwitten'.

This is the fix I came up with to solve this issue and #667 and have never had this issue since.

As mentioned, ideally setting last_run_at to now would be within ModelEntry and enabled by flag (RUN_RETROSPECTIVE = False maybe?) as seems like a design change.

class ModifiedDatabaseScheduler(DatabaseScheduler):

    def all_as_schedule(self):
        debug('DatabaseScheduler: Fetching database schedule')
        s = {}
        now = datetime.now(self.app.timezone)
        for model in self.Model.objects.enabled():
            try:
                entry = self.Entry(model, app=self.app)
                if entry.model.start_time and entry.model.start_time < now:
                    entry.last_run_at = now
                s[model.name] = entry
            except ValueError:
                pass
        return s

mark-at-kluster avatar Sep 13 '23 09:09 mark-at-kluster

What I have personally done to fix this is extend the DatabaseScheduler class and set the last run at value to the current time.

@mark-at-kluster thanks, bro.you saved my day!

ohperhaps avatar Mar 14 '24 08:03 ohperhaps