redbeat
redbeat copied to clipboard
rrule: first run delayed
@concreted : When I use recurrent rule, the first action isn't taken at time.
Example (as you can see, the first run is delayed from 21:20:03 to 21:19:58 without a clear reason):
test.py
from celery import Celery
app = Celery()
@app.task
def test():
print('test')
run.py
from redbeat.schedules import rrule
from redbeat import RedBeatSchedulerEntry
from test import test, app
if __name__ == '__main__':
schedule = rrule('MINUTELY', interval=3, count=4)
print(schedule) # <rrule: freq: 5, dtstart: 2017-12-09 20:17:03.882+00:00, interval: 3, count: 4, ...>
entry = RedBeatSchedulerEntry('test', test.name, schedule, app=app)
entry.save()
out.log
[2017-12-09 21:19:58,320: INFO/MainProcess] Received task: test.test[7298979f-333e-4aae-b83d-f09cd7e2e7c0]
[2017-12-09 21:20:03,028: INFO/MainProcess] Received task: test.test[832d9876-8171-4c58-9835-ad5013e4b95d]
[2017-12-09 21:23:03,016: INFO/MainProcess] Received task: test.test[b7de89c3-c010-47de-a8c2-49350ff58050]
[2017-12-09 21:26:03,018: INFO/MainProcess] Received task: test.test[1050295c-0ea8-4824-89b2-ac71efa2c1ba]
Another (seems related) problem, is that if I set the whole rrule before current time, it's not scheduled at all.
dt = datetime.now() - timedelta(minutes=15)
schedule = rrule('MINUTELY', dtstart=dt, interval=3, count=4)
entry = RedBeatSchedulerEntry('test', test.name, schedule, app=app)
print(entry.is_due()) # schedstate(is_due=False, next=2699.667361)
@concreted, are you able to investigate?
Planning to investigate this week, thanks for the repro steps.
This looks related to the fact that Celery sets ScheduleEntry.last_run_at
to utcnow()
by default. When I forced the rrule
entry's last_run_at
to be None
, it behaved as expected (runs the first occurrence immediately, next occurrences run according to schedule):
run.py
schedule = rrule('MINUTELY', interval=1, count=2)
print(schedule) # <rrule: freq: 5, dtstart: 2017-12-10 19:31:22.637379+00:00, interval:
entry = RedBeatSchedulerEntry('test', test.name, schedule, app=app())
entry.last_run_at = None
entry.save()
out.log
[2017-12-10 19:31:25,163: WARNING/ForkPoolWorker-1] test
[2017-12-10 19:32:22,011: WARNING/ForkPoolWorker-2] test
@yoch As a workaround can you try setting entry.last_run_at = None
on your rrule
entry before saving, and see if it works correctly for you?
@sibson What do you think is the right approach here? I tried setting last_run_at
in RedbeatSchedulerEntry.__init__
to either None
or epoch, but that broke a lot of tests. I'm worried there are things that rely on Celery's default behavior.
@concreted: Thank you !
Your workaround works for me in the first case, but not for the second one (only the first run was fired).
@yoch Thanks for confirming. I will be looking into the second case - want to confirm, what is your expected behavior for your example? Would you expect that all four occurrences fire immediately, because they should have all occurred in the past?
@concreted : Yes, I think all these occurrences should fire almost immediately, or at least one at each beat.
In fact, the more realistic case is when for some reason (e.g. a congestion), one rrule event fires after that the next one is already due, which may cause to skip some events.
My feeling is celery's behaviour of defaulting last_run_at = now()
, is wrong and I've had to work around it. I don't fully understand the ramifications of changing the constructor to default to epoch. None, won't work as celery will override it later. I've lost track of the various things I tried and why they failed to work. So maybe a fresh set of eyes will be able to solve it. RRule is new enough I'd be open to making an isolated breaking change there if needed.
I'm looking to get a 2.0.0 release out. Is there a change to rrule that can be made before then to resolve this issue?
@sibson When are you hoping to get a release out? I may be able to take a look this week. Will confirm tomorrow.
I don't have a date in mind, I can wait for this to be resolved if you are planning to work on it.
I will spend some time tomorrow to investigate this and let you know if I'll be able to fix. Thanks!
I looked into this yesterday. There are two separate issues here:
- rrules without dtstart set don't fire their first event. Expected: first event fires.
- rrules with dtstart in the past don't fire any events. Expected: all past events fire quickly immediately after entry created.
Currently I'm using last_run_at
as a pointer in time so that rrule can return the next occurrence after that.
I believe issue with 1) is that Celery's last_run_at
is wrongly set to now() by default by Celery, and this is getting saved as the meta: last_run_at
. If you don't set a dtstart
on the rrule, I have it default to now()
. Because generally schedules are created first, then RedBeatSchedulerEntry
s, this means the entry's last_run_at
is always after the first occurrence unless you explicitly set dtstart
to some time in the future.
The issue with 2) is somewhat related, in that if last_run_at
is after all rrule occurrences were supposed to occur, ask for the next occurrence will return nothing. This is expected and correct behavior for the underlying rrule object, but isn't the expected behavior for task scheduling.
What I'm thinking is that rrule functions best when the last_run_at
time is exactly the time of the last rrule occurrence that ran - by that I mean the rrule occurrence's actual scheduled time to run, not the time that Celery actually ran the task. This would solve both problems, since it would always know exactly when to calculate the next occurrence from, instead of using the execution time.
I have a branch here (https://github.com/sibson/redbeat/compare/master...concreted:rrule-first-run-delayed?expand=1#diff-f064d9e67be22082f4bff1e5ecee4a3cR43) that demonstrates the correct behavior for both cases if is_due()
is called directly on an rrule entry (see run.py
). It doesn't work with Redbeat, because Redbeat would have to store and pass the last scheduled task time data back to the rrule.
@sibson What are your thoughts on how to address this? I thought of using some new metadata e.g. last_scheduled_task_at
, and passing that instead of last_run_at
to the entry as a special case for rrule. Another way could be to update the schedule
definition object itself with this info (see usage of self.last_run_at
in redbeat/schedules.py
). But dynamically updating the definition also doesn't seem like a great idea.
Setting last_run_at
to None or epoch as a special case for rrule would solve 1) but not 2). For example, if you had an rrule set to run for 4 occurrences every second on Jan 1 1990, if we set last_run_at
to epoch, only the first event would run because last_run_at
would immediately be set to after the 4 expected occurrences.
I've run into similar issues with celery's defaulting last_run_at to now() behaviour. I don't think I found a satisfying solution then either. I do wonder if we need to solve this problem more generally to make the rrule solution simpler to implement.
I'd like to find away to localize the rrule special behaviours to rrule related classes without unduly impacting other parts of the code. That makes it easier to reason about in future. I don't know how that influences which is the better option of the two you proposed. Unless, I'm missing a use case I'm inclined to worry less about possible edge cases in 1990, than getting 2019 onwards working as expected.
Hello,
It seems I am having a similar problem: When I start my task without providing dtstart, task is straight loaded and fired. Everything works fine !
Problem is, when I specify dtstart, task is loaded at desired start date but not fired ! It never starts
Any idea why my task does not get fired when dtstart is provided ?
I have been searching a lot around the issue, and found nothing. I would really appreciate if you could help me on this.
add_task.py:
date_delayed = datetime.datetime(2019, 8, 29, 16, 19, 30)
timezone = pytz.timezone('Europe/Paris')
date_delayed = timezone.localize(date_delayed)
schedule = rrule('SECONDLY', interval=5, count=3, dtstart=date_delayed)
entry = Entry('thingo', 'cluster.add_task', schedule, args=(9, 9), app=cluster.app)
entry.save()
Could you try a try_later()
function in your test to tell me if you are having the same problem ? (just like you did with try_now()
)
Thanks for your help,
Regards
Simon