taskiq
Slow performance in RedisScheduleSource.get_schedules() leads to missed scheduled jobs
I added print statements to measure how long get_schedules() takes and noticed that it sometimes takes over 2 minutes to complete. This causes the "loop that runs every 1 minute" in run_scheduler_loop to miss some intervals.
In the get_schedules() method, this line seems to be causing the issue:
async for key in redis.scan_iter(f"{self.prefix}:*"):
scan_iter appears to never yield any keys.
After debugging, I found that AsyncScanCommands.scan_iter() looks like this:
AsyncScanCommands.scan_iter(self, ...):
    cursor = "0"
    while cursor != 0:
        cursor, data = await self.scan(
            cursor=cursor, match=match, count=count, _type=_type, **kwargs
        )
        for d in data:
            yield d
The issue seems to be that self.scan() keeps returning a non-zero cursor with empty data, so the loop keeps iterating for a very long time without yielding anything.
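To illustrate why a scan can return a non-zero cursor with empty data many times in a row, here is a minimal sketch using a toy in-memory stand-in. `FakeRedis` and its offset-based cursor are assumptions made up for this illustration; they are not redis-py or real Redis internals. The only point it shows is that the match pattern is applied after each page of keys is picked, so a keyspace dominated by non-matching keys produces a lot of empty pages before the cursor gets back to 0.

```python
import asyncio
from fnmatch import fnmatchcase


class FakeRedis:
    """Toy stand-in for a Redis client (hypothetical, NOT redis-py)."""

    def __init__(self, keys):
        self._keys = list(keys)
        self.scan_calls = 0

    async def scan(self, cursor=0, match=None, count=10):
        # Assumption: the cursor is a plain list offset. Real Redis cursors are opaque.
        self.scan_calls += 1
        start = int(cursor)
        page = self._keys[start:start + count]
        next_cursor = start + count
        if next_cursor >= len(self._keys):
            next_cursor = 0  # 0 signals that the iteration is complete
        if match is not None:
            # The pattern filters the page AFTER it was selected,
            # so a page can come back empty while the cursor is still non-zero.
            page = [k for k in page if fnmatchcase(k, match)]
        return next_cursor, page

    async def scan_iter(self, match=None, count=10):
        # Same loop shape as the scan_iter quoted above.
        cursor = "0"
        while cursor != 0:
            cursor, data = await self.scan(cursor=cursor, match=match, count=count)
            for d in data:
                yield d


async def main():
    # 1000 unrelated keys and only 2 keys under the schedule prefix.
    keys = [f"other:{i}" for i in range(1000)] + ["schedule:a", "schedule:b"]
    redis = FakeRedis(keys)
    found = [k async for k in redis.scan_iter(match="schedule:*", count=10)]
    return found, redis.scan_calls


found, scan_calls = asyncio.run(main())
print(found, scan_calls)
```

In this toy model the loop needs one round trip per page of the whole keyspace, regardless of how few keys match. With a real server each of those round trips also pays network latency, which would be consistent with get_schedules() taking minutes on a large database; passing a larger count should reduce the number of round trips.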
Try to use ListRedisScheduleSource instead of RedisScheduleSource:
- RedisScheduleSource is deprecated and will be removed in future releases;
- As far as I know, ListRedisScheduleSource has performance improvements in comparison with RedisScheduleSource.