qudo icon indicating copy to clipboard operation
qudo copied to clipboard

Fix job dequeue order with explicit ORDER BY

Open Etsukata opened this issue 7 years ago • 4 comments

Current dequeue query does not use ORDER BY to guarantee the order of jobs dequeued by workers. But SQL standards does not specify any behavior of record order without ORDER BY and it depends on implementations of RDBMS. So jobs can be dequeued in descending order of enqueue_time, that means jobs are treated in Stack(FILO) order rather than Queue.


As for my environment, MySQL 5.6 uses priority index(can be varied by optimizer depends on INNODB statistics), then jobs are dequeued in FILO order. You can reproduce this behavior by adding FORCE INDEX (priority) to dequeue query.

ex:

> explain select   job.id,   job.arg,   job.uniqkey,   job.func_id,   job.grabbed_until,   job.retry_cnt,   job.priority from   job  where   job.func_id = 10 and job.grabbed_until <= unix_timestamp() and job.run_after <= unix_timestamp() order by   job.priority DESC;
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key      | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
|  1 | SIMPLE      | job   | index | NULL          | priority | 4       | NULL |    5 | Using where |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+

> select   job.id,   job.arg,   job.uniqkey,   job.func_id,   job.grabbed_until,   job.retry_cnt,   job.priority from   job  where   job.func_id = 10 and job.grabbed_until <= unix_timestamp() and job.run_after <= unix_timestamp() order by   job.priority DESC;
+---------+------+---------+---------+---------------+-----------+----------+
| id      | arg  | uniqkey | func_id | grabbed_until | retry_cnt | priority |
+---------+------+---------+---------+---------------+-----------+----------+
| 1771333 | NULL | c       |      10 |             0 |         0 |        5 |
| 1771332 | NULL | b       |      10 |             0 |         0 |        5 |
| 1771331 | NULL | a       |      10 |             0 |         0 |        5 |
+---------+------+---------+---------+---------------+-----------+----------+

Etsukata avatar Jun 06 '17 07:06 Etsukata

Hi @karupanerura

Taking another look at codes, I came to think that the ORDER BY should use the column job.enqueue_time instead of job.id because reenqueue updates enqueue_time. How do you think about it?

Etsukata avatar Jun 19 '17 03:06 Etsukata

@Etsukata Thank you for your report. I'm still thinking and researching about performance. Please wait for my response more few days, please.

karupanerura avatar Jun 20 '17 06:06 karupanerura

Ref: http://dbz-dokkan.bngames.net/info1116.html

Etsukata avatar Dec 14 '17 02:12 Etsukata

@Etsukata I'm sorry for the late reply. I think ORDER BY job.priority DESC, job.enqueue_time ASC is a good idea for the fix.

But, I think there is a performance issue when many Qudo jobs are created. This query will be planned as filesort because this sort of condition cannot be resolved by index only. So this query will grab many CPU resources on MySQL when many jobs are available. (it includes future executable jobs: the grabbed_until may be changed by run_after https://github.com/nekokak/qudo/blob/master/lib/Qudo/Manager.pm#L157, so this case could be happend easy) And This change affects many existing systems implicitly. So I think this change should be enabled optionally and explicitly. (e.g. flags?)

How do you think about it?

karupanerura avatar Jun 21 '19 23:06 karupanerura