yii2-queue

Redis driver: reserved ID is re-pushed to the waiting list before delete

Open sanwv opened this issue 5 years ago • 9 comments

What steps will reproduce the problem?

  • Set ttr=5 and have each job sleep for 10 s.
  • Use supervisor to start 10 workers to handle jobs.
  • Push 100 jobs to the queue. An error sometimes occurs when a job is moved by moveExpired before it is deleted.

What's expected?

What do you get instead?

With multiple processes, the operations on the data must happen in a strict time order. As a temporary workaround, I give each $queue$.reserved item a score greater than time()+ttr.

Additional info

Q A
Yii version
PHP version
Operating system

sanwv • Mar 27 '20 09:03

With error_reporting(0), the error message is as follows. The job's message data has already been deleted, so ttr cannot be resolved from the serialized data:

Exception 'TypeError' with message 'Argument 5 passed to Symfony\Component\Process\Process::__construct() must be of the type float or null, string given, called in /var/www/test/vendor/yiisoft/yii2-queue/src/cli/Command.php on line 185'

in /var/www/test/vendor/symfony/process/Process.php:140

sanwv • Mar 27 '20 09:03

$ttr passed at https://github.com/yiisoft/yii2-queue/blob/master/src/cli/Command.php#L185 is a string while it should be int. Any idea why?

samdark • Mar 28 '20 23:03

The main point is that retrying a job and killing the job on timeout both happen at the same moment (after ttr seconds). The two events are independent across processes: the retry may run before the kill, or the kill before the retry.

If the retry happens before the job is killed, then within milliseconds the retried job may read an empty message from $queue$.messages and break:

  • The timed-out job ID is re-pushed from *.reserved to *.waiting (in worker A).
  • A few milliseconds after the re-push, the job is killed and delete is invoked, so the job data in *.messages is deleted (in worker B).
  • The job ID is popped from *.waiting and execution continues by fetching the data from *.messages, but the data is gone.
  • $ttr and $message are null and are passed to the Symfony Process, which then gets blamed for the error.
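
The interleaving above can be replayed with plain PHP arrays standing in for the Redis keys. This is only a sketch: the arrays, the fixed clock, and the "ttr;payload" string layout are assumptions made for illustration, not the driver's actual code.

```php
<?php
// In-memory stand-ins for the Redis keys (assumed shapes, not the real driver).
$now = 1000;                               // fixed clock so the run is repeatable
$ttr = 5;
$messages = [1 => "$ttr;serialized-job"];  // *.messages: id => "ttr;payload"
$reserved = [1 => $now + $ttr];            // *.reserved: id => expiry score
$waiting  = [];                            // *.waiting: list of ids

$now += $ttr; // the job has run for ttr seconds and is still reserved

// Worker A: re-push every reserved id whose score has passed.
foreach ($reserved as $id => $expiresAt) {
    if ($expiresAt <= $now) {
        unset($reserved[$id]);
        $waiting[] = $id;
    }
}

// Worker B: the timed-out job is killed and its data deleted moments later.
unset($reserved[1], $messages[1]);

// Worker C: pops the id from the waiting list and looks up its payload.
$id = array_shift($waiting);
$payload = $messages[$id] ?? null;

var_dump($id);      // int(1)
var_dump($payload); // NULL
```

The ID survives in the waiting list while its payload is gone, which is the state the last two bullets describe.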

sanwv • Mar 29 '20 02:03

This is more likely to happen with a large number of workers and job execution times longer than ttr.

In fact, there is no way to ensure that a job always finishes within ttr seconds.

sanwv • Mar 29 '20 03:03

How would you solve it, @sanwv?

samdark • Mar 30 '20 10:03

$this->redis->zadd("$this->channel.reserved", time() + $ttr + 60, $id);

or

'queue' => [
    'ttr' => 86400 * 30,
],

It is a bit of a hack, just a temporary workaround.
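
A rough sketch of why the extra 60 seconds helps, using fixed timestamps instead of time(). The names $grace and $isExpired are hypothetical, and the "score <= now" expiry rule is an assumption about how expired entries are selected.

```php
<?php
// Hypothetical illustration with a fixed clock; not the driver's code.
$reservedAt = 1000;
$ttr = 5;
$grace = 60; // assumed name for the extra 60 seconds in the workaround

$defaultScore = $reservedAt + $ttr;          // score without the workaround
$paddedScore  = $reservedAt + $ttr + $grace; // score with the workaround

$killTime = $reservedAt + $ttr; // moment the timed-out job is killed

// Assumption: an entry is re-pushed once its score is <= the current time.
$isExpired = fn (int $score, int $now): bool => $score <= $now;

var_dump($isExpired($defaultScore, $killTime)); // bool(true)
var_dump($isExpired($paddedScore, $killTime));  // bool(false)
```

With the padded score the entry is still reserved when the job is killed, so delete runs first. The cost is that a job whose worker really crashed waits an extra 60 seconds before it is retried.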

sanwv • Mar 31 '20 03:03

Got the same problem; waiting for an official solution.

Roland-Zhu • Mar 31 '20 03:03

$ttr passed at https://github.com/yiisoft/yii2-queue/blob/master/src/cli/Command.php#L185 is a string while it should be int. Any idea why?

When the ID is re-pushed to the waiting list (line 137) before the message is deleted (line 182), the worker gets the ID (line 145) but loses the payload (line 151), so $payload=NULL and $ttr="".

This causes a fatal error, and the worker process exits.

  • https://github.com/yiisoft/yii2-queue/blob/master/src/drivers/redis/Queue.php#L137
  • https://github.com/yiisoft/yii2-queue/blob/master/src/drivers/redis/Queue.php#L182
  • https://github.com/yiisoft/yii2-queue/blob/master/src/drivers/redis/Queue.php#L145
  • https://github.com/yiisoft/yii2-queue/blob/master/src/drivers/redis/Queue.php#L151

I'm not sure whether other causes also contribute to this issue, such as the Redis LRU eviction policy, or the fact that lines 145 and 151 are not atomic.
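
A minimal sketch of how a missing payload could be detected before anything is passed on to Symfony Process. parsePayload is a hypothetical helper, and the "ttr;message" layout is an assumption about how the payload is stored; this is not the library's fix.

```php
<?php
// Hypothetical helper; assumes payloads are stored as "ttr;message".
function parsePayload(?string $payload): ?array
{
    if ($payload === null || $payload === '') {
        return null; // the message was already deleted by another worker
    }
    // Split once on the first ';' and pad so both parts always exist.
    [$ttr, $message] = array_pad(explode(';', $payload, 2), 2, null);

    return [(int) $ttr, $message];
}

$missing = parsePayload(null);               // payload lost in the race
$present = parsePayload('5;serialized-job'); // normal case

var_dump($missing); // NULL
var_dump($present); // [5, "serialized-job"]
```

A caller could skip the ID when the helper returns null instead of running the job with an empty $ttr. That only hides the symptom; the re-push and the delete still race.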

sanwv • Apr 25 '22 12:04

Related: #218, #312. When there are many workers and a job runs longer than ttr with no retry configured, handleMessage (line 61) returns true and the message is deleted. Deleting the message and moveExpired (line 137) run simultaneously in two processes; when moveExpired runs before the delete, this issue occurs.

sanwv • Apr 25 '22 14:04