yii2-queue icon indicating copy to clipboard operation
yii2-queue copied to clipboard

Fix deadlock problem

Open mathematicalman opened this issue 4 years ago • 18 comments

Fix 'Has not waited the lock' error. Tested at very high loads.

Q A
Is bugfix? yes
New feature? no
Breaks BC? no
Tests pass? yes/no
Fixed issues #234

mathematicalman avatar Nov 29 '19 12:11 mathematicalman

Can you explain how this is related to #234?

rob006 avatar Dec 01 '19 11:12 rob006

Can you explain how this is related to #234?

"Has not waited the lock" problem appears there and there - https://github.com/yiisoft/yii2-queue/issues/245 . Wrapping release() in mutex lock and changing sql query of moveExpired() solve this problem.

mathematicalman avatar Dec 02 '19 09:12 mathematicalman

#234 is about reserving lock - adding additional lock on releasing will only make things worse (more occasions to acquire lock failure). And it is unclear what is the point of such lock since you just ignore failure and continue without lock. #245 was fixed by exact opposite change: https://github.com/yiisoft/yii2-queue/commit/eb8c7d4c635ae9da1c09e4fb3ae41452e33bed8b

rob006 avatar Dec 02 '19 14:12 rob006

#234 is about reserving lock - adding additional lock on releasing will only make things worse (more occasions to acquire lock failure). And it is unclear what is the point of such lock since you just ignore failure and continue without lock. #245 was fixed by exact opposite change: eb8c7d4

Mutex does not slow down much. #245 was closed but not fully fixed.

mathematicalman avatar Dec 02 '19 18:12 mathematicalman

i think the mutex part of your code is wrong, as it is now it does not block execution in case mutex is not aquired.

@see https://www.yiiframework.com/doc/api/2.0/yii-mutex-mutex

fl0v avatar May 27 '20 03:05 fl0v

@fl0v do you mean MySQL driver of mutex extension? If so, it should be fixed there instead of in queue, right?

samdark avatar May 27 '20 09:05 samdark

$mutex = $this->mutex->acquire(__CLASS__ . $this->channel, $this->mutexTimeout); should handle if $mutex is false

fl0v avatar May 27 '20 09:05 fl0v

@samdark this is still an issue, will you merge the PR into next release?

hunwalk avatar May 30 '22 11:05 hunwalk

@hunwalk solution provided isn't finished.

samdark avatar May 31 '22 15:05 samdark

We have that Deadlock found when trying to get lockfile over and over again, it would be nice to have that fixed. Or is there any option to fix that without this PR?

nadar avatar Jun 07 '22 06:06 nadar

@nadar does this PR fix it?

samdark avatar Jun 17 '22 17:06 samdark

I have not tested... but i assume @mathematicalman had that problem over and over again (like we have), so he made that PR and wrote Fix 'Has not waited the lock' error. **Tested at very high loads**. - so i assume this should fix that problem :-)

@samdark i have to admit, i think its not the same error, sorry i was not looking close enough - or at least i think its not the same! This is what we have, its about transaction deadlock

Error: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction
The SQL being executed was: UPDATE `queue` SET `reserved_at`=NULL WHERE `reserved_at` is not null and `reserved_at` < 1655449220 - `ttr` and `done_at` is null

nadar avatar Jun 17 '22 18:06 nadar

@nadar no, that shouldn't be the same thing.

samdark avatar Jun 20 '22 21:06 samdark

@samdark

Well, actually we have both problems :-) Just found out today:

2022-06-30 17:43:35 [pid: 1] - Worker is stopped (1:06:22)
Error: Has not waited the lock.

nadar avatar Jun 30 '22 15:06 nadar

So... does the change from this PR fix the problem?

samdark avatar Jul 01 '22 07:07 samdark

https://github.com/yiisoft/yii2-queue/pull/449#issuecomment-1172955086

gustavovendramini avatar Jul 06 '22 17:07 gustavovendramini

@samdark its "hard" to test that code from a fork via composer in production i have to admit. If it would be branch of yii2 queue it would be easy via dev-XYZ. Anyway. What @gustavovendramini mentioned points me into a interesting direction, because we have another queue connected to the same database on an "older" infrastructure which us running version 2.3.3 and we don't have any Error: Has not waited the lock. errors. So we will try to downgrade as well for the other workers.

nadar avatar Jul 11 '22 06:07 nadar

This problem still persists. I accidently switched to latest version and got may errors, now switched back to 2.3.3.

nadar avatar Sep 21 '22 15:09 nadar

So I just tested this PR on a real queue load. And unfortunately, it gets only worse. The application was unable to push any new jobs because the queue table is always locked. I have no idea, how and why it works, but adding mutexes on the release() method makes it impossible to insert/update the queue table. Some MySQL magic that I haven't studied yet :upside_down_face:

Also, I found a commit https://github.com/yiisoft/yii2-queue/commit/eb8c7d4c635ae9da1c09e4fb3ae41452e33bed8b which was fixing deadlocks somehow by introducing non-optimal query, which I fixed back (https://github.com/yiisoft/yii2-queue/pull/449) and returned deadlocks :upside_down_face: Even more magic.

erickskrauch avatar Oct 27 '22 00:10 erickskrauch

Closing because of such testing results. @erickskrauch thanks for testing it.

samdark avatar Nov 13 '22 18:11 samdark