Bug: Race condition when datastore worker jobs process large files in quick succession during active usage
Version: 1.0.1 (tweaked version: 1.0.1-qgov.6), CKAN Version: 2.10.1
We have been seeing deadlocks on our latest XLoader deployment; recent changes appear to be the cause. When it happens, the three queries shown in the table below end up waiting on each other, which also causes a major CPU spike.
We have been able to recover by killing the deadlocked worker (or waiting for it to time out), or by using pgAdmin or a similar tool to kill the locking pid.
Any ideas on how to avoid deadlocking the database in the first place would be appreciated.
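For reference, the pgAdmin step above amounts to terminating the blocking backend. A minimal sketch, assuming psycopg2 and a role permitted to signal backends; the pid is a placeholder, not a prescribed value:

```python
# Minimal sketch: terminate the backend holding the lock, equivalent to the
# manual pgAdmin step described above. Substitute the blocking pid reported
# by pg_stat_activity / pg_blocking_pids().
import psycopg2

BLOCKING_PID = 16493  # placeholder value taken from the table below

conn = psycopg2.connect("dbname=user_datastore")  # adjust DSN/credentials as needed
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("SELECT pg_terminate_backend(%s)", (BLOCKING_PID,))
    print(cur.fetchone())  # (True,) if the backend was signalled
conn.close()
```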
| backend type | start | last update | pid | blocking pid | wait event | db | sql |
|---|---|---|---|---|---|---|---|
| client backend | 2023-11-30 23:13:57 UTC | 2023-11-30 23:13:57 UTC | 16502 | 16493 | Lock: relation | user_datastore | SELECT pg_indexes_size('775e9be7-eecd-47d6-b72f-18c3cebc137d') |
| client backend | 2023-11-30 23:13:55 UTC | 2023-11-30 23:13:57 UTC | 16493 | 14752 | Lock: relation | user_datastore | DROP TABLE "775e9be7-eecd-47d6-b72f-18c3cebc137d" CASCADE |
| client backend | 2023-11-30 23:09:17 UTC | 2023-11-30 23:13:56 UTC | 4752 | | Client: ClientRead | user_datastore | SELECT count(_id) FROM "775e9be7-eecd-47d6-b72f-18c3cebc137d" |
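A snapshot like the one above can be captured with something along these lines; this is an approximation built from `pg_stat_activity` and `pg_blocking_pids()`, not necessarily the exact query we used:

```python
# Approximate diagnostic query to capture a snapshot like the table above,
# using pg_stat_activity joined with pg_blocking_pids(). Column choices are
# illustrative.
import psycopg2

conn = psycopg2.connect("dbname=user_datastore")  # adjust DSN/credentials as needed
with conn.cursor() as cur:
    cur.execute("""
        SELECT backend_type,
               backend_start,
               state_change,
               pid,
               pg_blocking_pids(pid) AS blocking_pids,
               wait_event_type || ': ' || wait_event AS wait_event,
               datname,
               query
          FROM pg_stat_activity
         WHERE datname = 'user_datastore'
         ORDER BY backend_start
    """)
    for row in cur.fetchall():
        print(row)
conn.close()
```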
More information: https://stackoverflow.com/questions/32145189/avoid-exclusive-access-locks-on-referenced-tables-when-dropping-in-postgresql
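The approach suggested there is to make the DROP fail fast instead of letting it queue behind a long-running reader while blocking every new query on the table. A minimal sketch of that idea, assuming psycopg2; the table name, timeout, and retry policy below are illustrative and are not the actual xloader code:

```python
# Minimal sketch: set a short lock_timeout so DROP TABLE fails fast instead of
# queueing behind long-running readers (and blocking every new query on the
# table), then back off and retry. Illustrative only, not the xloader fix.
import time
import psycopg2
from psycopg2 import sql

RESOURCE_TABLE = "775e9be7-eecd-47d6-b72f-18c3cebc137d"  # example resource id

conn = psycopg2.connect("dbname=user_datastore")  # adjust DSN/credentials as needed
for attempt in range(5):
    try:
        with conn:  # one transaction per attempt; commits or rolls back
            with conn.cursor() as cur:
                cur.execute("SET LOCAL lock_timeout = '5s'")
                cur.execute(
                    sql.SQL("DROP TABLE IF EXISTS {} CASCADE")
                    .format(sql.Identifier(RESOURCE_TABLE))
                )
        break
    except psycopg2.errors.LockNotAvailable:
        time.sleep(2 ** attempt)  # give the blocking reader time to finish
conn.close()
```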
@ThrawnCA this can be closed when https://github.com/ckan/ckanext-xloader/commit/e6687a280aafc3c2fdba367fa04c290df3ac04dd is included in ckan/ckanext-xloader.