Distributed deadlock after issuing a simple `UPDATE` statement
Hello, we have been running into a puzzling situation with distributed deadlocks, and after confirmation from multiple sources that this shouldn't be occurring in our case, I have decided to file this bug report.
The situation currently is that a fairly simple query can sporadically trigger deadlocks on one of our tables, resulting in two possible error messages: "canceling the transaction since it was involved in a distributed deadlock" and "deadlock detected".
The problem first surfaced when issuing an UPDATE (see below for details) in an explicit transaction, followed by a SELECT that reads from a different table. But after trying to work around the issue, we found that it can be triggered even by a single UPDATE statement in an implicit transaction, and also in a transaction after issuing SET LOCAL citus.multi_shard_modify_mode TO 'sequential'.
We managed to trigger this using different clients as well (our application, which uses libpq under the hood, and DBeaver).
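For clarity, a minimal sketch of two of the transaction shapes described above (the timestamp and id values are placeholders; table1 and offending_column are as in the schema shown below):

-- 1) Implicit transaction: the UPDATE is its own transaction.
UPDATE table1 SET offending_column = '2024-01-31T00:00:00Z' WHERE id = 1;

-- 2) Explicit transaction with sequential shard modification, which still deadlocks.
BEGIN;
SET LOCAL citus.multi_shard_modify_mode TO 'sequential';
UPDATE table1 SET offending_column = '2024-01-31T00:00:00Z' WHERE id = 1;
COMMIT;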
Citus: 12.1.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
PostgreSQL: 16.1 (installed as 16.1, not upgraded from 16.0) + PostGIS 3.4.1
OS: Ubuntu 22.04.3 LTS
Installed packages:
postgis-doc/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 all [installed,automatic]
postgis/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 amd64 [installed]
postgresql-16-citus-12.1/jammy,now 12.1.1.citus-1 amd64 [installed]
postgresql-16-pg-checksums/jammy-pgdg,now 1.1-6.pgdg22.04+1 amd64 [installed]
postgresql-16-postgis-3-scripts/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 all [installed,automatic]
postgresql-16-postgis-3/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 amd64 [installed,automatic]
postgresql-16/jammy-pgdg,now 16.1-1.pgdg22.04+1 amd64 [installed,automatic]
postgresql-client-16/jammy-pgdg,now 16.1-1.pgdg22.04+1 amd64 [installed,automatic]
postgresql-client-common/jammy-pgdg,now 256.pgdg22.04+1 all [installed,automatic]
postgresql-common/jammy-pgdg,now 256.pgdg22.04+1 all [installed,automatic]
postgresql-postgis-scripts/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 all [installed,automatic]
postgresql-postgis/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 amd64 [installed,automatic]
Setup: 4 servers:
- 1 coordinator
- 2 workers (each in a separate worker group)
- 1 HAProxy (used to load balance over the two workers; we are using the query-from-any-node feature for our workload)
This is more or less an initial testing cluster, so standby replicas are currently not involved.
Note that the names of tables/columns/database/users have been adjusted, but the originals do NOT contain anything that should cause issues (like very long strings).
The query in our normal workload runs in parallel periodically for each row (so each row is updated roughly every 5 seconds), but every invocation is scoped down by the primary key of the row, and the updates for a given PK are never run at the same time. Currently the table is very small (20 rows), and we managed to reproduce this even by running the query manually without any other workload touching that table.
Query triggering the deadlock:
UPDATE table1 SET offending_column = '2024-01-31T00:00:00Z' where id = ANY-VALID-ID;
Schema of the offending table:
CREATE TABLE public.table1 (
id int8 NOT NULL GENERATED ALWAYS AS IDENTITY,
column1 int2 NOT NULL,
column2 int2 NOT NULL,
offending_column timestamptz NULL,
column3 uuid NULL,
column4 uuid NULL,
column5 int2 NOT NULL DEFAULT 0,
column6 int8 NOT NULL,
column7 int8 NULL,
column8 int2 NOT NULL DEFAULT 0,
column9 text NOT NULL DEFAULT '',
column10 int8 NOT NULL,
CONSTRAINT table1_pk PRIMARY KEY (id),
CONSTRAINT table1_table2_fk FOREIGN KEY (column6) REFERENCES public.table2(id),
CONSTRAINT table1_table3_fk FOREIGN KEY (column10) REFERENCES public.table3(id)
);
SELECT create_reference_table('public.table1');
PostgreSQL log section with deadlocks occurring:
2024-02-01 08:34:09.679 UTC [1912956] LOG: [2024-02-01 08:34:09.679735+00] Distributed deadlock found among the following distributed transactions:
2024-02-01 08:34:09.679 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.680 UTC [1912956] LOG: [2024-02-01 08:34:09.680026+00] [DistributedTransactionId: (1, 339134, 2024-02-01 08:34:09.056811+00)] = [WaitsFor transaction numbers: 448301][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 08:34:09.680 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.680 UTC [1912956] LOG: [2024-02-01 08:34:09.680304+00] [DistributedTransactionId: (2, 448301, 2024-02-01 08:34:09.063957+00)] = [WaitsFor transaction numbers: 339134]
2024-02-01 08:34:09.680 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.680 UTC [1912956] LOG: [2024-02-01 08:34:09.680336+00] Cancelling the following backend to resolve distributed deadlock (transaction number = 339134, pid = 2013523)
2024-02-01 08:34:09.680 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.680 UTC [2013523] dbowner@db ERROR: canceling the transaction since it was involved in a distributed deadlock
2024-02-01 08:34:09.680 UTC [2013523] dbowner@db STATEMENT: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 08:34:09.782 UTC [1912956] LOG: [2024-02-01 08:34:09.782398+00] Distributed deadlock found among the following distributed transactions:
2024-02-01 08:34:09.782 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.782 UTC [1912956] LOG: [2024-02-01 08:34:09.782631+00] [DistributedTransactionId: (1, 339135, 2024-02-01 08:34:09.200594+00)] = [WaitsFor transaction numbers: 448302][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 08:34:09.782 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.782 UTC [1912956] LOG: [2024-02-01 08:34:09.782896+00] [DistributedTransactionId: (2, 448302, 2024-02-01 08:34:09.239911+00)] = [WaitsFor transaction numbers: 339135,339137]
2024-02-01 08:34:09.782 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.783 UTC [1912956] LOG: [2024-02-01 08:34:09.783086+00] [DistributedTransactionId: (1, 339137, 2024-02-01 08:34:09.556843+00)] = [WaitsFor transaction numbers: 339135][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 08:34:09.783 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.783 UTC [1912956] LOG: [2024-02-01 08:34:09.783116+00] Cancelling the following backend to resolve distributed deadlock (transaction number = 339137, pid = 2013871)
2024-02-01 08:34:09.783 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.783 UTC [2013871] dbowner@db ERROR: canceling the transaction since it was involved in a distributed deadlock
2024-02-01 08:34:09.783 UTC [2013871] dbowner@db STATEMENT: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 08:34:09.796 UTC [2013865] dbowner@db LOG: duration: 595.865 ms execute s613888: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 08:34:09.796 UTC [2013865] dbowner@db DETAIL: parameters: $1 = '2024-02-01 08:34:09.099239+00', $2 = '562949953421325'
We are not experiencing any issues with other tables (60+ tables, both reference and distributed) and everything else in the cluster seems to work correctly. I understand that the info provided is very likely not enough to find the issue and would appreciate guidance on collecting more relevant data. It has been recommended to use pg-lock-tracer, so we will attempt to use that to hopefully gain more insight.
Do you have a log containing the "deadlock detected" error too? AFAIK that's an error coming from Postgres itself, which would indicate that it's actually not just a distributed deadlock, but a deadlock on the node itself (which we sometimes detect as a distributed deadlock if our distributed deadlock detection triggers faster than Postgres's local deadlock detection).
It's definitely possible for a single UPDATE statement to cause deadlocks on one node, due to locking rows in different orders: https://stackoverflow.com/a/10246052/2570866
But it indeed sounds like that wouldn't apply to the situation you're describing. If I understand correctly, you are not updating multiple rows with one statement?
In general, to get to the bottom of these issues it's important to know which exact queries deadlock on each other and why. One way to do that is by disabling the deadlock detection and looking at the queries that block each other, but that is likely a bad idea on a production system. If you can somehow reproduce the issue in a testing environment, that would definitely be useful.
I can confirm that we are indeed updating a single row with a single statement.
This is the log from worker 1 containing "deadlock detected", which mentions this occurring on worker 2 (log below):
2024-02-01 11:06:00.057 UTC [2013870] dbowner@db LOG: duration: 1002.934 ms execute s708573: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.057 UTC [2013870] dbowner@db DETAIL: parameters: $1 = '2024-02-01 11:05:59.051657+00', $2 = '562949953421320'
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db ERROR: deadlock detected
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db DETAIL: Process 2108878 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108878.
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db HINT: See server log for query details.
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db CONTEXT: while executing command on (WORKER2-IP):5432
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db STATEMENT: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db ERROR: deadlock detected
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db DETAIL: Process 2108421 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108421.
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db HINT: See server log for query details.
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db CONTEXT: while executing command on (WORKER2-IP):5432
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db STATEMENT: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db ERROR: deadlock detected
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db DETAIL: Process 2108875 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108875.
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db HINT: See server log for query details.
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db CONTEXT: while executing command on (WORKER2-IP):5432
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db STATEMENT: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.707 UTC [1912956] LOG: [2024-02-01 11:06:00.707694+00] Distributed deadlock found among the following distributed transactions:
2024-02-01 11:06:00.707 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.707 UTC [1912956] LOG: [2024-02-01 11:06:00.707935+00] [DistributedTransactionId: (1, 362459, 2024-02-01 11:05:59.200145+00)] = [WaitsFor transaction numbers: 471676,471677,471678,471679][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.707 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.708 UTC [1912956] LOG: [2024-02-01 11:06:00.708194+00] [DistributedTransactionId: (2, 471679, 2024-02-01 11:05:59.980572+00)] = [WaitsFor transaction numbers: 471676,471677,471678]
2024-02-01 11:06:00.708 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.708 UTC [1912956] LOG: [2024-02-01 11:06:00.708447+00] [DistributedTransactionId: (2, 471678, 2024-02-01 11:05:59.930214+00)] = [WaitsFor transaction numbers: 471676,471677]
2024-02-01 11:06:00.708 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.708 UTC [1912956] LOG: [2024-02-01 11:06:00.708698+00] [DistributedTransactionId: (2, 471677, 2024-02-01 11:05:59.891461+00)] = [WaitsFor transaction numbers: 471676]
2024-02-01 11:06:00.708 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.708 UTC [1912956] LOG: [2024-02-01 11:06:00.708948+00] [DistributedTransactionId: (2, 471676, 2024-02-01 11:05:59.815823+00)] = [WaitsFor transaction numbers: 362459,362463]
2024-02-01 11:06:00.708 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1912956] LOG: [2024-02-01 11:06:00.709128+00] [DistributedTransactionId: (1, 362463, 2024-02-01 11:05:59.71205+00)] = [WaitsFor transaction numbers: 362459][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.709 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1912956] LOG: [2024-02-01 11:06:00.709161+00] Cancelling the following backend to resolve distributed deadlock (transaction number = 362463, pid = 2014278)
2024-02-01 11:06:00.709 UTC [1912956] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [2014278] dbowner@db ERROR: canceling the transaction since it was involved in a distributed deadlock
2024-02-01 11:06:00.709 UTC [2014278] dbowner@db STATEMENT: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
Worker 2 log:
2024-02-01 11:06:00.052 UTC [2036734] dbowner@db ERROR: deadlock detected
2024-02-01 11:06:00.052 UTC [2036734] dbowner@db DETAIL: Process 2036734 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108890.
Process 2108890 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2036734.
Process 2036734: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
Process 2108890: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
2024-02-01 11:06:00.052 UTC [2036734] dbowner@db HINT: See server log for query details.
2024-02-01 11:06:00.052 UTC [2036734] dbowner@db STATEMENT: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.052 UTC [2108890] dbowner@db LOG: duration: 1000.283 ms statement: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
2024-02-01 11:06:00.058 UTC [2108872] dbowner@db LOG: duration: 860.117 ms statement: SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.230 UTC [2108878] dbowner@db ERROR: deadlock detected
2024-02-01 11:06:00.230 UTC [2108878] dbowner@db DETAIL: Process 2108878 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108878.
Process 2108878: SELECT lock_shard_resources(7, ARRAY[102074])
Process 2108872: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
Process 2099179: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.230 UTC [2108878] dbowner@db HINT: See server log for query details.
2024-02-01 11:06:00.230 UTC [2108878] dbowner@db STATEMENT: SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.363 UTC [2108421] dbowner@db ERROR: deadlock detected
2024-02-01 11:06:00.363 UTC [2108421] dbowner@db DETAIL: Process 2108421 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108421.
Process 2108421: SELECT lock_shard_resources(7, ARRAY[102074])
Process 2108872: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
Process 2099179: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.363 UTC [2108421] dbowner@db HINT: See server log for query details.
2024-02-01 11:06:00.363 UTC [2108421] dbowner@db STATEMENT: SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.555 UTC [2108875] dbowner@db ERROR: deadlock detected
2024-02-01 11:06:00.555 UTC [2108875] dbowner@db DETAIL: Process 2108875 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108875.
Process 2108875: SELECT lock_shard_resources(7, ARRAY[102074])
Process 2108872: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
Process 2099179: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.555 UTC [2108875] dbowner@db HINT: See server log for query details.
2024-02-01 11:06:00.555 UTC [2108875] dbowner@db STATEMENT: SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.708 UTC [1931701] LOG: [2024-02-01 11:06:00.708931+00] Distributed deadlock found among the following distributed transactions:
2024-02-01 11:06:00.708 UTC [1931701] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1931701] LOG: [2024-02-01 11:06:00.709062+00] [DistributedTransactionId: (2, 471678, 2024-02-01 11:05:59.930214+00)] = [WaitsFor transaction numbers: 471676,471677][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1931701] LOG: [2024-02-01 11:06:00.709148+00] [DistributedTransactionId: (2, 471677, 2024-02-01 11:05:59.891461+00)] = [WaitsFor transaction numbers: 471676][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1931701] LOG: [2024-02-01 11:06:00.709388+00] [DistributedTransactionId: (2, 471676, 2024-02-01 11:05:59.815823+00)] = [WaitsFor transaction numbers: 362459,362463][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [2108874] dbowner@db ERROR: canceling statement due to user request
2024-02-01 11:06:00.709 UTC [2108874] dbowner@db STATEMENT: SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.709 UTC [1931701] LOG: [2024-02-01 11:06:00.70964+00] [DistributedTransactionId: (1, 362463, 2024-02-01 11:05:59.71205+00)] = [WaitsFor transaction numbers: 362459]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1931701] LOG: [2024-02-01 11:06:00.709894+00] [DistributedTransactionId: (1, 362459, 2024-02-01 11:05:59.200145+00)] = [WaitsFor transaction numbers: 471676,471677,471678,471679]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.710 UTC [1931701] LOG: [2024-02-01 11:06:00.709997+00] [DistributedTransactionId: (2, 471679, 2024-02-01 11:05:59.980572+00)] = [WaitsFor transaction numbers: 471676,471677,471678][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.710 UTC [1931701] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.710 UTC [1931701] LOG: [2024-02-01 11:06:00.710028+00] Cancelling the following backend to resolve distributed deadlock (transaction number = 471679, pid = 2036746)
2024-02-01 11:06:00.710 UTC [1931701] CONTEXT: Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.710 UTC [2036746] dbowner@db ERROR: canceling the transaction since it was involved in a distributed deadlock
2024-02-01 11:06:00.710 UTC [2036746] dbowner@db STATEMENT: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
As for disabling deadlock detection, it should be possible to do on the current cluster, but we will have to schedule a time window for that, which is likely to take a few days.
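A minimal sketch of what such a debugging window could look like (the values are only examples, and the assumption is that these settings can be reloaded without a restart and would need to be applied on the coordinator as well as the workers):

-- Disable Citus' distributed deadlock detection and push out PostgreSQL's
-- local deadlock check, so the blocking queries stay visible for inspection.
ALTER SYSTEM SET citus.distributed_deadlock_detection_factor = -1;
ALTER SYSTEM SET deadlock_timeout = '1h';
SELECT pg_reload_conf();

-- Revert once the blocking queries have been captured.
ALTER SYSTEM RESET citus.distributed_deadlock_detection_factor;
ALTER SYSTEM RESET deadlock_timeout;
SELECT pg_reload_conf();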
Hmm, okay the exclusive lock on the advisory lock suggests that this is related to the way we serialize updates to reference tables. @agedemenli can you take a closer look?
create table table2 (id int primary key);
select create_reference_table('table2');
alter table table2 add constraint self_fkey foreign key (id) references table2(id);
insert into table2 values (1);
create table table3 (id int primary key);
SELECT create_reference_table('table3');
insert into table3 values (1);
CREATE TABLE public.table1 (
id int8 NOT NULL GENERATED ALWAYS AS IDENTITY,
column1 int2 NOT NULL,
column2 int2 NOT NULL,
offending_column timestamptz NULL,
column3 uuid NULL,
column4 uuid NULL,
column5 int2 NOT NULL DEFAULT 0,
column6 int8 NOT NULL,
column7 int8 NULL,
column8 int2 NOT NULL DEFAULT 0,
column9 text NOT NULL DEFAULT '',
column10 int8 NOT NULL,
CONSTRAINT table1_pk PRIMARY KEY (id),
CONSTRAINT table1_table2_fk FOREIGN KEY (column6) REFERENCES public.table2(id),
CONSTRAINT table1_table3_fk FOREIGN KEY (column10) REFERENCES public.table3(id)
);
SELECT create_reference_table('public.table1');
insert into table1 values (default,1,1,null,'1740a945-d9a1-4bf9-ba48-50620b5109ed','1740a945-d9a1-4bf9-ba48-50620b5109ed',1,1,1,1,'str',1);
I'll run this insert statement frequently for a while to see if I can reproduce it
After disabling deadlock detection, we can observe that there are indeed several transactions that are completely stuck. All of them are running the exact same UPDATE (with different values). Below is the output from pg_locks on both workers (worker 1 first, then worker 2), filtered to include only locks whose PID matches one of the stuck queries.
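For reference, a sketch of the kind of query that produces such a listing (the filter on the stuck backends is an assumption; one could equally filter on an explicit list of PIDs):

-- List all locks held or awaited by backends running the offending UPDATE.
SELECT l.*
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE a.query LIKE 'UPDATE "table1"%';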
locktype |database|relation|page|tuple|virtualxid|transactionid|classid|objid |objsubid|virtualtransaction|pid |mode |granted|fastpath|waitstart |
----------+--------+--------+----+-----+----------+-------------+-------+------+--------+------------------+-------+----------------+-------+--------+-----------------------------+
relation | 16388| 16518| | | | | | | |48/211263 |2466353|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |48/211263 |2466353|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |48/211263 |2466353|RowExclusiveLock|true |true | |
virtualxid| | | | |48/211263 | | | | |48/211263 |2466353|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |62/75245 |2466365|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |62/75245 |2466365|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |62/75245 |2466365|RowExclusiveLock|true |true | |
virtualxid| | | | |62/75245 | | | | |62/75245 |2466365|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |50/283761 |2466356|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |50/283761 |2466356|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |50/283761 |2466356|RowExclusiveLock|true |true | |
virtualxid| | | | |50/283761 | | | | |50/283761 |2466356|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |39/144041 |2466348|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |39/144041 |2466348|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |39/144041 |2466348|RowExclusiveLock|true |true | |
virtualxid| | | | |39/144041 | | | | |39/144041 |2466348|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |68/68793 |2466369|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |68/68793 |2466369|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |68/68793 |2466369|RowExclusiveLock|true |true | |
virtualxid| | | | |68/68793 | | | | |68/68793 |2466369|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |58/91978 |2466358|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |58/91978 |2466358|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |58/91978 |2466358|RowExclusiveLock|true |true | |
virtualxid| | | | |58/91978 | | | | |58/91978 |2466358|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |47/431707 |2466352|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |47/431707 |2466352|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |47/431707 |2466352|RowExclusiveLock|true |true | |
virtualxid| | | | |47/431707 | | | | |47/431707 |2466352|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |77/5719 |2466373|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |77/5719 |2466373|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |77/5719 |2466373|RowExclusiveLock|true |true | |
virtualxid| | | | |77/5719 | | | | |77/5719 |2466373|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |40/362508 |2466349|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |40/362508 |2466349|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |40/362508 |2466349|RowExclusiveLock|true |true | |
virtualxid| | | | |40/362508 | | | | |40/362508 |2466349|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |63/336885 |2466366|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |63/336885 |2466366|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |63/336885 |2466366|RowExclusiveLock|true |true | |
virtualxid| | | | |63/336885 | | | | |63/336885 |2466366|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |72/42189 |2466370|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |72/42189 |2466370|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |72/42189 |2466370|RowExclusiveLock|true |true | |
virtualxid| | | | |72/42189 | | | | |72/42189 |2466370|ExclusiveLock |true |true | |
advisory | 16388| | | | | | 0|102074| 4|40/362508 |2466349|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102008| 5|68/68793 |2466369|ExclusiveLock |false |false |2024-02-05 10:32:37.854 +0100|
advisory | 16388| | | | | | 0|102745| 5|39/144041 |2466348|ExclusiveLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|58/91978 |2466358|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102008| 5|47/431707 |2466352|ExclusiveLock |false |false |2024-02-05 10:32:27.837 +0100|
advisory | 16388| | | | | | 0|102008| 5|48/211263 |2466353|ExclusiveLock |false |false |2024-02-05 10:33:04.702 +0100|
advisory | 16388| | | | | | 0|102074| 4|50/283761 |2466356|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|63/336885 |2466366|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102008| 5|39/144041 |2466348|ExclusiveLock |true |false | |
advisory | 16388| | | | | | 0|102008| 5|72/42189 |2466370|ExclusiveLock |false |false |2024-02-05 10:32:32.847 +0100|
advisory | 16388| | | | | | 0|102008| 5|77/5719 |2466373|ExclusiveLock |false |false |2024-02-05 10:32:22.824 +0100|
advisory | 16388| | | | | | 0|102074| 4|62/75245 |2466365|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102008| 5|62/75245 |2466365|ExclusiveLock |false |false |2024-02-05 10:32:47.873 +0100|
advisory | 16388| | | | | | 0|102074| 4|77/5719 |2466373|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 5|39/144041 |2466348|ExclusiveLock |false |false |2024-02-05 10:33:02.948 +0100|
advisory | 16388| | | | | | 0|102074| 4|72/42189 |2466370|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102747| 5|39/144041 |2466348|ExclusiveLock |true |false | |
advisory | 16388| | | | | | 0|102008| 5|63/336885 |2466366|ExclusiveLock |false |false |2024-02-05 10:32:42.862 +0100|
advisory | 16388| | | | | | 0|102308| 5|39/144041 |2466348|ExclusiveLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|39/144041 |2466348|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|47/431707 |2466352|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|48/211263 |2466353|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102008| 5|50/283761 |2466356|ExclusiveLock |false |false |2024-02-05 10:32:57.900 +0100|
advisory | 16388| | | | | | 0|102008| 5|58/91978 |2466358|ExclusiveLock |false |false |2024-02-05 10:32:52.882 +0100|
advisory | 16388| | | | | | 0|102074| 4|68/68793 |2466369|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102008| 5|40/362508 |2466349|ExclusiveLock |false |false |2024-02-05 10:32:17.813 +0100|
locktype |database|relation|page|tuple|virtualxid|transactionid|classid|objid |objsubid|virtualtransaction|pid |mode |granted|fastpath|waitstart|
----------+--------+--------+----+-----+----------+-------------+-------+------+--------+------------------+-------+----------------+-------+--------+---------+
relation | 16388| 16518| | | | | | | |63/99368 |2412085|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |63/99368 |2412085|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |63/99368 |2412085|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |63/99368 |2412085|RowExclusiveLock|true |true | |
virtualxid| | | | |63/99368 | | | | |63/99368 |2412085|ExclusiveLock |true |true | |
relation | 16388| 16596| | | | | | | |53/135275 |2412077|AccessShareLock |true |true | |
relation | 16388| 16455| | | | | | | |53/135275 |2412077|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |53/135275 |2412077|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |53/135275 |2412077|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |53/135275 |2412077|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |53/135275 |2412077|RowExclusiveLock|true |true | |
virtualxid| | | | |53/135275 | | | | |53/135275 |2412077|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |56/340870 |2412080|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |56/340870 |2412080|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |56/340870 |2412080|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |56/340870 |2412080|RowExclusiveLock|true |true | |
virtualxid| | | | |56/340870 | | | | |56/340870 |2412080|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |60/198745 |2412083|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |60/198745 |2412083|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |60/198745 |2412083|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |60/198745 |2412083|RowExclusiveLock|true |true | |
virtualxid| | | | |60/198745 | | | | |60/198745 |2412083|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |65/111224 |2412086|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |65/111224 |2412086|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |65/111224 |2412086|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |65/111224 |2412086|RowExclusiveLock|true |true | |
virtualxid| | | | |65/111224 | | | | |65/111224 |2412086|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |68/81351 |2412087|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |68/81351 |2412087|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |68/81351 |2412087|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |68/81351 |2412087|RowExclusiveLock|true |true | |
virtualxid| | | | |68/81351 | | | | |68/81351 |2412087|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |44/136070 |2411900|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |44/136070 |2411900|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |44/136070 |2411900|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |44/136070 |2411900|RowExclusiveLock|true |true | |
virtualxid| | | | |44/136070 | | | | |44/136070 |2411900|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |54/346888 |2412078|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |54/346888 |2412078|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |54/346888 |2412078|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |54/346888 |2412078|RowExclusiveLock|true |true | |
virtualxid| | | | |54/346888 | | | | |54/346888 |2412078|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |58/108409 |2412081|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |58/108409 |2412081|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |58/108409 |2412081|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |58/108409 |2412081|RowExclusiveLock|true |true | |
virtualxid| | | | |58/108409 | | | | |58/108409 |2412081|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |37/207814 |2411853|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |37/207814 |2411853|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |37/207814 |2411853|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |37/207814 |2411853|RowExclusiveLock|true |true | |
virtualxid| | | | |37/207814 | | | | |37/207814 |2411853|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |59/106674 |2412082|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |59/106674 |2412082|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |59/106674 |2412082|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |59/106674 |2412082|RowExclusiveLock|true |true | |
virtualxid| | | | |59/106674 | | | | |59/106674 |2412082|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |69/71340 |2412088|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |69/71340 |2412088|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |69/71340 |2412088|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |69/71340 |2412088|RowExclusiveLock|true |true | |
virtualxid| | | | |69/71340 | | | | |69/71340 |2412088|ExclusiveLock |true |true | |
relation | 16388| 16518| | | | | | | |55/341694 |2412079|AccessShareLock |true |true | |
relation | 16388| 16518| | | | | | | |55/341694 |2412079|RowShareLock |true |true | |
relation | 16388| 18918| | | | | | | |55/341694 |2412079|AccessShareLock |true |true | |
relation | 16388| 18918| | | | | | | |55/341694 |2412079|RowExclusiveLock|true |true | |
virtualxid| | | | |55/341694 | | | | |55/341694 |2412079|ExclusiveLock |true |true | |
advisory | 16388| | | | | | 0|102074| 4|59/106674 |2412082|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|58/108409 |2412081|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|69/71340 |2412088|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|53/135275 |2412077|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|68/81351 |2412087|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|55/341694 |2412079|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|37/207814 |2411853|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|56/340870 |2412080|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|54/346888 |2412078|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|44/136070 |2411900|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|60/198745 |2412083|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|63/99368 |2412085|ShareLock |true |false | |
advisory | 16388| | | | | | 0|102074| 4|65/111224 |2412086|ShareLock |true |false | |
Thanks for the lock information!
I still can't reproduce the issue, and the best way to debug it is actually to reproduce it first. Any detail that helps with the reproduction would be appreciated. Could you please take a look at the schema snippet I shared above, just in case you have an improvement suggestion to make it work?
Unfortunately I don't have more that could help with reproduction right now, as I am unable to reproduce this on another cluster with the same schema either.
But I have looked into the locks that are leading to the deadlock, and one thing that seems very suspicious is the advisory ExclusiveLocks. Specifically, they lock not only the entire table2 (for which there is no point AFAIK, because we aren't touching that table in any way), but also other tables that have foreign keys to table2 (simple reference tables that represent enums, as mentioned on Slack). Assuming I understand these types of locks correctly, the simple UPDATE leads to table-level locks on unrelated tables, which allow only operations originating from that transaction (not taking into account that the locks are only advisory, so I'm not sure how Citus interacts with them).
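For what it's worth, a sketch of how the advisory lock rows can be mapped back to shards and tables (assuming, as the lock_shard_resources() calls in the logs suggest, that objid holds the shard id; run on the node holding the locks):

-- Map advisory lock entries in pg_locks to the Citus shard and table they cover.
SELECT l.pid, l.mode, l.granted,
       s.shardid, s.logicalrelid::regclass AS locked_table
FROM pg_locks l
JOIN pg_dist_shard s ON s.shardid = l.objid::bigint
WHERE l.locktype = 'advisory';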
Could you confirm that this is NOT the intended locking behavior? Is there a way to extract information from Citus that could explain why these locks are taken? Perhaps this information would allow us to better reproduce this issue.
It would be possible to grant access to the faulty cluster temporarily if it would help with the investigation of this issue. Let me know if this is something worth doing or if local reproduction is needed to continue.
@lkral-navmatix
Okay, we were able to reproduce locally and find a minimal reproducing example (see below; needs PG16 for load_balance_hosts in pgbench). So no need for access to your cluster. This is definitely a bug. We'll assign someone to fix it, but depending on the difficulty that might take a while. Luckily there is a pretty simple workaround.
How to work around it
Don't run this problematic update query through your load balancer, but only run it on a specific single node (the coordinator is probably easiest, but any one of the workers would be fine too).
How to reproduce
Create two files
issue-7477.sql:
create table table2 (id int primary key);
select create_reference_table('table2');
insert into table2 values (1);
CREATE TABLE public.table1 (
id int8 NOT NULL,
offending_column timestamptz NULL,
column6 int8 NOT NULL,
CONSTRAINT table1_table2_fk FOREIGN KEY (column6) REFERENCES public.table2(id)
);
SELECT create_reference_table('public.table1');
insert into table1 values (1,null,1);
update.sql:
UPDATE table1 SET offending_column = '2024-01-31T00:00:00Z' where id = 1;
Then run:
./citus_dev make test --destroy --init-with issue-7477.sql
pgbench -f update.sql -c 10 -j 2 -T 200 -P 1 'host=localhost,localhost port=9701,9702 load_balance_hosts=random'
Worker logs are then full of deadlock issues.
@JelteF
That is great to hear, good job!
We will verify whether the workaround indeed works for us and will await a future version with the fix.
Thank you for your cooperation.
I can confirm that we have not experienced any more deadlocks since applying the workaround.
Thank you.
The order in which the locks are acquired depends on which worker initiates the UPDATE statement (the first worker or one of the others), and that is what causes the deadlock.
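An illustrative sketch of that ordering problem, using plain transaction-level advisory locks rather than the actual Citus internals (the shard ids are taken from the logs above):

-- Session A (the UPDATE routed to one worker) acquires the shard locks in one order:
BEGIN;
SELECT pg_advisory_xact_lock(102008);
SELECT pg_advisory_xact_lock(102074);  -- blocks while session B holds it

-- Session B (the same UPDATE routed to another worker) acquires them in the opposite order:
BEGIN;
SELECT pg_advisory_xact_lock(102074);
SELECT pg_advisory_xact_lock(102008);  -- deadlock: each session waits on a lock the other holds

Each backend ends up waiting on a lock the other already holds, which matches the cycle reported in the worker logs.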