citus icon indicating copy to clipboard operation
citus copied to clipboard

Foreign keys between reference tables have locking bugs

Open JelteF opened this issue 3 years ago • 2 comments

I ran the isolation_ref2ref_foreign_keys test 500 times on my dev machine and 7 failures happened. 5 of these failures were because a query that was supposed to be blocked wasn't actually blocked. 2 others were because less or more locks were held than expected when looking at the pg_locks view.

diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:12:26.459528512 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:12:26.489528512 +0200
@@ -199,21 +199,21 @@
 
 step s1-view-locks: 
     SELECT mode, count(*)
     FROM pg_locks
     WHERE locktype='advisory'
     GROUP BY mode
     ORDER BY 1, 2;
 
 mode         |count
 -------------+-----
-ExclusiveLock|    3
+ExclusiveLock|    1
 ShareLock    |    1
 (2 rows)
 
 step s1-rollback: 
     ROLLBACK;
 
 step s2-rollback: 
     ROLLBACK;
 
 step s1-view-locks: 
diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:14:24.299528295 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:14:24.309528295 +0200
@@ -796,25 +796,24 @@
 step s1-update-table-3: 
     UPDATE ref_table_3 SET id = 2 WHERE value = 1 RETURNING id;
 
 id
 --
  2
 (1 row)
 
 step s2-insert-table-2: 
     INSERT INTO ref_table_2 VALUES (7, 5);
- <waiting ...>
+
 step s1-commit: 
     COMMIT;
 
-step s2-insert-table-2: <... completed>
 step s2-commit: 
     COMMIT;
 
 step s1-select-table-3: 
     SELECT * FROM ref_table_3 ORDER BY id, value;
 
 id|value
 --+-----
  2|    1
  3|    3
diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:15:26.809528180 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:15:26.829528180 +0200
@@ -461,26 +461,25 @@
  BEGIN;
 
 step s2-begin: 
  BEGIN;
 
 step s2-delete-table-1: 
     DELETE FROM ref_table_1 WHERE id = 1;
 
 step s1-insert-table-2: 
     INSERT INTO ref_table_2 VALUES (7, 2);
- <waiting ...>
+
+ERROR:  insert or update on table "ref_table_2_114918" violates foreign key constraint "ref_table_2_value_fkey_114918"
 step s2-commit: 
     COMMIT;
 
-step s1-insert-table-2: <... completed>
-ERROR:  insert or update on table "ref_table_2_xxxxxxx" violates foreign key constraint "ref_table_2_value_fkey_xxxxxxx"
 step s1-commit: 
     COMMIT;
 
 step s1-select-table-2: 
     SELECT * FROM ref_table_2 ORDER BY id, value;
 
 id|value
 --+-----
  3|    3
  5|    5
diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:18:24.609527853 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:18:24.629527853 +0200
@@ -693,25 +693,24 @@
  BEGIN;
 
 step s2-begin: 
  BEGIN;
 
 step s2-insert-table-1: 
     INSERT INTO ref_table_1 VALUES (7, 7);
 
 step s1-update-table-2: 
     UPDATE ref_table_2 SET id = 0 WHERE value = 2;
- <waiting ...>
+
 step s2-commit: 
     COMMIT;
 
-step s1-update-table-2: <... completed>
 step s1-commit: 
     COMMIT;
 
 step s1-select-table-3: 
     SELECT * FROM ref_table_3 ORDER BY id, value;
 
 id|value
 --+-----
  1|    1
  3|    3
diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:27:26.709526856 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:27:26.739526856 +0200
@@ -368,38 +368,37 @@
  BEGIN;
 
 step s2-begin: 
  BEGIN;
 
 step s2-update-table-1: 
     UPDATE ref_table_1 SET id = 2 WHERE id = 1;
 
 step s1-insert-table-2: 
     INSERT INTO ref_table_2 VALUES (7, 2);
- <waiting ...>
+
+ERROR:  insert or update on table "ref_table_2_129660" violates foreign key constraint "ref_table_2_value_fkey_129660"
 step s2-commit: 
     COMMIT;
 
-step s1-insert-table-2: <... completed>
 step s1-commit: 
     COMMIT;
 
 step s1-select-table-2: 
     SELECT * FROM ref_table_2 ORDER BY id, value;
 
 id|value
 --+-----
  1|    2
  3|    3
  5|    5
- 7|    2
-(4 rows)
+(3 rows)
 
 
 starting permutation: s1-begin s2-begin s2-update-table-1 s1-update-table-2 s2-commit s1-commit s1-select-table-2
 step s1-begin: 
  BEGIN;
 
 step s2-begin: 
  BEGIN;
 
 step s2-update-table-1: 
diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:32:25.139526307 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:32:25.149526307 +0200
@@ -662,25 +662,24 @@
 step s1-update-table-3: 
     UPDATE ref_table_3 SET id = 2 WHERE value = 1 RETURNING id;
 
 id
 --
  2
 (1 row)
 
 step s2-insert-table-1: 
     INSERT INTO ref_table_1 VALUES (7, 7);
- <waiting ...>
+
 step s1-commit: 
     COMMIT;
 
-step s2-insert-table-1: <... completed>
 step s2-commit: 
     COMMIT;
 
 step s1-select-table-3: 
     SELECT * FROM ref_table_3 ORDER BY id, value;
 
 id|value
 --+-----
  2|    1
  3|    3
diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:42:08.689525233 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 15:42:08.729525233 +0200
@@ -288,21 +288,22 @@
 
 step s1-view-locks: 
     SELECT mode, count(*)
     FROM pg_locks
     WHERE locktype='advisory'
     GROUP BY mode
     ORDER BY 1, 2;
 
 mode                    |count
 ------------------------+-----
-(0 rows)
+ShareUpdateExclusiveLock|    1
+(1 row)
 
 
 starting permutation: s2-begin s2-insert-table-3 s1-view-locks s2-rollback s1-view-locks
 step s2-begin: 
  BEGIN;
 
 step s2-insert-table-3: 
     INSERT INTO ref_table_3 VALUES (7, 5);
 
 step s1-view-locks: 

JelteF avatar Aug 18 '22 13:08 JelteF

I found two other missing RowExclusiveLock locks with more information, because of #6197:

diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 16:37:12.359519155 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified	2022-08-18 16:37:12.389519155 +0200
@@ -450,25 +450,23 @@
         regexp_replace(query, E'[\\n\\r\\u2028]+', ' ', 'g' ) query
     FROM pg_locks l
     JOIN pg_stat_activity a
     ON l.pid = a.pid
     WHERE locktype='advisory'
         AND application_name <> 'Citus Maintenance Daemon'
     ORDER BY 1, 2, 3, 4;
 
 classid|  objid|objsubid|mode            |application_name                           |backend_type  |query                                       
 -------+-------+--------+----------------+-------------------------------------------+--------------+--------------------------------------------
-      0|8429800|       5|RowExclusiveLock|isolation/isolation_ref2ref_foreign_keys/s2|client backend|     INSERT INTO ref_table_3 VALUES (7, 5);
-      0|8429801|       5|RowExclusiveLock|isolation/isolation_ref2ref_foreign_keys/s2|client backend|     INSERT INTO ref_table_3 VALUES (7, 5);
       0|8429802|       4|ShareLock       |isolation/isolation_ref2ref_foreign_keys/s2|client backend|     INSERT INTO ref_table_3 VALUES (7, 5); 
       0|8429802|       5|RowExclusiveLock|isolation/isolation_ref2ref_foreign_keys/s2|client backend|     INSERT INTO ref_table_3 VALUES (7, 5); 
-(4 rows)
+(2 rows)
 
 step s2-rollback: 
     ROLLBACK;
 
 step s1-view-locks: 
     SELECT classid,
         objid,
         objsubid,
         mode,
         application_name,

It could be this is the same underlying cause as most of the other locking issues.

JelteF avatar Aug 18 '22 14:08 JelteF

@onderkalaci suggested that the reason for this is probably that we don't invalidate the foreign key graph correctly, like described in https://github.com/citusdata/citus/issues/4676

Relevant code: https://github.com/citusdata/citus/blob/3bf3d9c19b8631c6848f0c52865d1dad0e8693c4/src/backend/distributed/metadata/metadata_cache.c#L4120-L4148

JelteF avatar Aug 22 '22 07:08 JelteF