citus
citus copied to clipboard
[BUG]alter_old_partitions_set_access_method deadlock
testdb=# CALL alter_old_partitions_set_access_method(
't_test_base_info', '2023-09-01 00:00:00',
'columnar'
);
NOTICE: converting t_test_base_info_p2023_08_30 with start time 2023-08-30 00:00:00 and end time 2023-08-31 00:00:00
NOTICE: creating a new table for public.t_test_base_info_p2023_08_30
NOTICE: moving the data of public.t_test_base_info_p2023_08_30
ERROR: canceling the transaction since it was involved in a distributed deadlock
CONTEXT: SQL statement "ALTER TABLE IF EXISTS public.t_test_base_info DETACH PARTITION public.t_test_base_info_p2023_08_30;"
SQL statement "SELECT pg_catalog.alter_table_set_access_method(r.partition, new_access_method)"
PL/pgSQL function alter_old_partitions_set_access_method(regclass,timestamp with time zone,name) line 27 at PERFORM
https://github.com/citusdata/citus/issues/6609 like this
citus 11.3 postgres 15.2
I have encountered a very similar problem on a single-node instance of Citus. I attempted to convert a heap storage partition to columnar storage with the alter_table_set_access()
command while my application continued to insert data into the same partitioned table concurrently. As far as I can tell, the application was inserting into a different partition than the one I was attempting to convert. Once the operation began, the application was blocked from continuing to insert data into the table. After 10+ minutes of high disk I/O, the result of the operation was the following deadlock error:
ERROR: deadlock detected
DETAIL: Process 949404 waits for AccessExclusiveLock on relation 127868 of database 16384; blocked by process 896852.
Process 896582 waits for RowExclusiveLock on relation 151949 of database 16384; blocked by process 949404.
HINT: See server log for query details.
CONTEXT: SQL statement "ALTER TABLE IF EXISTS records DETACH PARTITION records_p2023_11_29;"
I shut down the application so that no concurrent insert operations were occurring, and re-attempted the conversion to columnar storage. This time, the operation was successful.
I read the PostgreSQL documentation for the DETACH PARTITION command and it shows that there is a CONCURRENTLY
parameter. Could Citus potentially use this parameter to ensure that conversion to columnar storage can happen concurrently with other ongoing database operations?
I am running Citus 12.0.0.