citus icon indicating copy to clipboard operation
citus copied to clipboard

Intra-node isolation anomaly

Open michailtoksovo opened this issue 1 year ago • 2 comments

Simple ACID-test (sum calculation under pgbench), PostgreSQL 15.1/Citus 12.1

  1. Prepare data:
CREATE TABLE table_a 
( id bigint not null
, shard_n int not null
, value int not null default 0
, primary key(shard_n, id)
);

SELECT create_distributed_table('table_a', 'shard_n', shard_count := 2);

-- even to shard #1, odd to shard #2
INSERT INTO table_a (id, shard_n)
SELECT i, 2 - i % 2 FROM generate_series(1, 1024) c(i);

SELECT shard_name, nodename, nodeport, citus_table_type FROM citus_shards;
--    shard_name   | nodename  | nodeport | citus_table_type 
-- ----------------+-----------+----------+------------------
--  table_a_102008 | 127.0.0.1 |     5433 | distributed
--  table_a_102009 | 127.0.0.1 |     5433 | distributed
  1. Run pgbench:
pgbench -c 4 -T 100 -h 127.0.0.1 -p 5433 -f update.sql

update.sql:

\set id random(1, 512)
BEGIN;
-- update #1: random row from table_a_102008 (node 1)
UPDATE table_a SET value = value + 1 WHERE id = (2 * :id - 1) and shard_n = 1;
-- update #2: random row from table_a_102009 (node 1)
UPDATE table_a SET value = value - 1 WHERE id = 2 * :id and shard_n = 2;
COMMIT;
  1. Check sum under pgbench (should always be 0) - always failed
SELECT sum(value) FROM table_a; 

michailtoksovo avatar Nov 28 '23 07:11 michailtoksovo

pgbench -c 4 -T 100 -h 127.0.0.1 -p 5433 -f update.sql

I think I know the problem here. You distribute the table against the column 'shard_n'. (row goes to node1 if it is 1, to node2 if it is 2)

Assumption is that we increment a row and decrement another in each transaction, but this is wrong. id check in WHERE clause could cause no row is updated. Not all ids (random generated) exist in each node.

To fix the problem, you can change the insert query like below (to make sure all id in range (1,1024) exists in both shards):

INSERT INTO table_a (id, shard_n)
SELECT i, 1 FROM generate_series(1, 1024) c(i);

INSERT INTO table_a (id, shard_n)
SELECT i, 2 FROM generate_series(1, 1024) c(i);

aykut-bozkurt avatar Nov 30 '23 08:11 aykut-bozkurt

The problem is lack of distributed snapshot isolation. A basic solution is proposed in https://github.com/citusdata/citus/pull/6489

marcoslot avatar Nov 30 '23 09:11 marcoslot