yugabyte-db
yugabyte-db copied to clipboard
[YSQL] FK sometimes fail on batch insert
Jira Link: DB-9854
Description
Hi guys, I think I found an issue while inserting multiple records. The insertion may throw a foreign key constraint violation.
Conditions to replicate:
- Batch insertion
- Some records need to FK to each other in the batch (Unique Index FK)
- Some other records need to FK to already inserted records (Unique Index FK)
- All records also need to FK to another table records
Example:
CREATE TABLE IF NOT EXISTS workplaces (
name TEXT NOT NULL,
CONSTRAINT workplaces_pk PRIMARY KEY (name)
);
CREATE TABLE IF NOT EXISTS employees (
workplace TEXT NOT NULL,
id UUID NOT NULL,
name TEXT NOT NULL,
manager_id UUID,
CONSTRAINT employees_pk PRIMARY KEY (id)
);
CREATE UNIQUE INDEX employees_workplace_and_id_idx ON employees (workplace, id);
ALTER TABLE employees ADD CONSTRAINT employees_managers_fk
FOREIGN KEY (workplace, manager_id)
REFERENCES employees (workplace, id);
ALTER TABLE employees ADD CONSTRAINT employees_workplaces_fk
FOREIGN KEY (workplace)
REFERENCES workplaces (name);
INSERT INTO workplaces (name) VALUES ('MyCorp');
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-4455-6677-8899-aaaaaaaaaaaa', 'Elad', NULL);
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-4455-6677-8899-bbbbbbbbbbbb', 'Aviel', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
('MyCorp', '00112233-4455-6677-8899-cccccccccccc', 'Someone', '00112233-4455-6677-8899-bbbbbbbbbbbb'),
('MyCorp', '00112233-4455-6677-8899-dddddddddddd', 'MrBug', '00112233-4455-6677-8899-cccccccccccc');
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb', 'Aviel 2', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
('MyCorp', '00112233-ffff-ffff-ffff-cccccccccccc', 'Someone 2', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb'),
('MyCorp', '00112233-ffff-ffff-ffff-dddddddddddd', 'MrBug 2', '00112233-ffff-ffff-ffff-cccccccccccc');
Error invoking remote method 'DB_EXECUTE_CANCELLABLE_QUERY': error: insert or update on table "employees" violates foreign key constraint "employees_managers_fk"
On Postgres it always works.
Issue Type
kind/bug
Warning: Please confirm that this issue does not contain any sensitive information
- [X] I confirm this issue does not contain any sensitive information.
@eladshamai
- What version of yugabytedb are you using?
- What version of PostgreSQL?
- Did you set any custom configuration?
@eladshamai
- What version of yugabytedb are you using?
- What version of PostgreSQL?
- Did you set any custom configuration?
YugabyteDB 2.20.1.3 PostgreSQL 14.9 No custom configuration
@eladshamai
Is it possible to get more info on how to replicate it? Maybe a small script? I'm trying to replicate but cannot.
@eladshamai
Is it possible to get more info on how to replicate it? Maybe a small script? I'm trying to replicate but cannot.
Try the example I wrote below the description. If you find it hard to replicate, try executing those lines as fast as possible. My suspicion is that YugaByteDB does some Asyc stuff after committing a transaction. If you wait a while, the last batch in the example most likely to succeed. So try to run the last batch immediately.
Try the example I wrote below the description.
I did.
If you find it hard to replicate, try executing those lines as fast as possible.
Please explain how many sessions you're running and where you commit. It's hard to translate as is.
Please explain how many sessions you're running and where you commit. It's hard to translate as is.
Single Session.
Replicated on:
- NodeJS (node postgres)
- Sqlectron
- DBeaver
And if it doesn't fail? Do you just truncate and try again?
After hitting Retry couple of times:
And if it doesn't fail? Do you just truncate and try again?
It mostly fail, after the 1st try, but yes. I drop both of the tables and do it all over again.
Please write in pseudocode or something cause just showing a screenshot when the error happens doesn't help.
Please write in pseudocode or something cause just showing a screenshot when the error happens doesn't help.
There are 3 SQL sections in the example. 3 queries to replicate.
- Run the 1st query to build the Schema.
- Run the 2nd query to insert the first batch.
- Run the 3rd query to insert the second batch.
- If it fails (almost every try) 4.1. Then, hit retry couple of times until it succeed. 4.2. Otherwise, drop everything and start all over again.
@eladshamai
See this python script that doesn't error for me. Can you make it fail?:
def db_error_check():
import psycopg2
# Create the database connection.
connString = "host=127.0.1.1 port=5433 dbname=yugabyte user=yugabyte password=yugabyte load_balance=True"
conn = psycopg2.connect(connString)
# Open a cursor to perform database operations.
# The default mode for psycopg2 is "autocommit=false".
conn.set_session(autocommit=True)
cur = conn.cursor()
# Create the table. (It might preexist.)
cur.execute("""
drop table IF EXISTS employees,workplaces cascade;
""")
cur.execute("""
CREATE TABLE IF NOT EXISTS workplaces (
name TEXT NOT NULL,
CONSTRAINT workplaces_pk PRIMARY KEY (name)
);
CREATE TABLE IF NOT EXISTS employees (
workplace TEXT NOT NULL,
id UUID NOT NULL,
name TEXT NOT NULL,
manager_id UUID,
CONSTRAINT employees_pk PRIMARY KEY (id)
);
CREATE UNIQUE INDEX IF NOT EXISTS employees_workplace_and_id_idx ON employees (workplace, id);
ALTER TABLE employees ADD CONSTRAINT employees_managers_fk
FOREIGN KEY (workplace, manager_id)
REFERENCES employees (workplace, id);
ALTER TABLE employees ADD CONSTRAINT employees_workplaces_fk
FOREIGN KEY (workplace)
REFERENCES workplaces (name);"""
)
for i in range(50):
cur.execute("""
INSERT INTO workplaces (name) VALUES ('MyCorp');
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-4455-6677-8899-aaaaaaaaaaaa', 'Elad', NULL);
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-4455-6677-8899-bbbbbbbbbbbb', 'Aviel', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
('MyCorp', '00112233-4455-6677-8899-cccccccccccc', 'Someone', '00112233-4455-6677-8899-bbbbbbbbbbbb'),
('MyCorp', '00112233-4455-6677-8899-dddddddddddd', 'MrBug', '00112233-4455-6677-8899-cccccccccccc');
""")
cur.execute("commit;")
cur.execute("TRUNCATE employees,workplaces CASCADE;")
exit("end")
db_error_check()
@ddorian The bug won't be replicated because you didn't wrote the second insert batch.
Also, turn off the auto-commit. It should be False on any application. (You will get the error anyway, but probably not on the first loop iteration).
Here is the modified script: Note that I've changed a bit the connection string to make that runnable on my machine.
def db_error_check():
import psycopg2
# Create the database connection.
connString = "host=127.0.0.1 port=5433 dbname=yugabyte user=yugabyte password=yugabyte"
conn = psycopg2.connect(connString)
# Open a cursor to perform database operations.
# The default mode for psycopg2 is "autocommit=false".
conn.set_session(autocommit=False)
cur = conn.cursor()
# Create the table. (It might preexist.)
cur.execute("""
drop table IF EXISTS employees,workplaces cascade;
""")
cur.execute("""
CREATE TABLE IF NOT EXISTS workplaces (
name TEXT NOT NULL,
CONSTRAINT workplaces_pk PRIMARY KEY (name)
);
CREATE TABLE IF NOT EXISTS employees (
workplace TEXT NOT NULL,
id UUID NOT NULL,
name TEXT NOT NULL,
manager_id UUID,
CONSTRAINT employees_pk PRIMARY KEY (id)
);
CREATE UNIQUE INDEX IF NOT EXISTS employees_workplace_and_id_idx ON employees (workplace, id);
ALTER TABLE employees ADD CONSTRAINT employees_managers_fk
FOREIGN KEY (workplace, manager_id)
REFERENCES employees (workplace, id);
ALTER TABLE employees ADD CONSTRAINT employees_workplaces_fk
FOREIGN KEY (workplace)
REFERENCES workplaces (name);"""
)
for i in range(50):
print(i)
cur.execute("""
INSERT INTO workplaces (name) VALUES ('MyCorp');
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-4455-6677-8899-aaaaaaaaaaaa', 'Elad', NULL);
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-4455-6677-8899-bbbbbbbbbbbb', 'Aviel', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
('MyCorp', '00112233-4455-6677-8899-cccccccccccc', 'Someone', '00112233-4455-6677-8899-bbbbbbbbbbbb'),
('MyCorp', '00112233-4455-6677-8899-dddddddddddd', 'MrBug', '00112233-4455-6677-8899-cccccccccccc');
""")
cur.execute("commit;")
# Second Batch
cur.execute("""
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb', 'Aviel 2', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
('MyCorp', '00112233-ffff-ffff-ffff-cccccccccccc', 'Someone 2', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb'),
('MyCorp', '00112233-ffff-ffff-ffff-dddddddddddd', 'MrBug 2', '00112233-ffff-ffff-ffff-cccccccccccc');
""")
cur.execute("commit;")
cur.execute("TRUNCATE employees,workplaces CASCADE;")
exit("end")
db_error_check()
Smaller script:
def db_error_check():
import psycopg2
# Create the database connection.
# yugabyte
connString = "host=127.0.1.1 port=5433 dbname=yugabyte user=yugabyte password=yugabyte"
# postgresql
# connString = "host=127.0.0.1 port=5432 dbname=yugabyte user=postgres password=1"
conn = psycopg2.connect(connString)
# Open a cursor to perform database operations.
# The default mode for psycopg2 is "autocommit=false".
conn.set_session(autocommit=False)
cur = conn.cursor()
# Create the table. (It might preexist.)
cur.execute("""
drop table IF EXISTS employees,workplaces cascade;
""")
cur.execute("""
CREATE TABLE IF NOT EXISTS workplaces (
name TEXT NOT NULL,
CONSTRAINT workplaces_pk PRIMARY KEY (name)
);
CREATE TABLE IF NOT EXISTS employees (
workplace TEXT NOT NULL,
id UUID NOT NULL,
name TEXT NOT NULL,
manager_id UUID,
CONSTRAINT employees_pk PRIMARY KEY (id)
);
CREATE UNIQUE INDEX IF NOT EXISTS employees_workplace_and_id_idx ON employees (workplace, id);
ALTER TABLE employees ADD CONSTRAINT employees_managers_fk
FOREIGN KEY (workplace, manager_id)
REFERENCES employees (workplace, id);
ALTER TABLE employees ADD CONSTRAINT employees_workplaces_fk
FOREIGN KEY (workplace)
REFERENCES workplaces (name);"""
)
for i in range(1):
print(i)
cur.execute("""
INSERT INTO workplaces (name) VALUES ('MyCorp');
""")
cur.execute("commit;")
# cur.execute("begin transaction isolation level read committed;")
# Second Batch
cur.execute("""
INSERT INTO employees
(workplace, id, name, manager_id)
VALUES
('MyCorp', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb', 'Aviel 2', NULL),
('MyCorp', '00112233-ffff-ffff-ffff-cccccccccccc', 'Someone 2', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb'),
('MyCorp', '00112233-ffff-ffff-ffff-dddddddddddd', 'MrBug 2', '00112233-ffff-ffff-ffff-cccccccccccc');
""")
cur.execute("commit;")
cur.execute("TRUNCATE employees,workplaces CASCADE;")
exit("end")
db_error_check()
And output:
File ysql.py", line 88, in db_error_check
cur.execute("""
psycopg2.errors.ForeignKeyViolation: insert or update on table "employees" violates foreign key constraint "employees_managers_fk"
DETAIL: Key (workplace, manager_id)=(MyCorp, 00112233-ffff-ffff-ffff-cccccccccccc) is not present in table "employees".
@ddorian Exactly. UUID 00112233-ffff-ffff-ffff-cccccccccccc should be present in table "employees", because it belongs to 'Someone 2' (the previous one). On Postgres there are no errors.
@eladshamai the bug has been identified and is being worked on.
Thx a lot @eladshamai for reporting this issue with a nice/simple test case! Nice find.
Thx a lot @eladshamai for reporting this issue with a nice/simple test case! Nice find.
Of course! Thank you for making a great distributed RDBMS to the open source community.