yugabyte-db icon indicating copy to clipboard operation
yugabyte-db copied to clipboard

[YSQL] FK sometimes fail on batch insert

Open eladshamai opened this issue 1 year ago • 17 comments

Jira Link: DB-9854

Description

Hi guys, I think I found an issue while inserting multiple records. The insertion may throw a foreign key constraint violation.

Conditions to replicate:

  1. Batch insertion
  2. Some records need to FK to each other in the batch (Unique Index FK)
  3. Some other records need to FK to already inserted records (Unique Index FK)
  4. All records also need to FK to another table records

Example:

CREATE TABLE IF NOT EXISTS workplaces (
    name TEXT NOT NULL,
    CONSTRAINT workplaces_pk PRIMARY KEY (name)
);

CREATE TABLE IF NOT EXISTS employees (
    workplace TEXT NOT NULL,
    id UUID NOT NULL,
    name TEXT NOT NULL,
    manager_id UUID,
    CONSTRAINT employees_pk PRIMARY KEY (id)
);

CREATE UNIQUE INDEX employees_workplace_and_id_idx ON employees (workplace, id);

ALTER TABLE employees ADD CONSTRAINT employees_managers_fk
    FOREIGN KEY (workplace, manager_id)
    REFERENCES employees (workplace, id);

ALTER TABLE employees ADD CONSTRAINT employees_workplaces_fk
    FOREIGN KEY (workplace)
    REFERENCES workplaces (name);
INSERT INTO workplaces (name) VALUES ('MyCorp');

INSERT INTO employees
    (workplace, id, name, manager_id)
    VALUES
        ('MyCorp', '00112233-4455-6677-8899-aaaaaaaaaaaa', 'Elad', NULL);

INSERT INTO employees
    (workplace, id, name, manager_id)
    VALUES
        ('MyCorp', '00112233-4455-6677-8899-bbbbbbbbbbbb', 'Aviel', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
        ('MyCorp', '00112233-4455-6677-8899-cccccccccccc', 'Someone', '00112233-4455-6677-8899-bbbbbbbbbbbb'),
        ('MyCorp', '00112233-4455-6677-8899-dddddddddddd', 'MrBug', '00112233-4455-6677-8899-cccccccccccc');
INSERT INTO employees
    (workplace, id, name, manager_id)
    VALUES
        ('MyCorp', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb', 'Aviel 2', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
        ('MyCorp', '00112233-ffff-ffff-ffff-cccccccccccc', 'Someone 2', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb'),
        ('MyCorp', '00112233-ffff-ffff-ffff-dddddddddddd', 'MrBug 2', '00112233-ffff-ffff-ffff-cccccccccccc');
Error invoking remote method 'DB_EXECUTE_CANCELLABLE_QUERY': error: insert or update on table "employees" violates foreign key constraint "employees_managers_fk"

On Postgres it always works.

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • [X] I confirm this issue does not contain any sensitive information.

eladshamai avatar Jan 31 '24 03:01 eladshamai

@eladshamai

  • What version of yugabytedb are you using?
  • What version of PostgreSQL?
  • Did you set any custom configuration?

ddorian avatar Jan 31 '24 09:01 ddorian

@eladshamai

  • What version of yugabytedb are you using?
  • What version of PostgreSQL?
  • Did you set any custom configuration?

YugabyteDB 2.20.1.3 PostgreSQL 14.9 No custom configuration

eladshamai avatar Jan 31 '24 13:01 eladshamai

@eladshamai

Is it possible to get more info on how to replicate it? Maybe a small script? I'm trying to replicate but cannot.

ddorian avatar Feb 05 '24 11:02 ddorian

@eladshamai

Is it possible to get more info on how to replicate it? Maybe a small script? I'm trying to replicate but cannot.

Try the example I wrote below the description. If you find it hard to replicate, try executing those lines as fast as possible. My suspicion is that YugaByteDB does some Asyc stuff after committing a transaction. If you wait a while, the last batch in the example most likely to succeed. So try to run the last batch immediately.

eladshamai avatar Feb 12 '24 12:02 eladshamai

Try the example I wrote below the description.

I did.

If you find it hard to replicate, try executing those lines as fast as possible.

Please explain how many sessions you're running and where you commit. It's hard to translate as is.

ddorian avatar Feb 12 '24 12:02 ddorian

Please explain how many sessions you're running and where you commit. It's hard to translate as is.

Single Session.

Replicated on:

  1. NodeJS (node postgres)
  2. Sqlectron
  3. DBeaver

Image 12-02-2024 at 14 38

eladshamai avatar Feb 12 '24 12:02 eladshamai

And if it doesn't fail? Do you just truncate and try again?

ddorian avatar Feb 12 '24 12:02 ddorian

After hitting Retry couple of times:

Image 12-02-2024 at 14 42 (1)

eladshamai avatar Feb 12 '24 12:02 eladshamai

And if it doesn't fail? Do you just truncate and try again?

It mostly fail, after the 1st try, but yes. I drop both of the tables and do it all over again.

eladshamai avatar Feb 12 '24 12:02 eladshamai

Please write in pseudocode or something cause just showing a screenshot when the error happens doesn't help.

ddorian avatar Feb 12 '24 12:02 ddorian

Please write in pseudocode or something cause just showing a screenshot when the error happens doesn't help.

There are 3 SQL sections in the example. 3 queries to replicate.

  1. Run the 1st query to build the Schema.
  2. Run the 2nd query to insert the first batch.
  3. Run the 3rd query to insert the second batch.
  4. If it fails (almost every try) 4.1. Then, hit retry couple of times until it succeed. 4.2. Otherwise, drop everything and start all over again.

eladshamai avatar Feb 12 '24 13:02 eladshamai

@eladshamai

See this python script that doesn't error for me. Can you make it fail?:

def db_error_check():
    import psycopg2

    # Create the database connection.

    connString = "host=127.0.1.1 port=5433 dbname=yugabyte user=yugabyte password=yugabyte load_balance=True"

    conn = psycopg2.connect(connString)

    # Open a cursor to perform database operations.
    # The default mode for psycopg2 is "autocommit=false".

    conn.set_session(autocommit=True)
    cur = conn.cursor()

    # Create the table. (It might preexist.)

    cur.execute("""
    drop table IF EXISTS employees,workplaces cascade;
    """)

    cur.execute("""
            CREATE TABLE IF NOT EXISTS workplaces (
        name TEXT NOT NULL,
        CONSTRAINT workplaces_pk PRIMARY KEY (name)
    );

    CREATE TABLE IF NOT EXISTS employees (
        workplace TEXT NOT NULL,
        id UUID NOT NULL,
        name TEXT NOT NULL,
        manager_id UUID,
        CONSTRAINT employees_pk PRIMARY KEY (id)
    );

    CREATE UNIQUE INDEX IF NOT EXISTS employees_workplace_and_id_idx ON employees (workplace, id);

    ALTER TABLE employees ADD CONSTRAINT employees_managers_fk
        FOREIGN KEY (workplace, manager_id)
        REFERENCES employees (workplace, id);

    ALTER TABLE employees ADD CONSTRAINT employees_workplaces_fk
        FOREIGN KEY (workplace)
        REFERENCES workplaces (name);"""
                )
    for i in range(50):

        cur.execute("""
        INSERT INTO workplaces (name) VALUES ('MyCorp');
        
        INSERT INTO employees
            (workplace, id, name, manager_id)
            VALUES
                ('MyCorp', '00112233-4455-6677-8899-aaaaaaaaaaaa', 'Elad', NULL);
        
        INSERT INTO employees
            (workplace, id, name, manager_id)
            VALUES
                ('MyCorp', '00112233-4455-6677-8899-bbbbbbbbbbbb', 'Aviel', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
                ('MyCorp', '00112233-4455-6677-8899-cccccccccccc', 'Someone', '00112233-4455-6677-8899-bbbbbbbbbbbb'),
                ('MyCorp', '00112233-4455-6677-8899-dddddddddddd', 'MrBug', '00112233-4455-6677-8899-cccccccccccc');        
        """)

        cur.execute("commit;")
        cur.execute("TRUNCATE employees,workplaces CASCADE;")
    exit("end")


db_error_check()

ddorian avatar Feb 12 '24 13:02 ddorian

@ddorian The bug won't be replicated because you didn't wrote the second insert batch.

Also, turn off the auto-commit. It should be False on any application. (You will get the error anyway, but probably not on the first loop iteration).

Here is the modified script: Note that I've changed a bit the connection string to make that runnable on my machine.

def db_error_check():
    import psycopg2

    # Create the database connection.

    connString = "host=127.0.0.1 port=5433 dbname=yugabyte user=yugabyte password=yugabyte"

    conn = psycopg2.connect(connString)

    # Open a cursor to perform database operations.
    # The default mode for psycopg2 is "autocommit=false".

    conn.set_session(autocommit=False)
    cur = conn.cursor()

    # Create the table. (It might preexist.)

    cur.execute("""
    drop table IF EXISTS employees,workplaces cascade;
    """)

    cur.execute("""
            CREATE TABLE IF NOT EXISTS workplaces (
        name TEXT NOT NULL,
        CONSTRAINT workplaces_pk PRIMARY KEY (name)
    );

    CREATE TABLE IF NOT EXISTS employees (
        workplace TEXT NOT NULL,
        id UUID NOT NULL,
        name TEXT NOT NULL,
        manager_id UUID,
        CONSTRAINT employees_pk PRIMARY KEY (id)
    );

    CREATE UNIQUE INDEX IF NOT EXISTS employees_workplace_and_id_idx ON employees (workplace, id);

    ALTER TABLE employees ADD CONSTRAINT employees_managers_fk
        FOREIGN KEY (workplace, manager_id)
        REFERENCES employees (workplace, id);

    ALTER TABLE employees ADD CONSTRAINT employees_workplaces_fk
        FOREIGN KEY (workplace)
        REFERENCES workplaces (name);"""
                )
    for i in range(50):
        print(i)
        cur.execute("""
        INSERT INTO workplaces (name) VALUES ('MyCorp');
        
        INSERT INTO employees
            (workplace, id, name, manager_id)
            VALUES
                ('MyCorp', '00112233-4455-6677-8899-aaaaaaaaaaaa', 'Elad', NULL);
        
        INSERT INTO employees
            (workplace, id, name, manager_id)
            VALUES
                ('MyCorp', '00112233-4455-6677-8899-bbbbbbbbbbbb', 'Aviel', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
                ('MyCorp', '00112233-4455-6677-8899-cccccccccccc', 'Someone', '00112233-4455-6677-8899-bbbbbbbbbbbb'),
                ('MyCorp', '00112233-4455-6677-8899-dddddddddddd', 'MrBug', '00112233-4455-6677-8899-cccccccccccc');        
        """)

        cur.execute("commit;")

        # Second Batch
        cur.execute("""
            INSERT INTO employees
            (workplace, id, name, manager_id)
            VALUES
                ('MyCorp', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb', 'Aviel 2', '00112233-4455-6677-8899-aaaaaaaaaaaa'),
                ('MyCorp', '00112233-ffff-ffff-ffff-cccccccccccc', 'Someone 2', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb'),
                ('MyCorp', '00112233-ffff-ffff-ffff-dddddddddddd', 'MrBug 2', '00112233-ffff-ffff-ffff-cccccccccccc');
        """)
        cur.execute("commit;")

        cur.execute("TRUNCATE employees,workplaces CASCADE;")
    exit("end")


db_error_check()

eladshamai avatar Feb 12 '24 15:02 eladshamai

Smaller script:

def db_error_check():
    import psycopg2

    # Create the database connection.
    # yugabyte
    connString = "host=127.0.1.1 port=5433 dbname=yugabyte user=yugabyte password=yugabyte"
    # postgresql
    # connString = "host=127.0.0.1 port=5432 dbname=yugabyte user=postgres password=1"

    conn = psycopg2.connect(connString)

    # Open a cursor to perform database operations.
    # The default mode for psycopg2 is "autocommit=false".

    conn.set_session(autocommit=False)
    cur = conn.cursor()

    # Create the table. (It might preexist.)

    cur.execute("""
    drop table IF EXISTS employees,workplaces cascade;
    """)

    cur.execute("""
    CREATE TABLE IF NOT EXISTS workplaces (
        name TEXT NOT NULL,
        CONSTRAINT workplaces_pk PRIMARY KEY (name)
    );

    CREATE TABLE IF NOT EXISTS employees (
        workplace TEXT NOT NULL,
        id UUID NOT NULL,
        name TEXT NOT NULL,
        manager_id UUID,
        CONSTRAINT employees_pk PRIMARY KEY (id)
    );

    CREATE UNIQUE INDEX IF NOT EXISTS employees_workplace_and_id_idx ON employees (workplace, id);

    ALTER TABLE employees ADD CONSTRAINT employees_managers_fk
        FOREIGN KEY (workplace, manager_id)
        REFERENCES employees (workplace, id);

    ALTER TABLE employees ADD CONSTRAINT employees_workplaces_fk
        FOREIGN KEY (workplace)
        REFERENCES workplaces (name);"""
                )

    for i in range(1):
        print(i)
        cur.execute("""
        INSERT INTO workplaces (name) VALUES ('MyCorp');
        """)

        cur.execute("commit;")
        # cur.execute("begin transaction isolation level read committed;")
        # Second Batch
        cur.execute("""
            INSERT INTO employees
            (workplace, id, name, manager_id)
            VALUES
                ('MyCorp', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb', 'Aviel 2', NULL),
                ('MyCorp', '00112233-ffff-ffff-ffff-cccccccccccc', 'Someone 2', '00112233-ffff-ffff-ffff-bbbbbbbbbbbb'),
                ('MyCorp', '00112233-ffff-ffff-ffff-dddddddddddd', 'MrBug 2', '00112233-ffff-ffff-ffff-cccccccccccc');
        """)
        cur.execute("commit;")

        cur.execute("TRUNCATE employees,workplaces CASCADE;")
    exit("end")


db_error_check()

And output:

  File ysql.py", line 88, in db_error_check
    cur.execute("""
psycopg2.errors.ForeignKeyViolation: insert or update on table "employees" violates foreign key constraint "employees_managers_fk"
DETAIL:  Key (workplace, manager_id)=(MyCorp, 00112233-ffff-ffff-ffff-cccccccccccc) is not present in table "employees".

ddorian avatar Feb 12 '24 16:02 ddorian

@ddorian Exactly. UUID 00112233-ffff-ffff-ffff-cccccccccccc should be present in table "employees", because it belongs to 'Someone 2' (the previous one). On Postgres there are no errors.

eladshamai avatar Feb 12 '24 16:02 eladshamai

@eladshamai the bug has been identified and is being worked on.

ddorian avatar Feb 27 '24 07:02 ddorian

Thx a lot @eladshamai for reporting this issue with a nice/simple test case! Nice find.

kmuthukk avatar Feb 27 '24 17:02 kmuthukk

Thx a lot @eladshamai for reporting this issue with a nice/simple test case! Nice find.

Of course! Thank you for making a great distributed RDBMS to the open source community.

eladshamai avatar Mar 01 '24 02:03 eladshamai