pgcopydb icon indicating copy to clipboard operation
pgcopydb copied to clipboard

Fix invalid UPDATE stmt when none of the columns change

Open arajkumar opened this issue 1 year ago • 4 comments

When all column values in the SET clause are equal to those in the WHERE clause, we remove all columns from the SET clause. This results in an invalid UPDATE statement like the one shown below:

UPDATE table SET WHERE "id" = 1;

Usually, the above could happen when REPLICA IDENTITY is set to FULL, and the UPDATE statement executed with the same values as the old ones. For e.g.

UPDATE table SET "id" = 1 WHERE "id" = 1;

Solution: Skip the UPDATE when all column values in SET clause are equal to the WHERE clause values.

arajkumar avatar Sep 18 '24 14:09 arajkumar

Hmmm, I think the alternative approach is the right one.

arajkumar avatar Sep 18 '24 14:09 arajkumar

Do we need to worry about triggers on the target side? UPDATE statement that look like they change nothing can be used to execute a trigger... on the replica system though, this would have to be a specifically crafted trigger that still is executed with session_replication_role set to replica.

dimitri avatar Sep 18 '24 14:09 dimitri

@dimitri, Should we really worry about replica triggers? Other option would be just eliminating the IDENTITY column and revert the duplication removal, as we already have the generated column infra.

arajkumar avatar Sep 18 '24 15:09 arajkumar

@dimitri In this PR, I'm skipping the UPDATE stmt when all columns in SET clause are duplicates of WHERE clause values.

Once we fix https://github.com/dimitri/pgcopydb/issues/844, we shall also remove the duplicate removal logic.

arajkumar avatar Sep 18 '24 17:09 arajkumar

@dimitri Thanks for reviewing this PR.

I think we should add some mention of what we're doing here in the docs.

I don't think we should add to the doc, This is just a big fix to avoid generation of invalid SQL UPDATE statement. Also, I believe we should just implement skipping identity columns being included as part of UPDATE's SET clause(https://github.com/dimitri/pgcopydb/issues/844) and remove the deduplication logic all together. WDYT?

arajkumar avatar Oct 04 '24 03:10 arajkumar

I don't think we should add to the doc, This is just a big fix to avoid generation of invalid SQL UPDATE statement.

As a change of behavior, I think you're right, no need to document the fix. That said, I think our overall approach should be documented. Saying that makes me realize it's not fair to add that burden to your PR (now merged), I will think more about what I think is needed and take care of it.

Also, I believe we should just implement skipping identity columns being included as part of UPDATE's SET clause(#844) and remove the deduplication logic all together. WDYT?

First part, agreed. Second part (remove the deduplication logic), I'm not sure I understand it. Can you provide a couple examples showing what we do, why it's wrong, and what you propose we should do instead?

dimitri avatar Oct 04 '24 10:10 dimitri