db-sync
db-sync copied to clipboard
Order of table inserts incorrect (foreign keys)
OK so I have a database with some PK -> FK constraints and my workflow is:
- Load the db schema in postgres
- Load some fixtures in postgres
- Init from DB
- Push to mergin cloud
- Pull to QGIS Desktop (using a separate working dir to the one db sync uses)
- Make some edits (add a plant type, add a vegetation point) in QGIS
- Sync from QGIS Desktop to mergin cloud
- Use mergin-db-sync to try to pull the changes back from cloud to my pg database
This process fails because it tries to insert the vegetation point record before it has created the needed plant type record.
Here is the log of my process:
# Just a clean up in case we had the db already
# NOTE: Also remove the GPKG from the mergin cloud instance and commit the change
PGSERVICE=osgs.kartoza.com psql -c "drop database farming;" postgres
# Grab the schema
wget -O farming.sql https://raw.githubusercontent.com/kartoza/smallfarming-gis/main/sql/farming.sql
# Load the schema
PGSERVICE=osgs.kartoza.com psql -c "create database farming;" postgres
PGSERVICE=osgs.kartoza.com psql -f farming.sql farming
# Grab the fixtures
wget -O fixtures.sql https://raw.githubusercontent.com/kartoza/smallfarming-gis/main/sql/fixtures.sql
# Load the fixtures
PGSERVICE=osgs.kartoza.com psql -f fixtures.sql farming
# Verify db state (list schemas, list tables)
PGSERVICE=osgs.kartoza.com psql -c "\dn" farming
PGSERVICE=osgs.kartoza.com psql -c "\d" farming
# This is mergin-db-sync's folder, it will get recreated in the line after
rm -rf /home/timlinux/gisdata/MerginSyncProjects/farming/
python dbsync.py init-from-db
rm -rf ~/Syncthing/QGISProjects/MerginMapsProjects/farming
# Now do steps 5,6,7 in my QGIS Desktop, checking out the project to the folder above
python dbsync.py pull
Here is the log from the last (step 8 above) pull command:
timlinux crest ../mergin-db-sync master □ python dbsync.py pull
== Starting Mergin Maps DB Sync version 1.1.2 ==
Logging in to Mergin Maps...
Pulling...
Processing Mergin Maps project 'timlinux/farming'
DB Changes:
electricity_line_condition_type 0 2 0
electricity_line_type 0 5 0
month 0 12 0
plant_growth_activity_type 0 6 0
plant_usage 0 3 0
water_line_type 0 2 0
water_point_type 0 4 0
water_polygon_type 0 4 0
water_source 0 3 0
Pulled new version from Mergin Maps: v7
Mergin Maps Changes:
plant_type 1 0 0
vegetation_point 2 0 0
Applying new version [WITH rebase]
Debug: rebase info (base2their / old)
TABLE plant_type
inserted 1,
deleted --none --
updated --none --
TABLE vegetation_point
inserted 1,2,
deleted --none --
updated --none --
Debug: mapping
--none --
Debug: No conflicts present
Error: rebase-db failed!
GEODIFF: Error: postgres cmd error: ERROR: insert or update on table "vegetation_point" violates foreign key constraint "vegetation_point_plant_type_uuid_fkey"
DETAIL: Key (plant_type_uuid)=(0b25842b-ec57-479b-993f-5a64b0ba6347) is not present in table "plant_type".
SQL:
INSERT INTO "public"."vegetation_point" ("id", "uuid", "last_update", "last_update_by", "notes", "image", "estimated_crown_radius_m", "estimated_planting_year", "estimated_height_m", "geometry", "plant_type_uuid") VALUES (2, 'fce59771-f33a-4311-ae2a-35c7cc1e4224', '2023-03-28T23:16:07.248', '[email protected]', NULL, NULL, NULL, NULL, NULL, ST_GeomFromWKB('\x0101000000EC68098D0ACF1DC0396D2117CBB44340', 4326), '0b25842b-ec57-479b-993f-5a64b0ba6347')
Error: Unable to perform GEODIFF_applyChangeset modified2final
Error: geodiff failed!
['geodiff', 'rebase-db', '--driver', 'postgres', 'host=osgs.kartoza.com dbname=farming user=docker password=xxxxx sslmode=require', '--skip-tables', 'spatial_ref_sys', 'mergin_base_do_not_touch', 'public', '/tmp/farming-dbsync-pull-base2their', '/tmp/farming-dbsync-pull-conflicts']
Ordering of table inserts/updates/deletes to avoid this would be probably very tricky, fortunately there is an alternative solution: constraints such as foreign keys can be deferred, so they are not evaluated immediately after each insert/update/delete, but only at the end of the transaction.
There are three valid options for constraints:
NOT DEFERRABLE- if it can not be deferred at all (the default)DEFERRABLE INITIALLY IMMEDIATE- it can be deferred, but it has to be explicitly enabled in transactionDEFERRABLE INITIALLY DEFERRED- it can be deferred, and it is set as deferred initially in transactions
So the SQL creating tables can be modified from this:
infrastructure_type_uuid UUID NOT NULL REFERENCES infrastructure_type(uuid)
to something like this:
infrastructure_type_uuid UUID NOT NULL REFERENCES infrastructure_type(uuid) DEFERRABLE INITIALLY DEFERRED
Then the problem should go away. Foreign key checks are still enabled, but only evaluated at the end of transaction when everything should be in order.
Things worth improving in geodiff / db-sync on this matter:
- detect if there are any foreign keys that are not deferrable
- support also
DEFERRABLE INITIALLY IMMEDIATEby runningSET CONSTRAINTS ALL DEFERRED;at the start of a transaction
Useful reference: https://emmer.dev/blog/deferrable-constraints-in-postgresql/
Ah this is super helpful advice, thanks @wonder-sk !