pglogical
Sync is working with new data, but no existing data is coming over.
Setup:
- Provider: PG 9.4 with pglogical v2.2.2
- Subscriber: PG 12 with pglogical 2.3.0
Expected Behavior: Existing data would make its way over from the 9.4 db to the 12 db.
Actual Behavior: New inserts on the 9.4 db make it over, but no existing data is transferred (or none is being transferred as far as I can tell).
Instead of adding all tables, I dropped everything and only added a couple of smaller ones (1k rows rather than 1M) to test with. I can see updates trying to happen, and inserts do happen, but I cannot for the life of me get existing data transferred over.
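One thing worth double-checking (this is an assumption on my part, not something confirmed from your setup): pglogical.replication_set_add_table() has a synchronize_data parameter that defaults to false, so a table added without it will only stream new changes and will never copy its existing rows. Re-adding a table with that flag set would look roughly like this ('default' and 'public.table_name' are placeholders for your actual set and table names):

```sql
-- On the provider: re-add the table with initial data copy enabled.
SELECT pglogical.replication_set_remove_table('default', 'public.table_name');
SELECT pglogical.replication_set_add_table(
    set_name := 'default',
    relation := 'public.table_name',
    synchronize_data := true  -- defaults to false; without it no existing rows are copied
);
```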
When trying to force a sync using alter_subscription_resynchronize_table('subscription', 'table_name') on the subscriber, I see the following in the provider's logs:
ERROR: duplicate key value violates unique constraint "table_name_pkey"
DETAIL: Key (id)=(89) already exists.
CONTEXT: COPY table_name, line 1
STATEMENT: COPY "public"."table_name" ("X","X","X",...) FROM stdin
The subscriber's tables are empty and the provider doesn't have a dupe. I'm not sure what's going on here. I assume sequencing is off or something?
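For what it's worth, you can usually see what pglogical thinks the sync state of each table is from the subscriber side. A sketch using the stock pglogical functions/catalogs (column layout may differ slightly between versions, so treat the status-letter meanings as approximate):

```sql
-- On the subscriber: overall subscription state ('replicating', 'down', ...)
SELECT * FROM pglogical.show_subscription_status();

-- Per-table sync state; sync_status is a single status character
-- ('r' generally means synchronized; other letters indicate a copy
-- that is pending or in progress)
SELECT sync_nspname, sync_relname, sync_status
FROM pglogical.local_sync_status;
```

If the failing table never reaches the synchronized state here, that would line up with the duplicate-key error killing the COPY each time it retries.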
Got the same issue, Publisher PG 9.4.26 (pglogical 2.3.1) -> Sub PG 12.3 (pglogical 2.3.1)
<2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23000]>LOG: CONFLICT: remote UPDATE on relation schema_name_1.tab_name_1 replica identity index pk_tab_name_1 (tuple not found). Resolution: skip.
<2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23000]>DETAIL: remote tuple {<data here>} in xact origin=1,timestamp=2020-06-03 07:03:17.563272+00,commit_lsn=0/B2D263B0
<2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23000]>CONTEXT: apply UPDATE from remote relation schema_name_1.tab_name_1 in commit before 2DDF/B2D263B0, xid 39550428 committed at 2020-06-03 07:03:17.563272+00 (action #93) from node replorigin 1
<2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23505]>ERROR: duplicate key value violates unique constraint "pk_tab_name_1"
<2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23505]>DETAIL: Key (<fields here>)=(<data here>) already exists.
<2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23505]>CONTEXT: apply multi INSERT from remote relation schema_name_1.tab_name_1 in commit before 2DDF/B2D263B0, xid 39550428 committed at 2020-06-03 07:03:17.563272+00 (action #415) from node replorigin 1
<2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,00000]>LOG: apply worker [37201] at slot 2 generation 2 exiting with error
<2020-06-03 07:03:17 UTC--- [app:,pid:27881,00000]>LOG: background worker "pglogical apply 16453:1559963640" (PID 37201) exited with exit code 1
The error in turn causes another issue on the PG 12 replica:
<2020-06-03 06:57:51 UTC--- [app:,pid:26390,00000]>LOG: restartpoint starting: wal
<2020-06-03 07:03:17 UTC--- [app:,pid:26389,XX000]>PANIC: invalid max offset number
<2020-06-03 07:03:17 UTC--- [app:,pid:26389,XX000]>CONTEXT: WAL redo at 51/C3CE8B88 for Heap2/MULTI_INSERT: 6 tuples flags 0x08
Is it possible that the PG 12 WAL was corrupted by pglogical? Is it safe at all to use pglogical in a Pub 9.4.26 -> Sub PG 12.3 setup?
Looks like it's related to: https://github.com/2ndQuadrant/pglogical/pull/295
Provider PG 10.4 (pglogical 2.4.1) -> Subscriber PG 10.2 (pglogical 2.4.0)
Seeing the same behavior, although #295 says it was fixed in 2.4.0?