pglogical Sync is working with new data, but no existing data is coming over.

Open jarrettgreen opened this issue 4 years ago • 4 comments

Setup:

Provider: PG 9.4 with pglogical v2.2.2
Subscriber: PG 12 with pglogical 2.3.0

Expected Behavior: Existing data would make its way over from the 9.4 db to the 12 db.

Actual Behavior: New inserts on the 9.4 db make it over, but no existing data has been transferred, or is being transferred, as far as I can tell.

Instead of adding all tables, I dropped everything and only added a couple of smaller ones (1k rows rather than 1M) to test with. I can see updates trying to happen, and inserts do happen, but I cannot for the life of me get existing data transferred over.
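One thing that may be worth checking here (an assumption, not something confirmed in this thread): `pglogical.replication_set_add_table` takes a `synchronize_data` argument that defaults to false, so a table added to a replication set after the subscription already exists is not copied unless that argument is true or the table is resynchronized later. Below is a minimal Python sketch that only builds the SQL text to run on the provider. The helper names are hypothetical, and the set name `default` and table `public.table_name` are placeholders.

```python
def quote_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def add_table_sql(set_name: str, table: str, synchronize_data: bool = True) -> str:
    """Build a replication_set_add_table call that also requests an initial copy."""
    sync = "true" if synchronize_data else "false"
    return (
        "SELECT pglogical.replication_set_add_table("
        f"set_name := {quote_literal(set_name)}, "
        f"relation := {quote_literal(table)}, "
        f"synchronize_data := {sync});"
    )


# Placeholder names; run the printed statement on the provider with psql.
print(add_table_sql("default", "public.table_name"))
```

The `:=` named-argument notation is used because the provider here is 9.4, which predates the `=>` form.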

Mar 12 '20 19:03 jarrettgreen

When trying to force a sync using alter_subscription_resynchronize_table('subscription', 'table_name') on the subscriber, I see the following in the provider's logs:

ERROR:  duplicate key value violates unique constraint "table_name_pkey"
DETAIL:  Key (id)=(89) already exists.
CONTEXT:  COPY table_name, line 1
STATEMENT:  COPY "public"."table_name" ("X","X","X,"...") FROM stdin

The subscriber's tables are empty and the provider doesn't have a dupe. I'm not sure what's going on here. I assume sequencing is off or something?
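For reference, here is a minimal Python sketch of the two subscriber-side calls involved: the resynchronize call from above and `pglogical.show_subscription_table`, which reports the sync state of a single table. It only builds the SQL text. The helper names are hypothetical, and `subscription` and `table_name` are the same placeholders used above. Note that the pglogical documentation warns that resynchronizing truncates the subscriber table first.

```python
def quote_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def resync_table_sql(subscription: str, table: str) -> str:
    """Build the call that truncates and re-copies one table on the subscriber."""
    return (
        "SELECT pglogical.alter_subscription_resynchronize_table("
        f"{quote_literal(subscription)}, {quote_literal(table)});"
    )


def table_status_sql(subscription: str, table: str) -> str:
    """Build the call that reports the current sync status of one table."""
    return (
        "SELECT * FROM pglogical.show_subscription_table("
        f"{quote_literal(subscription)}, {quote_literal(table)});"
    )


# Placeholder names; run both statements on the subscriber.
print(resync_table_sql("subscription", "table_name"))
print(table_status_sql("subscription", "table_name"))
```

Running the status query before and after the resync should show whether the table ever leaves its initial state, which helps separate "the copy never started" from "the copy failed partway".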

Mar 13 '20 21:03 jarrettgreen

I have the same issue with Publisher PG 9.4.26 (pglogical 2.3.1) -> Sub PG 12.3 (pglogical 2.3.1):

   <2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23000]>LOG:  CONFLICT: remote UPDATE on relation schema_name_1.tab_name_1 replica identity index pk_tab_name_1 (tuple not found). Resolution: skip.
   <2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23000]>DETAIL:  remote tuple {<data here>} in xact origin=1,timestamp=2020-06-03 07:03:17.563272+00,commit_lsn=0/B2D263B0
   <2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23000]>CONTEXT:  apply UPDATE from remote relation schema_name_1.tab_name_1 in commit before 2DDF/B2D263B0, xid 39550428 committed at 2020-06-03 07:03:17.563272+00 (action #93) from node replorigin 1
   
   <2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23505]>ERROR:  duplicate key value violates unique constraint "pk_tab_name_1"
   <2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23505]>DETAIL:  Key (<fields here>)=(<data here>) already exists.
   <2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,23505]>CONTEXT:  apply multi INSERT from remote relation schema_name_1.tab_name_1 in commit before 2DDF/B2D263B0, xid 39550428 committed at 2020-06-03 07:03:17.563272+00 (action #415) from node replorigin 1
   <2020-06-03 07:03:17 UTC-[unknown]--sfdb [app:pglogical apply 16453:1559963640,pid:37201,00000]>LOG:  apply worker [37201] at slot 2 generation 2 exiting with error

   <2020-06-03 07:03:17 UTC--- [app:,pid:27881,00000]>LOG:  background worker "pglogical apply 16453:1559963640" (PID 37201) exited with exit code 1

The error in turn causes another issue on the PG 12 replica:

<2020-06-03 06:57:51 UTC--- [app:,pid:26390,00000]>LOG:  restartpoint starting: wal
<2020-06-03 07:03:17 UTC--- [app:,pid:26389,XX000]>PANIC:  invalid max offset number
<2020-06-03 07:03:17 UTC--- [app:,pid:26389,XX000]>CONTEXT:  WAL redo at 51/C3CE8B88 for Heap2/MULTI_INSERT: 6 tuples flags 0x08

Is it possible that the PG 12 WAL was corrupted by pglogical? Is it safe at all to use pglogical in a Pub 9.4.26 -> Sub PG 12.3 setup?
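To make the excerpts above easier to scan, here is a minimal Python sketch that pulls the application name, PID, and error code out of each line. It assumes the last field inside the square brackets is the SQLSTATE (as `%e` in `log_line_prefix` would produce); that is inferred from the log shape, not confirmed. The condition names in the table are the standard PostgreSQL names for those codes, and `parse_line` is a hypothetical helper.

```python
import re

# Standard PostgreSQL condition names for the codes seen in the excerpts.
SQLSTATE_NAMES = {
    "00000": "successful_completion",
    "23000": "integrity_constraint_violation",
    "23505": "unique_violation",
    "XX000": "internal_error",
}

# Assumed prefix shape: <... [app:NAME,pid:PID,SQLSTATE]>SEVERITY:  message
LINE_RE = re.compile(
    r"\[app:(?P<app>[^\]]*?),pid:(?P<pid>\d+),(?P<sqlstate>[0-9A-Z]{5})\]>"
    r"(?P<severity>[A-Z]+):\s+(?P<message>.*)"
)


def parse_line(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"])
    record["condition"] = SQLSTATE_NAMES.get(record["sqlstate"], "unknown")
    return record


sample = (
    "<2020-06-03 07:03:17 UTC-[unknown]--sfdb "
    "[app:pglogical apply 16453:1559963640,pid:37201,23505]>"
    'ERROR:  duplicate key value violates unique constraint "pk_tab_name_1"'
)
print(parse_line(sample))
```

Read this way, the apply worker exits on 23505 (unique_violation), while the PANIC on the replica carries XX000 (internal_error) from a different PID.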

Jun 03 '20 10:06 thamerlan

This looks related to: https://github.com/2ndQuadrant/pglogical/pull/295

Feb 08 '21 06:02 bdrouvot

Provider: PG 10.4 with pglogical 2.4.1. Subscriber: PG 10.2 with pglogical 2.4.0.

I'm seeing the same behavior, although #295 says it was fixed in 2.4.0?

Jun 29 '22 19:06 srl295
