pglogical apply PID exited with exit code 1

Open jitesh-co opened this issue 2 years ago • 2 comments

Base Info

Both on source & target:

  • RDS Postgres 11.15
  • db.m5.8xlarge
  • pglogical v2.4.0

Example -

When we try to move data from one shard to a new instance that also has several read replicas, we get a pglogical error. As we understand it, copying the data of a space with pglogical always consists of two main phases:

  • The initial data copy (state=sync_data). During this phase, the primary server will COPY all tables, one by one, to the target server, as they were at the start of the replication (the initial checkpoint).
  • The replication phase (state=replicating). The target server replays all data the primary server has written since the initial checkpoint.

The issue we observe consistently happens just after the initial data copy, when it is pivoting to the replication phase. Shortly after the subscription state goes from sync_data to replicating, it goes to state "down" (which is usually indicative of a problem) and does not recover.

2022-11-09 11:54:45 UTC::@:[352]:LOG: background worker "pglogical apply 16421:2167242526" (PID 21218) exited with exit code 1

jitesh-co avatar Nov 24 '22 16:11 jitesh-co

Could you provide a test case? Did you check for error messages before the provided log message? You need to provide more than just a subscription status. Start with:

SELECT * FROM pglogical.local_sync_status WHERE sync_status <> 'r';
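For the subscription-level view alongside the per-table one, a minimal sketch, assuming pglogical 2.x and that it is run on the target (subscriber) database. The function and column names are those of pglogical's subscription management interface and were not part of the original report:

```sql
-- Overall state of each subscription on the subscriber.
-- A status of "down" matches the symptom described in this issue.
SELECT subscription_name,
       status,
       provider_node,
       replication_sets
FROM pglogical.show_subscription_status();
```

Comparing this output with the rows returned by the local_sync_status query should show whether individual tables never finished syncing or whether the apply worker failed after the sync completed.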

eulerto avatar Dec 06 '22 20:12 eulerto

Note that if the config has pglogical.conflict_log_level = ERROR, then the apply process will exit after logging the conflict details. You should set the level to warning or notice.
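A quick way to see what the target is currently using, sketched with standard PostgreSQL commands. It assumes the pglogical library is loaded so its parameters are registered. On RDS the value would normally be changed through the DB parameter group rather than ALTER SYSTEM; whether this particular parameter is exposed there is an assumption to verify:

```sql
-- Current conflict logging level used by the apply worker.
SHOW pglogical.conflict_log_level;

-- All pglogical settings and where each value comes from.
SELECT name, setting, source
FROM pg_settings
WHERE name LIKE 'pglogical.%';
```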

dubek avatar Dec 29 '23 21:12 dubek