pglogical icon indicating copy to clipboard operation
pglogical copied to clipboard

could not open directory "pg_replslot/b_master_master2slave": No such file or directory

Open pashagolub opened this issue 6 years ago • 8 comments

Environment

Windows 10, pglogical 2.2.0 built with MSYS2, PostgreSQL 10.3

Installation

On provider:

create table foo(
  id bigserial primary key,
  val text
);

insert into foo values (default, 'first line'), (default, 'second line');

select pglogical.create_node(
  node_name := 'master',
  dsn := 'host=192.168.1.234 port=5432 dbname=foodb user=postgres'
);

select pglogical.create_replication_set('foo_set');

select pglogical.replication_set_add_table(
  set_name := 'foo_set',
  relation := 'foo',
  synchronize_data := true
);

On subscriber:

create table foo(
  id bigserial primary key,
  val text
);

select pglogical.create_node(
  node_name := 'slave',
  dsn := 'host=192.168.1.2 port=5432 dbname=foodb user=postgres'
);

select pglogical.create_subscription(
  subscription_name := 'master-to-slave',
  replication_sets := array['foo_set'],
  provider_dsn := 'host=192.168.1.234 port=5432 dbname=foodb user=postgres'
);

Data synchronized. But after that replication fails constantly for new rows.

psql session on master

localhost:5432 postgres@foodb=# select pglogical.create_node(
  node_name := 'master',
  dsn := 'host=192.168.1.234 port=5432 dbname=foodb user=postgres'
);
 create_node
-------------
  3582890138
(1 row)

localhost:5432 postgres@foodb=# select pglogical.create_replication_set('foo_set');
 create_replication_set
------------------------
              667091283
(1 row)

localhost:5432 postgres@foodb=# select pglogical.replication_set_add_table(
  set_name := 'foo_set',
  relation := 'foo',
  synchronize_data := true
);
 replication_set_add_table
---------------------------
 t
(1 row)

psql session on slave

192.168.1.2:5432 postgres@foodb=# select pglogical.create_node(
foodb(#   node_name := 'slave',
foodb(#   dsn := 'host=192.168.1.2 port=5432 dbname=foodb user=postgres'
foodb(# );
 create_node
-------------
   777843705
(1 row)

192.168.1.2:5432 postgres@foodb=# select pglogical.create_subscription(
  subscription_name := 'master2slave',
  replication_sets := array['foo_set'],
  provider_dsn := 'host=192.168.1.234 port=5432 dbname=foodb user=postgres'
);
 create_subscription
---------------------
          1770714438
(1 row)

192.168.1.2:5432 postgres@foodb=# SELECT FROM foo;
--
(2 rows)

Log from master

2018-05-05 17:00:07.105 CEST [5144] LOG:  manager worker [5144] at slot 0 generation 5 detaching cleanly
2018-05-05 17:00:07.140 CEST [728] LOG:  manager worker [728] at slot 1 generation 3 detaching cleanly
2018-05-05 17:00:07.193 CEST [7416] LOG:  starting pglogical database manager for database foodb
2018-05-05 17:01:19.070 CEST [8480] LOG:  logical decoding found consistent point at 0/1D52DB0
2018-05-05 17:01:19.070 CEST [8480] DETAIL:  There are no running transactions.
2018-05-05 17:01:19.071 CEST [8480] LOG:  exported logical decoding snapshot: "00000008-00000002-1" with 0 transaction IDs
2018-05-05 17:01:19.336 CEST [1604] ERROR:  could not open directory "pg_replslot/b_master_master2slave": No such file or directory
2018-05-05 17:01:19.336 CEST [1604] CONTEXT:  slot "pgl_foodb_master_master2slave", output plugin "pglogical_output", in the startup callback
2018-05-05 17:01:19.341 CEST [1604] LOG:  could not receive data from client: An existing connection was forcibly closed by the remote host.

	
2018-05-05 17:01:24.536 CEST [1020] ERROR:  could not open directory "pg_replslot/b_master_master2slave": No such file or directory
2018-05-05 17:01:24.536 CEST [1020] CONTEXT:  slot "pgl_foodb_master_master2slave", output plugin "pglogical_output", in the startup callback
2018-05-05 17:01:24.541 CEST [1020] LOG:  could not receive data from client: An existing connection was forcibly closed by the remote host.

	
2018-05-05 17:01:29.667 CEST [9388] ERROR:  could not open directory "pg_replslot/b_master_master2slave": No such file or directory
2018-05-05 17:01:29.667 CEST [9388] CONTEXT:  slot "pgl_foodb_master_master2slave", output plugin "pglogical_output", in the startup callback
2018-05-05 17:01:29.672 CEST [9388] LOG:  could not receive data from client: An existing connection was forcibly closed by the remote host.
....

Log from slave

2018-05-05 17:03:09.334 CEST [2704] LOG:  starting apply for subscription master2slave
2018-05-05 17:03:09.677 CEST [2704] WARNING:  Failed to clean up pglogical temporary dump file "C:/Windows/ServiceProfiles/NetworkService/AppData/Local/Temp/pglogical-2704.dump" on exit/error
2018-05-05 17:03:09.786 CEST [2704] ERROR:  data stream ended
2018-05-05 17:03:09.786 CEST [2704] LOG:  apply worker [2704] at slot 1 generation 497 exiting with error
2018-05-05 17:03:09.802 CEST [992] LOG:  worker process: pglogical apply 16561:1770714438 (PID 2704) exited with exit code 1
2018-05-05 17:03:14.825 CEST [984] LOG:  starting apply for subscription master2slave
2018-05-05 17:03:14.997 CEST [984] ERROR:  data stream ended
2018-05-05 17:03:14.997 CEST [984] LOG:  apply worker [984] at slot 1 generation 498 exiting with error
2018-05-05 17:03:14.997 CEST [992] LOG:  worker process: pglogical apply 16561:1770714438 (PID 984) exited with exit code 1
2018-05-05 17:03:20.020 CEST [2640] LOG:  starting apply for subscription master2slave
2018-05-05 17:03:20.114 CEST [2640] ERROR:  data stream ended
2018-05-05 17:03:20.114 CEST [2640] LOG:  apply worker [2640] at slot 1 generation 499 exiting with error
2018-05-05 17:03:20.129 CEST [992] LOG:  worker process: pglogical apply 16561:1770714438 (PID 2640) exited with exit code 1
2018-05-05 17:03:25.152 CEST [1332] LOG:  starting apply for subscription master2slave
2018-05-05 17:03:25.792 CEST [1332] ERROR:  data stream ended
2018-05-05 17:03:25.792 CEST [1332] LOG:  apply worker [1332] at slot 1 generation 500 exiting with error
2018-05-05 17:03:25.792 CEST [992] LOG:  worker process: pglogical apply 16561:1770714438 (PID 1332) exited with exit code 1
....

pashagolub avatar May 05 '18 15:05 pashagolub

I am not sure if #165 will really fix the problem, but I think it is a good idea anyway.

laurenz avatar May 14 '18 11:05 laurenz

Looks unrelated. The issue there seems to pertain to pglogical's dump/reload during pglogical.create_subscription instead; you're looking for issues around pglogical_sync_tmpfile_cleanup_cb. Adding a %m would be a big help, so I committed that as a starting point.

ringerc avatar Jul 03 '18 07:07 ringerc

Thanks!

laurenz avatar Jul 03 '18 14:07 laurenz

HI, I'm on the version 2.2.1 compiled by @pashagolub with exactly same errors. Was this issue fuxed in 2.2.2? Is build 2.2.2 available for Windows somewhere?

andrewg78 avatar Nov 05 '19 15:11 andrewg78

@andrewg78 I'm not aware about that. You may try to compile it by yourself using msys2 environment if you're in a hurry.

pashagolub avatar Nov 06 '19 09:11 pashagolub

@pashagolub Thank you. You wrote a great article! I compiled 2.2.2 but this time I have an issue with pglogical_create_subscriber.exe (pglogical_create_subscriber_postgres.log is empty!!) Are you maybe aware of 'pg_ctl' access denied issue ? Here is the console output: Creating base backup of the remote node... 26107/26107 Ko (100%), 1/1 tablespace Creating restore point "pglogical_create_subscriber_32bf5b92" on remote node ...

Bringing subscriber node to the restore point ... Access is denied. pg_ctl : could not start the server Examine the application log.

andrewg78 avatar Nov 17 '19 18:11 andrewg78

Sounds like Windows file permissions. You can try to run pglogical_create_subscriber.exe under admin account or under an account which owns server process.

pashagolub avatar Nov 18 '19 15:11 pashagolub

Environment

Windows 10, pglogical 2.3.4 built with MSYS2, PostgreSQL 13.3

Comments

Using OP's excellent blog posts on getting pglogical to compile under Windows using MSYS2, I still see the original error message. I've determined it is chopping off the first 4 characters of the database name which is why it isn't found.

I've been unable to track down where this INFO: message is coming from. In my logs the line starts with "INFO:", all other log entries start with date and time. Example: INFO: could not open directory "pg_replslot/db1234_master_master_to_slave": No such file or directory

The database name is testdb1234--"test" is chopped-off.

I'm hoping since it is an INFO: message it isn't really a problem. Replication works fine. Would be nice to get rid of the message though.

ljackwilson avatar Jun 03 '21 15:06 ljackwilson