docker-openldap: LDAP multi-master replication not working in 1.5.0

It seems that upgrading the openldap container to 1.5.0 breaks multi-master replication, while using the same configuration on 1.3.0 works fine. We're using the docker-compose file listed below:

version: "3"
services:
  idp-ldap_1:
    image: ldap-local
    environment:
      - LDAP_BACKUP_CONFIG_CRON_EXP=0 2 * * *
      - LDAP_BACKUP_DATA_CRON_EXP=0 2 * * *
      # TLS is required for HA and is only used for replication; the certificate is generated using HOSTNAME
      - HOSTNAME=idp-ldap_1
      - LDAP_TLS=true
      - LDAP_TLS_VERIFY_CLIENT=allow
      # LDAP parameters
      - LDAP_BACKEND=mdb
      - LDAP_ADMIN_PASSWORD=PLACEHOLDER
      - LDAP_DOMAIN=PLACEHOLDER
      - LDAP_ORGANIZATION=PLACEHOLDER
      - LDAP_REMOVE_CONFIG_AFTER_SETUP=false
      - LDAP_BASE_DN=PLACEHOLDER
      - LDAP_REPLICATION=true
      # Python to bash magic to convert this array, copied from osixia example https://github.com/osixia/docker-openldap#multi-master-replication
      - LDAP_REPLICATION_HOSTS=#PYTHON2BASH:['ldap://idp-ldap_1','ldap://idp-ldap_2']
    volumes:
      - ./data_slapd_database_1:/var/lib/ldap
      - ./data_slapd_config_1:/etc/ldap/slapd.d
      - ./changelog_1:/changelog
      - ./backup_1:/data/backup  
    ports:
      - "389:389"
    command: --copy-service --loglevel debug /resources/run.sh
    networks:
      - idp-develop
    restart: always  

volumes:
  data_slapd_database_1:
  data_slapd_config_1:
  changelog_1:
  backup_1:

networks:
  idp-develop:
    external:
      name: idp-develop

The configuration above gives us working replication across instances. However, when we upgrade our image to 1.5.0, we get the following error:

5acb4ffa read_config: no serverID / URL match found. Check slapd -h arguments.
5acb4ffa slapd stopped.

This is without any configuration changes.

Apr 29 '21 21:04 ddaalhuisen

I have been looking into this issue in detail. It seems that the startup script tries to re-generate the OpenLDAP configuration if the container is new.

Try setting the environment variable KEEP_EXISTING_CONFIG=true to skip the re-generation.
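For a compose setup like the one in the original report, that suggestion might look like the fragment below. This is only a sketch: the service name, image, and volume paths are copied from the report, and as far as I understand the image README, replication setup from the environment is skipped when this variable is true, so it only makes sense when the mounted slapd.d already holds a working replication config.

```yaml
# Sketch only: names and paths are copied from the compose file in the report.
services:
  idp-ldap_1:
    image: ldap-local
    environment:
      - LDAP_REPLICATION=true
      # Ask the startup script to leave the existing slapd.d config untouched
      - KEEP_EXISTING_CONFIG=true
    volumes:
      # Assumption: these directories already contain data and config from 1.3.0
      - ./data_slapd_database_1:/var/lib/ldap
      - ./data_slapd_config_1:/etc/ldap/slapd.d
```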

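Compared to 1.3.0, 1.5.0 starts OpenLDAP for initialization without taking the replication configuration into account: https://github.com/osixia/docker-openldap/blob/32eb22ce1e9b34e893d97acf6c6cd7c6fea787ab/image/service/slapd/startup.sh#L307-L313

It does not pass the node's own URL (for example ldap://ldap.example.org) in the -h parameter. slapd picks its server ID by matching the configured serverID URLs against the listener URLs given with -h, so when none of them match it stops with the "no serverID / URL match found" error. 1.3.0 handles this correctly: https://github.com/osixia/docker-openldap/blob/fa517c29889c8715fbcc90bd3d5f8042459c440a/image/service/slapd/startup.sh#L255-L262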

May 14 '21 07:05 comphilip

Those changes were introduced with this PR.

Aug 17 '21 07:08 oschlueter

Facing the same issue. Any update on this? Also, when trying to run 1.5.0 with a volume mounted at /container/service/slapd/, the container does not start.

Mar 08 '22 11:03 padma0104

Hi,

I was facing the exact same issue and, after a few hours of debugging, I found a solution. I was using the same kind of configuration as you, but with my own machine names:

HOSTNAME=ldap-server1
LDAP_REPLICATION_HOSTS=#PYTHON2BASH:['ldap://ldap-server1','ldap://ldap-server2']

The hostname was provided as an argument (--hostname) to the docker / podman run command.

The solution I've found is to provide a fully qualified domain name, such as:

HOSTNAME=ldap-server1.domain.ext
LDAP_REPLICATION_HOSTS=#PYTHON2BASH:['ldap://ldap-server1.domain.ext','ldap://ldap-server2.domain.ext']

Before doing this, I was not able to start the OpenLDAP container with replication set to true and, even worse, after setting up the whole replication manually, the container refused to restart on both machines.

Now it seems to work like a charm, but the point highlighted by @comphilip seems to force the use of FQDNs for replication to work as expected. Both the startup.sh and process.sh files contain the same kind of code to start the slapd service.

The same solution should apply to issue #602.
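For docker-compose users, the FQDN setup described above might translate to roughly the following. Treat it as an untested sketch: `hostname:` is the compose counterpart of --hostname, the image tag and the idp-develop network are borrowed from earlier in this thread, and the network alias is an assumption added so the other node can resolve the FQDN.

```yaml
# Untested sketch; ldap-server1.domain.ext and ldap-server2.domain.ext are placeholder names.
services:
  ldap-server1:
    image: osixia/openldap:1.5.0
    # Compose counterpart of "docker run --hostname ..."
    hostname: ldap-server1.domain.ext
    environment:
      - LDAP_TLS=true
      - LDAP_REPLICATION=true
      - LDAP_REPLICATION_HOSTS=#PYTHON2BASH:['ldap://ldap-server1.domain.ext','ldap://ldap-server2.domain.ext']
    networks:
      idp-develop:
        aliases:
          # Assumption: lets the peer resolve this node by its FQDN
          - ldap-server1.domain.ext

networks:
  idp-develop:
    external: true
```

The second node would mirror this service with ldap-server2.domain.ext as its hostname and alias, and the same LDAP_REPLICATION_HOSTS value.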

Mar 25 '22 13:03 KokutoSan

@padma0104 I gave up on the multi-master replication setup and rolled back to a single master.

I have no real need for multi-master replication:

LDAP data is small in my company (mainly used for login authentication, RADIUS authentication for network devices, and DNS records).
A single LDAP server handles the requests from servers in two IDCs without any load problems.
LDAP data is backed up every day; it is really small and easy to recover.

Mar 25 '22 14:03 comphilip

@KokutoSan Even after adding the fully qualified hostname to LDAP_REPLICATION_HOSTS, I am still facing the same error. Can you please share your docker-compose file or your config?

Apr 18 '22 16:04 padma0104