
CouchDB doesn't restart on Docker Swarm

mbesset opened this issue 3 years ago

Description

Hello to all,

I want to set up a CouchDB cluster with Docker Swarm. Initializing the cluster goes fairly well, but when I restart my stack, none of my CouchDB nodes come back up. I get the following error:

{"Kernel pid terminated",application_controller,"{application_start_failure,setup,{{bad_return_value,{setup_error,\"Cluster setup timed out waiting for nodes to connect\"}},{setup_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,setup,{{bad_return_value,{setup_error,"Cluster setup timed out waiting for nodes to connect"}},{setup_app,start,[normal,[]]}}

The complete log is available here: 6gvvekwlgg3rnaf3trp4d4y8e_logs.txt

I'm running out of ideas on this problem. I've changed machines, changed Docker versions, changed CouchDB versions... I need your help!
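For context on why the nodes might fail to reconnect: the official CouchDB Docker image derives each container's Erlang node name from the `NODENAME` environment variable, and after a restart the cluster can only re-form if those names resolve and match what was stored in the `_nodes` database before shutdown. A minimal sketch of the names this stack should produce (the helper function is hypothetical, just to make the naming rule explicit):

```shell
# Hypothetical helper: the official couchdb image writes
# "-name couchdb@${NODENAME}" into vm.args, so each container's
# Erlang node name is derived from its NODENAME environment variable.
erlang_node_name() {
  printf 'couchdb@%s\n' "$1"
}

# The three nodes in this stack should therefore come up as:
erlang_node_name couchdb1.cluster   # couchdb@couchdb1.cluster
erlang_node_name couchdb2.cluster   # couchdb@couchdb2.cluster
erlang_node_name couchdb3.cluster   # couchdb@couchdb3.cluster
```

If a restarted container comes up under a different name (for example because `NODENAME` changed or the network alias no longer resolves), the `setup` application times out exactly as in the log above.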

My Docker Swarm stack file:

version: "3.7"

x-couchdb-common: &couchdb-common
  image: couchdb:3.2.2
  expose:
    - "5984"
    - "9100"
    - "4369"

services:
  ## Couchdb loadBalancer
  couchdb-loadBalancer:
    image: haproxy:latest
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    depends_on:
      - couchdb1
      - couchdb2
      - couchdb3
    ports:
      - 5984:5984
    networks:
      network:
        aliases:
          - couchdb.loadBalancer.cluster

  ## Couchdb Cluster Init service
  couchdb-cluster-init:
    image: cluster-init:latest
    volumes:
      - ./cluster-init.sh:/cluster-init.sh
    environment:
      - COUCHDB_USER=admin
      - COUCHDB_PASSWORD=a
    depends_on:
      - couchdb1
      - couchdb2
      - couchdb3
      - couchdb-loadBalancer
    networks:
      network:
        aliases:
          - couchdb.init.cluster

  couchdb1:
    <<: *couchdb-common
    volumes:
      - couchdbData1:/opt/couchdb/data
      - couchdbConf1:/opt/couchdb/etc/local.d
    environment:
      - NODENAME=couchdb1.cluster
      - ERL_FLAGS=-setcookie "relax"
      - COUCHDB_USER=admin
      - COUCHDB_PASSWORD=a
      - COUCHDB_SECRET=0ef656e4-afa6-11ea-b3de-0242ac130004
    networks:
      network:
        aliases:
          - couchdb1.cluster

  couchdb2:
    <<: *couchdb-common
    volumes:
      - couchdbData2:/opt/couchdb/data
      - couchdbConf2:/opt/couchdb/etc/local.d
    environment:
      - NODENAME=couchdb2.cluster
      - ERL_FLAGS=-setcookie "relax"
      - COUCHDB_USER=admin
      - COUCHDB_PASSWORD=a
      - COUCHDB_SECRET=0ef656e4-afa6-11ea-b3de-0242ac130004
    networks:
      network:
        aliases:
          - couchdb2.cluster

  couchdb3:
    <<: *couchdb-common
    volumes:
      - couchdbData3:/opt/couchdb/data
      - couchdbConf3:/opt/couchdb/etc/local.d
    environment:
      - NODENAME=couchdb3.cluster
      - ERL_FLAGS=-setcookie "relax"
      - COUCHDB_USER=admin
      - COUCHDB_PASSWORD=a
      - COUCHDB_SECRET=0ef656e4-afa6-11ea-b3de-0242ac130004
    networks:
      network:
        aliases:
          - couchdb3.cluster

## By default this config uses default local driver,
## For custom volumes replace with volume driver configuration.
volumes:
  couchdbData1:
  couchdbConf1:
  couchdbData2:
  couchdbConf2:
  couchdbData3:
  couchdbConf3:

networks:
  network:

My script to set up the cluster:

#!/usr/bin/env sh

#
# CouchDB Cluster Init Service
#
# Waits for CouchDB nodes to come online, then configures the nodes in a cluster.
#

echo "Initialising a 3-node CouchDB cluster"

# Set up admin users (this has been pulled up into local.ini of the cluster-node Dockerfile)
#curl -s -X PUT http://couchdb1:5984/_node/couchdb@couchdb1/_config/admins/admin -d '"secret"'
#curl -s -X PUT http://couchdb2:5984/_node/couchdb@couchdb2/_config/admins/admin -d '"secret"'
#curl -s -X PUT http://couchdb3:5984/_node/couchdb@couchdb3/_config/admins/admin -d '"secret"'

# Check all nodes active
echo "Check all nodes active"
waitForNode() {
  echo "Waiting for ${1}"
  NODE_ACTIVE=""
  until [ "${NODE_ACTIVE}" = "ok" ]; do
    sleep 1
    NODE_ACTIVE=$(curl -s -X GET http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@${1}:5984/_up | jq -r .status)
  done
}
waitForNode couchdb1
waitForNode couchdb2
waitForNode couchdb3

# Check cluster status and exit if already set up
echo "Check cluster status and exit if already set up"
ALL_NODES_COUNT=$(curl -s  -X GET http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_membership | jq '.all_nodes | length')
if [ "${ALL_NODES_COUNT}" -eq 3 ] ; then
  echo "CouchDB cluster already set up with ${ALL_NODES_COUNT} nodes"
  curl -s  -X GET http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_membership | jq '.all_nodes'
  tail -f /dev/null
fi

# Configure consistent UUID on all nodes
echo "Configure consistent UUID on all nodes"
SHARED_UUID=$(curl -s -X GET http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_uuids | jq .uuids[0])
curl -s  -X PUT http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_node/_local/_config/couchdb/uuid -d "${SHARED_UUID}"
curl -s  -X PUT http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb2:5984/_node/_local/_config/couchdb/uuid -d "${SHARED_UUID}"
curl -s  -X PUT http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb3:5984/_node/_local/_config/couchdb/uuid -d "${SHARED_UUID}"

# Set up common shared secret
echo "Set up common shared secret"
SHARED_SECRET=$(curl -s -X GET http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_uuids | jq .uuids[0])
curl -s  -X PUT http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_node/_local/_config/couch_httpd_auth/secret -d "${SHARED_SECRET}"
curl -s  -X PUT http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb2:5984/_node/_local/_config/couch_httpd_auth/secret -d "${SHARED_SECRET}"
curl -s  -X PUT http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb3:5984/_node/_local/_config/couch_httpd_auth/secret -d "${SHARED_SECRET}"

# Enable cluster (looks to be redundant, as it seems configuring an admin user implicitly marks the cluster as enabled)
#curl -s  -X POST http://couchdb1:5984/_cluster_setup -H "content-type:application/json" -d '{"action":"enable_cluster","username": "'"${COUCHDB_USER}"'", "password":"'"${COUCHDB_PASSWORD}"'","bind_address":"0.0.0.0","node_count":3}'

# Configure nodes 2 and 3 on node 1
echo "Configure nodes 2 and 3 on node 1"
curl -s  -X POST http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_cluster_setup -H "Content-Type: application/json" -d '{"action":"enable_cluster","remote_node":"couchdb2.cluster","port":"5984","username": "'"${COUCHDB_USER}"'", "password":"'"${COUCHDB_PASSWORD}"'","bind_address":"0.0.0.0","node_count":3}'
curl -s  -X POST http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_cluster_setup -H "Content-Type: application/json" -d '{"action":"enable_cluster","remote_node":"couchdb3.cluster","port":"5984","username": "'"${COUCHDB_USER}"'", "password":"'"${COUCHDB_PASSWORD}"'","bind_address":"0.0.0.0","node_count":3}'

# Add nodes 2 and 3 on node 1
echo "Add nodes 2 and 3 on node 1"
curl -s  -X POST http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_cluster_setup -H "Content-Type: application/json" -d '{"action":"add_node","host":"couchdb2.cluster","port":"5984","username": "'"${COUCHDB_USER}"'", "password":"'"${COUCHDB_PASSWORD}"'"}'
curl -s  -X POST http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_cluster_setup -H "Content-Type: application/json" -d '{"action":"add_node","host":"couchdb3.cluster","port":"5984","username": "'"${COUCHDB_USER}"'", "password":"'"${COUCHDB_PASSWORD}"'"}'

# Finish cluster
echo "Finish cluster"
curl -s  -X POST http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_cluster_setup -H "Content-Type: application/json" -d '{"action": "finish_cluster"}'


# Check cluster membership
echo "Check cluster membership"
curl -s  -X GET http://${COUCHDB_USER}:${COUCHDB_PASSWORD}@couchdb1:5984/_membership | jq

# Done!
echo "Done!"
echo "Check http://localhost:5984/_haproxy_stats for HAProxy info."
echo "Use http://localhost:5984/_utils for CouchDB admin."
tail -f /dev/null

Steps to Reproduce

  • Start your stack (first time)
  • docker stack deploy -c docker-compose-swarm-couchdb.yml couchdb_swarm
  • Wait for all services to start and for the init container to create the cluster
  • Stop the stack
  • docker stack rm couchdb_swarm
  • Restart the stack
  • docker stack deploy -c docker-compose-swarm-couchdb.yml couchdb_swarm
  • CouchDB version used: 3.2.2
  • Browser name and version: Chrome
  • Operating system and version: macOS / RedHat

Additional information:

Dockerfile for the cluster-init image:

FROM alpine:edge

RUN apk --no-cache add \
    curl \
    jq

CMD [ "/cluster-init.sh" ]
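A side note on the bind mounts (`./cluster-init.sh`, `./haproxy.cfg`): with `docker stack deploy`, bind mounts are resolved on whichever node each task lands on, so the files must exist there. A sketch of baking the script into the image instead, assuming `cluster-init.sh` sits next to the Dockerfile:

```dockerfile
FROM alpine:edge

RUN apk --no-cache add \
    curl \
    jq

# Copy the script into the image so scheduling on a multi-node swarm
# does not depend on a host-side bind mount existing on every node.
COPY cluster-init.sh /cluster-init.sh
RUN chmod +x /cluster-init.sh

CMD [ "/cluster-init.sh" ]
```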

mbesset, Jul 20 '22 16:07