
docker/kubernetes nodes discovery

Open sergey-safarov opened this issue 6 years ago • 9 comments

Summary

I can create a Docker stack using this file:

version: "3.7"

networks:
  main:

services:
  couchdb:
    image: couchdb:2.3.1
    environment:
    - NODENAME=couchdb-{{.Task.Slot}}
    - COUCHDB_SECRET=monster
    - ERL_FLAGS="-setcookie monster"
    ports:
    - "5984"
    - "5986"
    networks:
    - main
    deploy:
      replicas: 5

And this command:

docker stack deploy --compose-file docker-compose.yaml kazoo

This will create:

  1. kazoo_main network
  2. couchdb service
  3. five couchdb containers with unique names

Could you implement the following logic for CouchDB node discovery and automatic cluster formation?

Resolve ${SEED_DNS_NAME}:

root@dbef6f82ce5f:/# nslookup ${SEED_DNS_NAME}
Server:		127.0.0.11
Address:	127.0.0.11#53

Non-authoritative answer:
Name:	tasks.couchdb
Address: 10.0.52.7
Name:	tasks.couchdb
Address: 10.0.52.6
Name:	tasks.couchdb
Address: 10.0.52.5
Name:	tasks.couchdb
Address: 10.0.52.4
Name:	tasks.couchdb
Address: 10.0.52.3
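
The same lookup could be done programmatically. Below is a minimal sketch in Python using only the standard library; the function name `resolve_seed` and the injectable `getaddrinfo` parameter are illustrative assumptions, not part of CouchDB.

```python
import socket


def resolve_seed(seed_dns_name, port=5984, getaddrinfo=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses behind a seed DNS name.

    `getaddrinfo` is injectable so the function can be exercised without DNS.
    """
    infos = getaddrinfo(
        seed_dns_name, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})
```

Inside a container attached to the stack network, `resolve_seed("tasks.couchdb")` would be expected to return the task addresses listed above, assuming the Docker embedded DNS answers as shown.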

Now we can get the CouchDB node names:

root@dbef6f82ce5f:/# curl http://10.0.52.3:5984/_membership
{"all_nodes":["couchdb@couchdb-5"],"cluster_nodes":["couchdb@couchdb-5"]}
root@dbef6f82ce5f:/# curl http://10.0.52.4:5984/_membership
{"all_nodes":["couchdb@couchdb-1"],"cluster_nodes":["couchdb@couchdb-1"]}
root@dbef6f82ce5f:/# curl http://10.0.52.5:5984/_membership
{"all_nodes":["couchdb@couchdb-2"],"cluster_nodes":["couchdb@couchdb-2"]}
root@dbef6f82ce5f:/# curl http://10.0.52.6:5984/_membership
{"all_nodes":["couchdb@couchdb-3"],"cluster_nodes":["couchdb@couchdb-3"]}
root@dbef6f82ce5f:/# curl http://10.0.52.7:5984/_membership
{"all_nodes":["couchdb@couchdb-4"],"cluster_nodes":["couchdb@couchdb-4"]}
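
The requests above could be scripted roughly as follows. This is a sketch under two assumptions: the nodes are not yet clustered, so `all_nodes` holds exactly one entry (as in the output above), and `/_membership` is reachable without credentials. The helper names `fetch_membership` and `node_names_by_ip` are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_membership(ip, port=5984, timeout=5):
    """GET /_membership from one node and return the decoded JSON body."""
    with urlopen(f"http://{ip}:{port}/_membership", timeout=timeout) as resp:
        return json.load(resp)


def node_names_by_ip(ips, fetch=fetch_membership):
    """Map each IP to the single node name it reports in `all_nodes`.

    Assumption: each node is still standalone. Once nodes are joined,
    `all_nodes` lists every member and no longer identifies the node
    behind a given IP, so this raises instead of guessing.
    """
    mapping = {}
    for ip in ips:
        nodes = fetch(ip)["all_nodes"]
        if len(nodes) != 1:
            raise ValueError(f"{ip} reports {len(nodes)} nodes, expected 1")
        mapping[ip] = nodes[0]
    return mapping
```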

Now we can join the nodes into a cluster using these commands:

curl -X PUT "http://localhost:5986/_nodes/couchdb@couchdb-1/10.0.52.4" -d {}
curl -X PUT "http://localhost:5986/_nodes/couchdb@couchdb-2/10.0.52.5" -d {}
curl -X PUT "http://localhost:5986/_nodes/couchdb@couchdb-3/10.0.52.6" -d {}
curl -X PUT "http://localhost:5986/_nodes/couchdb@couchdb-4/10.0.52.7" -d {}
curl -X PUT "http://localhost:5986/_nodes/couchdb@couchdb-5/10.0.52.3" -d {}

Here I suggest an extended join command syntax that uses the containers' runtime IP addresses.

If a CouchDB node is restarted, then it

  1. resolves ${SEED_DNS_NAME} again and gets the current IP addresses of the CouchDB nodes;
  2. requests the /_node/_local/_system resource from each IP and gets the CouchDB node name behind each IP;
  3. updates its own map of CouchDB nodes and their IP addresses.
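
The three steps could be sketched as one refresh function. This illustrates the proposed behaviour only; it is not something CouchDB implements. `resolve` and `lookup_name` are hypothetical callables standing in for the DNS lookup (step 1) and the per-IP name request (step 2).

```python
def refresh_node_map(node_map, resolve, lookup_name):
    """Rebuild the node-name -> IP map after a restart.

    node_map:     current dict of node name -> IP address
    resolve:      callable returning the current list of IPs (step 1)
    lookup_name:  callable mapping an IP to a node name (step 2)

    Returns (new_map, changed), where `changed` holds only the entries
    whose IP differs from the previous map (step 3).
    """
    new_map = dict(node_map)  # leave the caller's map untouched
    changed = {}
    for ip in resolve():
        name = lookup_name(ip)
        if new_map.get(name) != ip:
            changed[name] = ip
            new_map[name] = ip
    return new_map, changed
```

Because node names stay the keys, shards remain tied to stable names while only the address column changes.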

Additional context

This feature could be used in Docker Swarm and Kubernetes clouds. In Kubernetes it requires a StatefulSet plus a headless service like this:

# file contains database headless service
# creates kubernetes dns records for database daemons
# required for database nodes discovery
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  type: ClusterIP
  clusterIP: None
  publishNotReadyAddresses: true
  selector:
    app: db

The Kubernetes service YAML file must contain publishNotReadyAddresses: true, so that the DNS records are published before the pods become ready.

sergey-safarov avatar Aug 25 '19 21:08 sergey-safarov

Have you seen https://github.com/apache/couchdb/pull/1658 which landed for CouchDB 2.3.0?

There is an open ticket to implement this in the Docker image, but perhaps you can glue it together yourself with a helm file or similar...

wohali avatar Aug 26 '19 06:08 wohali

Yes, I looked at this. It is not applicable, because the nodes' DNS names are defined at run time and cannot be known at configuration time. These DNS names may also change during execution, so the name changes need to be tracked. The referenced PR does not provide this feature.

sergey-safarov avatar Aug 26 '19 09:08 sergey-safarov

@sergey-safarov And using IP addresses and setting that configuration after the nodes come up isn't acceptable?

@kocolosk does any of what Sergey asks for above seem useful to you, or is the existing beta helm chart the best solution until CouchDB 4.0?

wohali avatar Aug 26 '19 22:08 wohali

I cannot use IP addresses in node names, because shards would then be linked to node names that contain IP addresses. When a CouchDB container is restarted, it gets a new IP address and the restarted node is lost from the cluster.

sergey-safarov avatar Aug 27 '19 05:08 sergey-safarov

This is useful because it relies on a DNS-based node discovery mechanism, so it could be used in a Docker Swarm installation or with any other cloud provider that uses dynamic IPs.

sergey-safarov avatar Aug 27 '19 05:08 sergey-safarov

As an option, the command to join a node into the cluster might look like this:

curl -X PUT "http://localhost:5986/_nodes/couchdb@couchdb-1" -d '{"ipv4_address":"10.0.52.4"}'

sergey-safarov avatar Aug 27 '19 05:08 sergey-safarov

When it comes to Kubernetes I think relying on the stable DNS names provided by membership in a StatefulSet is the right approach. This is what the Helm chart does. It is possible to compute those names ahead of time; the only bit of configuration currently needed is the domain suffix used by the target cluster.
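
As an illustration of computing those names ahead of time, here is a small Python sketch. It relies on the standard Kubernetes pattern for StatefulSet pods behind a headless service, `<statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>`. The function name, the `couchdb@` prefix default, and the example values are assumptions, and the Helm chart may build the names differently.

```python
def statefulset_node_names(statefulset, service, namespace, replicas,
                           cluster_domain="cluster.local",
                           erlang_name="couchdb"):
    """Return the expected CouchDB node names for a StatefulSet.

    Pod ordinals start at 0; `cluster_domain` is the only value that
    depends on the target cluster's configuration.
    """
    return [
        f"{erlang_name}@{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]
```

For example, `statefulset_node_names("db", "db", "default", 3)` yields names such as `couchdb@db-0.db.default.svc.cluster.local`.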

I’m not so familiar with Docker Stacks. Are the couchdb-1, etc. names stable and resolvable by the other replicas? If so, shouldn’t we use those for node names and ignore the ephemeral IP addresses?

kocolosk avatar Aug 27 '19 12:08 kocolosk

The names are stable because each name is generated from a constant string and the stable variable {{.Task.Slot}}. But these names are not resolvable. That is the reason why this feature request was opened.

I suggest storing the mapping of host names to runtime IPs on each node. When a node is restarted, it initiates a host-IP mapping update on the other hosts in the cluster. The list of all nodes in the cluster is resolvable via a DNS name like this:

root@dbef6f82ce5f:/# nslookup tasks.couchdb
Server:		127.0.0.11
Address:	127.0.0.11#53

Non-authoritative answer:
Name:	tasks.couchdb
Address: 10.0.52.7
Name:	tasks.couchdb
Address: 10.0.52.6
Name:	tasks.couchdb
Address: 10.0.52.5
Name:	tasks.couchdb
Address: 10.0.52.4
Name:	tasks.couchdb
Address: 10.0.52.3

sergey-safarov avatar Aug 27 '19 14:08 sergey-safarov

I have tested the mapping of DNS names to IP addresses in version 3.5.1. It now works as expected. The remaining part is discovering cluster nodes via a K8s StatefulSet, as suggested by @kocolosk above.

sergey-safarov avatar Dec 06 '25 05:12 sergey-safarov