
Implement SRV records for swarm services

Open sanimej opened this issue 9 years ago • 21 comments

libnetwork #1163 added changes in the DNS server to handle SRV queries, but the service DB update based on the swarm service life cycle is not integrated yet.

sanimej avatar Jul 27 '16 12:07 sanimej

Is there a chance that Docker Swarm will publish SRV records? We could really use that extra host information in them.

Why do I ask? At the company where I work, we are building an infrastructure consisting of Prometheus plus various exporters deployed as global Swarm services (Node exporter, cAdvisor, etc.) to collect metrics about our hosts and containers. In the Prometheus configuration we scrape those exporters using DNS resolution of tasks.<service-name>.

But because Swarm publishes only A/AAAA records, we get just the IP addresses of the exporter containers. If an exporter service is restarted on a host (or the host itself is restarted), or is deployed to a new host, its IP address changes and we can no longer reliably attribute the metric data to it.

This makes it difficult to render metric data by host, and even more difficult to trace problems related to a specific host or service, because we have to trace by the IP address of the container where that exporter's service is deployed.

Prometheus supports SRV records and exposes a label called __meta_dns_name which could be used to relabel the metric's instance, making it possible to include the hostname in the scraped metrics.
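To make concrete what an SRV answer would add over an A record, here is a minimal Python sketch. It assumes the standard SRV RDATA layout (priority, weight, port, target); the host names, the port, and the helper names `parse_srv_rdata` and `scrape_targets` are hypothetical and are not part of Swarm or Prometheus.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_rdata(rdata: str) -> SrvRecord:
    """Parse the text form of SRV RDATA: 'priority weight port target'."""
    priority, weight, port, target = rdata.split()
    # Drop the trailing dot of a fully qualified target name.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def scrape_targets(rdatas: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (stable host label, 'host:port' address) pairs, sorted by host."""
    records = [parse_srv_rdata(r) for r in rdatas]
    return sorted((r.target, f"{r.target}:{r.port}") for r in records)


# Hypothetical answer for an exporter service; a real answer would come from DNS.
answers = ["0 0 9100 node-b.example.", "0 0 9100 node-a.example."]
for label, address in scrape_targets(answers):
    print(label, address)
```

Unlike a bare container IP, the target name in each record stays meaningful across container restarts, which is the property the relabeling idea above relies on.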

sandekar avatar Jun 12 '17 19:06 sandekar

+1 for this. My use case is the new HAProxy 1.8.0 feature that automatically configures its servers based on SRV records:

DNS SRV records (Olivier Houchard) : in order to go a bit further with DNS resolution, SRV records were implemented. The address, port and weight attributes will be applied to servers. New servers are automatically added provided there are enough available templates, and servers which disappear are automatically removed from the farm. By combining server templates and SRV records, it is now trivial to perform service discovery.

https://www.mail-archive.com/[email protected]/msg28004.html
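The behaviour described in that announcement (a fixed pool of server templates, filled and emptied as SRV records appear and disappear) can be sketched roughly as follows. This is only an illustration of the idea in Python, not HAProxy's actual algorithm; the function name `reconcile` and the addresses are made up.

```python
from typing import Iterable, List, Optional


def reconcile(slots: List[Optional[str]], discovered: Iterable[str]) -> List[Optional[str]]:
    """Update a fixed pool of template slots from the discovered 'host:port' set.

    Servers that disappeared free their slot; new servers take free slots
    as long as any remain. Extra servers beyond the pool size are ignored.
    """
    wanted = set(discovered)
    # Keep servers that are still present, free the slots of those that are gone.
    updated = [s if s in wanted else None for s in slots]
    # Servers that are wanted but do not occupy a slot yet, in a stable order.
    missing = sorted(wanted - set(updated))
    for i, slot in enumerate(updated):
        if slot is None and missing:
            updated[i] = missing.pop(0)
    return updated


# Three template slots, one currently unused.
pool = ["a:80", "b:80", None]
print(reconcile(pool, {"b:80", "c:80", "d:80"}))
```

Here `a:80` is dropped, `b:80` keeps its slot, and the two new servers fill the free slots.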

johan-adriaans avatar Dec 05 '17 13:12 johan-adriaans

+1, would want this regardless of swarm, just basic network

xenoterracide avatar Jan 04 '18 22:01 xenoterracide

More information about potential use with HAProxy here : https://www.haproxy.com/blog/dns-service-discovery-haproxy/

blop avatar Jan 11 '18 19:01 blop

I wanted to clarify something about the HAProxy use case mentioned by @blop and @johan-adriaans: are you launching services using swarm in a cluster but want to use HAProxy to perform the load balancing directly on the swarm task containers rather than go through swarm's IPVS based load balancing?

@xenoterracide I am not sure I understand what you are referring to as "just basic network"? My understanding of this issue is to implement support for DNS SRV queries for a swarm service (with the answer pointing to the service's task containers). Are you looking for SRV support against individual container names (launched directly using docker run)?

ddebroy avatar Jan 11 '18 21:01 ddebroy

By "just basic network" I mean, e.g., running docker-compose (or docker run) without swarm, so that I can simply run the same cluster locally.

> Are you looking for SRV support against individual container names (launched directly using docker run)?

I could see either name or alias working. The DNS alias is the best match for me, since I set it explicitly so that it stays consistent regardless.

> you launching services using swarm in a cluster but want to use HAProxy to perform the load balancing directly on the swarm task containers rather than go through swarm's IPVS based load balancing

Also yes, but for other things in addition to load balancing, such as TLS termination, caching, and HTTP header addition/translation. (I didn't even know swarm had a load balancer until 10s ago.)

Here's one of my compose configurations. Whether in swarm or compose, or if I started the containers with run, I would expect that for master and slave, which both have EXPOSE 8080, I would be able to run a query like _http._tcp.dex.master. This particular haproxy will be responsible for both TLS termination and load balancing; I have another one, though, that needs to send updated headers because of CloudFront. Of course, I can see that this might require additional configuration as well, since SRV records can return ports that differ from what their names suggest. So you might have to be able to write something like srv: - http:8080 in the config, or better, ports: - 8080:8080:http, although in this case I have no desire to actually expose the service on the public network, so...

version: "3"
services:
  master:
    build: ./etc/docker/dex
    image: 927476265057.dkr.ecr.us-east-1.amazonaws.com/dex/dex:latest
    ports:
      - 8001:8000
      - 1098:1098
    volumes:
      - ./dex-ui/target:/mnt/ui
    environment:
      - JMX_PORT=1098
      - HIBERNATE_SEARCH_DEFAULT_WORKER_BACKEND=jgroupsMaster
    networks:
      cluster:
        aliases:
          - dex.master
  slave:
    build: ./etc/docker/dex
    image: 927476265057.dkr.ecr.us-east-1.amazonaws.com/dex/dex:latest
    ports:
      - 8002:8000
      - 1099:1099
    volumes:
      - ./dex-ui/target:/mnt/ui
    environment:
      - JMX_PORT=1099
      - HIBERNATE_SEARCH_DEFAULT_WORKER_BACKEND=jgroupsSlave
    networks:
      cluster:
        aliases:
          - dex.slave
  lb:
    build:
        context: etc/docker/load-balancer
        args:
        - CONFIG=haproxy.cfg
    ports:
    - 80:80
    - 443:443
    networks:
      cluster:
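A minimal Python sketch of how the proposed `ports: - 8080:8080:http` entry could map to an SRV query name such as `_http._tcp.dex.master`. The three-part port syntax is only the suggestion made in this comment, not an existing compose feature, and the helper names are hypothetical. The `_service._proto.name` layout is the standard SRV naming convention.

```python
from typing import Tuple


def parse_port_spec(spec: str) -> Tuple[int, int, str]:
    """Parse the hypothetical 'published:target:service' port syntax."""
    published, target, service = spec.split(":")
    return int(published), int(target), service


def srv_query_name(service: str, proto: str, name: str) -> str:
    """Build an SRV owner name of the form _service._proto.name."""
    return f"_{service}._{proto}.{name}"


def srv_for_alias(spec: str, alias: str, proto: str = "tcp") -> Tuple[str, int]:
    """Return the SRV query name and the container port it would point at."""
    _published, target, service = parse_port_spec(spec)
    return srv_query_name(service, proto, alias), target


print(srv_for_alias("8080:8080:http", "dex.master"))
```

The container-side port is used for the answer because a proxy on the same overlay network would connect to the container directly rather than to the published host port.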

xenoterracide avatar Jan 11 '18 23:01 xenoterracide

@ddebroy Yes. We need L7 HTTP routing with advanced rules and processing that only a full HTTP proxy like HAProxy offers. We're currently using the https://github.com/docker/dockercloud-haproxy but it has some bugs and is not maintained anymore.

It'd be nice to have a native and lightweight integration so that the proxy can reconfigure itself when services are scaled, started, or stopped. HAProxy 1.8 introduces support for automatic reconfiguration using SRV records. Since docker already provides all of its discovery through DNS, that would be even better than the current method used by dockercloud, which requires access to the docker API.

blop avatar Jan 11 '18 23:01 blop

@ddebroy

> I wanted to clarify something about the HAProxy use case mentioned by @blop and @johan-adriaans: are you launching services using swarm in a cluster but want to use HAProxy to perform the load balancing directly on the swarm task containers rather than go through swarm's IPVS based load balancing?

Yes, for running multiple web apps on a single swarm cluster. The HAProxy container is running in vip mode, then routing to different apps via connected overlay networks.

cjbottaro avatar Mar 09 '18 19:03 cjbottaro

This would be a very cool feature. I am currently missing it both on swarm and in local docker-compose!

/push

nhh avatar May 28 '18 19:05 nhh

Came across an issue where SRV records would be extremely helpful to solve my problem. Any idea when this will be implemented?

burgoyn1 avatar Jul 12 '18 09:07 burgoyn1

+1 here! I am configuring a Ceph cluster to run exclusively on Swarm. It has a recent feature for searching for monitor nodes using SRV records and I did miss this here. I wish I could declare a label on a service (say, "srv-dns=ceph-mon") and Swarm would automatically add the service name to its internal DNS.

flaviostutz avatar Jul 18 '18 16:07 flaviostutz

+1 here

Would like to use several records to have readable metrics.

phoenix741 avatar Aug 15 '18 13:08 phoenix741

+1

It would be great to use endpoint_mode: vip, but also resolve the tasks associated with a service via dns.

phifty avatar Dec 05 '18 09:12 phifty

+1

I'd like to deploy an ETCD cluster that can be discovered using SRV records.

osegarra avatar May 03 '19 15:05 osegarra

+1 I'd like to use Swarm's own discovery and decrease complexity

gboddin avatar Jun 05 '19 08:06 gboddin

We also need this for use with Prometheus, not only with Swarm but with regular docker-compose.

GCSBOSS avatar Feb 10 '20 17:02 GCSBOSS

+1 seems like a logical thing. How are people even using etcd without this?

chrisbecke avatar Jun 17 '20 12:06 chrisbecke

So is this dead?

doctorpangloss avatar Jul 20 '20 20:07 doctorpangloss

+1 I need it with Prometheus as well~~

UchihaYuki avatar May 09 '21 07:05 UchihaYuki

I need HAProxy because I need to inject some HTTP headers into the request before querying the backend (graylog), and also for fine-grained control of each backend's usage. Graylog can notify HAProxy of its own state, so it can fake its own death and prevent HAProxy from opening new connections.

juanjo-vlc avatar Jul 05 '21 07:07 juanjo-vlc

Another plus one here. I tried to set up a RabbitMQ cluster in Swarm, and service discovery was a major pain point. Having a way to add SRV records would alleviate this.

Radiergummi avatar Mar 23 '24 18:03 Radiergummi