
proposal: executors, that publish DiscoveryInfo, should get A and SRV records

Open jdef opened this issue 10 years ago • 5 comments

this proposal establishes a new namespace for executor-provided services in mesos-dns:

  • {framework}.exec.{domain}

for example, in kubernetes-mesos the executor runs a "kubelet" and a "kube-proxy" process, both of which are shared across all tasks on a slave. each process exposes useful ports, and it would be nice to be able to address the services on those ports.

it's important to remember that in kubernetes-world every task is a pod, so each task is already assigned to its own netns (and gets its own IP address, etc). the scheduler will likely also dynamically allocate ports for each task on the slave host, and those task ports have nothing to do with dynamically allocated executor ports.

examples of services exposed by kubelet and kube-proxy:

  • api & read-only api
  • health check
  • cadvisor

perhaps one way to generate records for these executor-provided services would be:

  • A :: {framework}.exec.mesos
    • resolves to (multiple) IP addresses, for all running executors of the framework
  • A :: {ename}.{framework}.exec.mesos
    • resolves to (multiple) IP addresses of the executor containers named {ename}
  • A :: eid-{eid-hash}.{framework}.exec.mesos
    • resolves to (unique) IP address of executor container given a specific executor.id
  • SRV :: _{port-name}._{proto}.{framework}.exec.mesos
    • resolves to (multiple) eid-{eid-hash}.{framework}.exec.mesos:{di-port-number}
    • only generated if the requisite DiscoveryInfo is available

where (the following are required):

  • {eid-hash} --> hash-of(ExecutorInfo.id + framework-id + slave-id + other salt); for uniqueness
  • {ename} --> ExecutorInfo.discovery.name, or else ExecutorInfo.name
  • {port-name} --> ExecutorInfo.discovery.ports.ports[x].name
  • {di-port-number} --> ExecutorInfo.discovery.ports.ports[x].number
  • {proto} --> ExecutorInfo.discovery.ports.ports[x].protocol, or else both _tcp and _udp
  • {framework} --> name of the framework instance running on the cluster
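as a rough illustration of the naming rules above, here is a minimal Go sketch that builds the proposed A and SRV record names for a single executor. the `Executor` and `Port` types are hypothetical, trimmed-down stand-ins for ExecutorInfo and its DiscoveryInfo ports (they are not the mesos-dns types), `{eid-hash}` is assumed to be precomputed, and the port number in `main` is only an example value. DNS label sanitization is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Port is a hypothetical stand-in for one entry of
// ExecutorInfo.discovery.ports.ports.
type Port struct {
	Name     string
	Number   int
	Protocol string // empty means "generate both _tcp and _udp"
}

// Executor is a hypothetical, trimmed-down view of ExecutorInfo.
type Executor struct {
	Name          string // ExecutorInfo.name
	DiscoveryName string // ExecutorInfo.discovery.name, may be empty
	EIDHash       string // precomputed {eid-hash}
	Ports         []Port // empty when no DiscoveryInfo is available
}

// aNames returns the three A record names proposed for one executor.
func aNames(framework, domain string, e Executor) []string {
	ename := e.DiscoveryName
	if ename == "" {
		ename = e.Name // fall back to ExecutorInfo.name
	}
	base := framework + ".exec." + domain
	return []string{
		base,
		ename + "." + base,
		"eid-" + e.EIDHash + "." + base,
	}
}

// srvRecords maps each SRV record name to its "target:port" value.
// No DiscoveryInfo ports means no SRV records.
func srvRecords(framework, domain string, e Executor) map[string]string {
	base := framework + ".exec." + domain
	out := map[string]string{}
	for _, p := range e.Ports {
		protos := []string{"tcp", "udp"}
		if p.Protocol != "" {
			protos = []string{strings.ToLower(p.Protocol)}
		}
		for _, proto := range protos {
			name := "_" + p.Name + "._" + proto + "." + base
			out[name] = fmt.Sprintf("eid-%s.%s:%d", e.EIDHash, base, p.Number)
		}
	}
	return out
}

func main() {
	e := Executor{
		Name:    "kubelet",
		EIDHash: "abc123",
		Ports:   []Port{{Name: "api", Number: 10250, Protocol: "tcp"}},
	}
	fmt.Println(aNames("kubernetes", "mesos", e))
	fmt.Println(srvRecords("kubernetes", "mesos", e))
}
```

a real generator would merge the results across all executors of a framework, so that a single SRV name can point at several eid-hashed targets.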

the above allows us to identify the location of all kubelet API services in the cluster via:

  • _api._tcp.kubernetes.exec.mesos

furthermore, if the hash-of(.. + salt) algorithm is known to the framework, the k8s framework can refer to specific executor instances (aka k8s "nodes") via the executor-id hashed name:

  • eid-{eid-hash}.{framework}.exec.mesos
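the proposal does not fix a hash algorithm, so the following Go sketch is only one possible choice: it assumes SHA-256 over the NUL-separated inputs, truncated to 20 hex characters so the resulting label stays well under the 63-character DNS label limit. the function names and the truncation length are assumptions, not part of mesos-dns.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// eidHash derives a DNS-safe identifier from the executor id,
// framework id, slave id, and an optional salt. The NUL separator
// keeps ("ab", "c") and ("a", "bc") from hashing to the same value.
func eidHash(executorID, frameworkID, slaveID, salt string) string {
	joined := strings.Join([]string{executorID, frameworkID, slaveID, salt}, "\x00")
	sum := sha256.Sum256([]byte(joined))
	return hex.EncodeToString(sum[:10]) // 10 bytes -> 20 hex characters
}

// eidRecordName builds eid-{eid-hash}.{framework}.exec.{domain}.
func eidRecordName(hash, framework, domain string) string {
	return "eid-" + hash + "." + framework + ".exec." + domain
}

func main() {
	h := eidHash("kubelet-executor", "framework-id", "slave-id", "")
	fmt.Println(eidRecordName(h, "kubernetes", "mesos"))
}
```

as long as the framework and mesos-dns agree on the inputs and the algorithm, both sides can compute the same name independently.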

(+) this approach is compatible with multiple instances of the same framework in the same cluster, provided that they are registered with different framework names
(+) this approach is compatible with multiple framework executors (for a single framework) running on a single slave, provided that they have unique names
(+) this approach is compatible with the A and SRV record generation semantics recently introduced for the {framework}.slave.mesos namespace, established by #226

/cc @kozyraki @tsenart @sttts

jdef avatar Aug 11 '15 22:08 jdef

xref https://github.com/GoogleCloudPlatform/kubernetes/pull/11224

jdef avatar Aug 11 '15 22:08 jdef

This all sounds perfectly reasonable overall. A few questions and remarks:

  1. Is there a technical reason for prefixing the A records with eid? I'd rather have {eid-hash}.{framework}.exec.mesos.
  2. Why do we need an extra salt in {eid-hash}?
  3. I'd expect A::{framework}.exec.mesos to return all executor IPs for a given framework.
  4. Are SRV records skipped when no DI is available?
  5. This question has been itching me: Consul generates SRV records that have the same name as A records, as well as the ones with underscores. Why can't we do the same? https://www.consul.io/docs/agent/dns.html

tsenart avatar Aug 12 '15 16:08 tsenart

This all sounds perfectly reasonable overall. A few questions and remarks:

  1. Is there a technical reason for prefixing the A records with eid? I'd rather have {eid-hash}.{framework}.exec.mesos.

initial thinking was that it would help to distinguish between executor names and hashed executor ids. it doesn't really reduce the opportunity for collisions, unless people recognize that they may not want to use the eid- prefix for their executor names.

  2. Why do we need an extra salt in {eid-hash}?

may need salt to get a reasonable bit distribution in the hash function. salt isn't a hard requirement for me.

  3. I'd expect A::{framework}.exec.mesos to return all executor IPs for a given framework.

sounds good to me, i'll update the proposal.

  4. Are SRV records skipped when no DI is available?

yes, i'll make that clear in the proposal.

  5. This question has been itching me: Consul generates SRV records that have the same name as A records, as well as the ones with underscores. Why can't we do the same? https://www.consul.io/docs/agent/dns.html

we probably could. that sounds like another proposal :)

jdef avatar Aug 14 '15 15:08 jdef

Thanks for the answers. Regarding the eid- prefix: Is there really a need to distinguish between executor names and executor ids?

tsenart avatar Aug 14 '15 15:08 tsenart

Probably doesn't matter to mesos-dns; might matter to humans.

jdef avatar Aug 14 '15 15:08 jdef