Only one A record set for headless service with pods having single hostname.
/kind bug
What happened
When a headless service is created to point to pods that share a single hostname (which happens, for example, when the hostname field is set in the pod template of a Deployment/ReplicaSet):
- Only one A record is returned for the service DNS name
- A pod DNS name is generated based on this host name, which points to a single pod
What was expected to happen
- Return A records for all available endpoints on the service DNS name
- Not sure what the correct behaviour should be for the pod DNS name: either also return multiple A records, or don't create the record at all.
This seems to be caused by the following code: https://github.com/kubernetes/dns/blob/master/pkg/dns/dns.go#L490
There, endpointName is equal for every pod in the service that shares the hostname, so the entry in subCache gets overwritten.
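To illustrate the overwrite, here is a minimal sketch (not the actual kube-dns code; the endpoint type and buildSubCache are hypothetical simplifications) of what happens when a record cache is keyed by pod hostname: both pods share "depl-1-host", so the second write clobbers the first and only one A record survives.

```go
package main

import "fmt"

type endpoint struct{ podName, hostname, ip string }

// buildSubCache keys the cache by hostname, mirroring the collision described
// above: pods sharing a hostname collide on the same key, so later writes
// overwrite earlier ones.
func buildSubCache(endpoints []endpoint) map[string]string {
	subCache := map[string]string{} // hostname -> A record IP
	for _, ep := range endpoints {
		subCache[ep.hostname] = ep.ip // same key for both pods: first IP is lost
	}
	return subCache
}

func main() {
	eps := []endpoint{
		{"depl-1-abc", "depl-1-host", "10.56.0.140"},
		{"depl-1-def", "depl-1-host", "10.56.0.141"},
	}
	cache := buildSubCache(eps)
	fmt.Println(len(cache)) // 1: two pods, but only one A record remains
}
```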
How to reproduce
Apply the following spec:
```yaml
apiVersion: v1
kind: List
items:
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    name: depl-1
  spec:
    replicas: 2
    template:
      metadata:
        labels:
          app: depl-1
      spec:
        hostname: depl-1-host
        subdomain: depl-1-service
        containers:
        - name: test
          args:
          - bash
          stdin: true
          tty: true
          image: debian:jessie
- apiVersion: v1
  kind: Service
  metadata:
    name: depl-1-service
  spec:
    clusterIP: None
    selector:
      app: depl-1
    ports:
    - port: 5000
```
Resolving the names returns only a single A record:
```
# host depl-1-host.depl-1-service.default.svc.cluster.local
depl-1-host.depl-1-service.default.svc.cluster.local has address 10.56.0.140
# host depl-1-service.default.svc.cluster.local
depl-1-service.default.svc.cluster.local has address 10.56.0.140
```
PTR records ARE being created for all the pods, all resolving back to the single hostname. This is expected behaviour.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Prevent issues from auto-closing with an /lifecycle frozen comment.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale
This is a real bug that I expect people to hit. :( /lifecycle frozen
Yep, just ran into this today while trying to use a headless service to permit direct access to pods in a ReplicaSet. I tried working around it by omitting the hostname from the pod definition, but kube-dns just ignored the pod altogether.
For what it's worth, our use case is to provide a mechanism to pass instructions to individual pods within the ReplicaSet (to poll status, instruct it to quiesce, etc.), and absent a way to address the pod via DNS, we have to hack around the problem (in our case, telling the pod to publish its IP address to our database). Looking forward to seeing this resolved.
Oh my god, I need this feature. I am happy to help if someone can give me pointers.
/assign @krmayankk
I need this feature urgently and would like to help fix it. Who would be the right person to engage with to fix this, @thockin?
I believe we verified that this already works in CoreDNS. Check out https://github.com/coredns/deployment/tree/master/kubernetes to see how to deploy it.
Thanks @johnbelamaric. Could you point to the code where this feature was implemented in CoreDNS? We are using 1.9, where CoreDNS is alpha. I would love to move to CoreDNS once migrating from kube-dns is seamless.
I don't think there is specific code to handle this case; it just falls out of how we process the services, endpoints, and requests. I think we might have a test for it. @chrisohaver, do you know?
It's not a special case per se, the code continues to look for all endpoints that match the query name. It doesn't stop when it finds the first match. I just added a test for it... coredns/coredns#1811
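That contrast can be sketched as follows. This is an assumption about the shape of the fix, not CoreDNS's actual code (lookupAll and the endpoint type are hypothetical): instead of keying a map by hostname, the lookup collects every endpoint that matches the query name, yielding one A record per pod.

```go
package main

import "fmt"

type endpoint struct{ hostname, ip string }

// lookupAll keeps every matching endpoint rather than stopping at (or
// overwriting) the first, so pods that share a hostname each contribute
// an A record to the answer.
func lookupAll(endpoints []endpoint, qname string) []string {
	var ips []string
	for _, ep := range endpoints {
		if ep.hostname == qname {
			ips = append(ips, ep.ip)
		}
	}
	return ips
}

func main() {
	eps := []endpoint{
		{"depl-1-host", "10.56.0.140"},
		{"depl-1-host", "10.56.0.141"},
	}
	fmt.Println(lookupAll(eps, "depl-1-host")) // [10.56.0.140 10.56.0.141]
}
```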