k3s
Pods can't resolve DNS
Environmental Info: K3s Version: k3s version v1.28.8+k3s1 (653dd61a) go version go1.21.8
Node(s) CPU architecture, OS, and Version: (docker agent) Linux 0b5bf15de242 6.6.16-linuxkit #1 SMP Fri Feb 16 11:54:02 UTC 2024 aarch64 GNU/Linux
Cluster Configuration:
- My cluster contains three Raspberry Pi boards, and they are working pretty well. Now I'm trying to add a new agent using Docker
- This new agent is running on my machine (macOS 14.3.1 23D60 arm64)
Describe the bug: Pods created on this new agent can't resolve DNS. The way I'm testing is to exec into a pod and run nslookup against some name.
Steps To Reproduce: The docker-compose file I'm using is the following:
```yaml
services:
  agent:
    image: "rancher/k3s:${K3S_VERSION:-latest}"
    command:
      - agent
      - --node-name
      - macmachine-docker1
      - --node-label
      - "machine=macmachine"
      - --node-taint
      - "isMacMachine=true:NoSchedule"
      - --resolv-conf
      - "/var/lib/rancher/k3s/agent/etc/resolv.conf"
    tmpfs:
      - /run
      - /var/run
    ulimits:
      nproc: 65535
      nofile:
        soft: 65535
        hard: 65535
    privileged: true
    restart: always
    environment:
      - K3S_URL=https://192.168.15.180:6443
      - K3S_TOKEN=${K3S_TOKEN:?err}
      - K3D_FIX_DNS=1
    volumes:
      - k3s-agent:/var/lib/rancher/k3s
```
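As an aside on that flag: --resolv-conf is passed through to the kubelet and names the resolv.conf file whose nameservers are used as the basis for pod DNS configuration. If the goal is just to give the containerized agent a known-good upstream, one option is to mount a hand-written file and point the flag at it. This is only a sketch; the ./k3s-resolv.conf path and the 1.1.1.1 upstream are assumptions, not something from this thread:

```yaml
# docker-compose fragment (sketch): give the agent a static resolv.conf
# with a reachable upstream, and point kubelet's --resolv-conf at it.
services:
  agent:
    command:
      - agent
      - --resolv-conf
      - /etc/k3s-resolv.conf
    volumes:
      - ./k3s-resolv.conf:/etc/k3s-resolv.conf:ro   # file contains: nameserver 1.1.1.1
```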
The --resolv-conf value is probably wrong; I'm just experimenting with the parameter. I have seen other similar issues, but I just can't make it work. Sorry for opening another similar issue.
When deploying a pod with hostNetwork: true, DNS resolution works as expected.
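For reference, a throwaway pod for comparing the two cases might look like this (the pod name and busybox image are illustrative, not from the thread):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-test
spec:
  # With hostNetwork: true the kubelet treats the default ClusterFirst
  # dnsPolicy as Default, so the pod inherits the node's resolv.conf
  # instead of pointing at the CoreDNS ClusterIP.
  hostNetwork: true
  containers:
    - name: test
      image: busybox:1.36
      command: ["sleep", "3600"]
```

Dropping hostNetwork: true from the same manifest reproduces the failing case, which suggests the problem is the pod network's route to the CoreDNS ClusterIP rather than CoreDNS itself.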
Expected behavior: My pods can resolve any name.
Actual behavior: nslookup inside the pod:

```
$ nslookup google.com
;; connection timed out; no servers could be reached
```
Additional context / logs: /etc/resolv.conf inside the pod:

```
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.43.0.10
options ndots:5
```

/etc/resolv.conf on the agent:

```
nameserver 127.0.0.11
options ndots:0
```
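The ndots:5 line explains why even a plain `nslookup google.com` fans out into several queries: a name with fewer than five dots has each search domain appended before the bare name is tried, and every one of those lookups is sent to the CoreDNS ClusterIP 10.43.0.10, which this pod evidently can't reach. A minimal sketch of the resolv.conf(5) search semantics (the function name is illustrative):

```python
def candidate_names(name, search_domains, ndots=5):
    """Build the list of names a glibc-style resolver tries,
    per resolv.conf(5) ndots/search semantics."""
    if name.endswith("."):                 # fully qualified: tried as-is only
        return [name]
    candidates = []
    if name.count(".") >= ndots:           # "many dots": absolute name first
        candidates.append(name + ".")
    candidates += [f"{name}.{d}" for d in search_domains]
    if name.count(".") < ndots:            # "few dots": absolute name last
        candidates.append(name + ".")
    return candidates

search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
for n in candidate_names("google.com", search):
    print(n)   # each of these queries goes to nameserver 10.43.0.10
```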
I see you have K3D environment variables set in your docker-compose file. Are you using k3d, or k3s directly?
I will also note that the k3s Docker image is not meant to be used for multi-host clusters; it is meant for running a single-node Docker container, or a multi-node cluster where all the nodes are just containers on the same host. If you want to distribute your cluster across multiple hosts, install K3s on the host directly (or on a VM if using macOS/Windows).
@brandond Yeah, while I was searching for the problem I came across a similar thread in k3d and was just testing it. But you are right, it would make more sense to deploy directly on the machine. The reasoning was just to test and play with different "nodes" in my cluster. Thank you!
@brandond Sorry for the callout; I just want to vent here since it is the same question. On my host machine (mac) I created a multipass VM (Ubuntu), then installed k3s with the installation script (agent). The node is attached and listed, but the same problem occurs: I'm unable to access anything from inside the pods created on the new node. Is it the same problem you described? Could you give me any hint on what to look for?
The command I'm running inside the multipass VM:

```
curl -sfL https://get.k3s.io | K3S_URL=https://192.168.15.180:6443 K3S_TOKEN=<mytoken> sh -s - --node-name multi-ubuntu-node-0 --node-label "machine=macmachine" --node-taint "isMacMachine=true:NoSchedule" --resolv-conf ""
```
```
NAME                  STATUS   ROLES                  AGE    VERSION
rpiworker8a           Ready    <none>                 654d   v1.29.3+k3s1
rpi2gmaster           Ready    control-plane,master   654d   v1.29.3+k3s1
multi-ubuntu-node-0   Ready    <none>                 23m    v1.29.3+k3s1
orangepi5             Ready    <none>                 452d   v1.29.3+k3s1
```
This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 45 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.