
Pods can't resolve DNS

Open jkleinkauff opened this issue 1 year ago • 3 comments

Environmental Info: K3s Version: k3s version v1.28.8+k3s1 (653dd61a) go version go1.21.8

Node(s) CPU architecture, OS, and Version: (docker agent)Linux 0b5bf15de242 6.6.16-linuxkit #1 SMP Fri Feb 16 11:54:02 UTC 2024 aarch64 GNU/Linux

Cluster Configuration:

  • My cluster contains three RPi boards and they are working pretty well. Now I'm trying to add a new agent using Docker
  • This new agent is running on my machine (macOS 14.3.1 23D60 arm64)

Describe the bug: The pods created on this new agent can't resolve DNS. The way I'm testing is to exec into the pod and nslookup something.
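To reproduce that test, a minimal throwaway pod like the following can be scheduled onto the new agent and then used to run nslookup; the pod name, image, and toleration values are illustrative (the toleration matches the taint set in the compose file below):

```yaml
# Hypothetical DNS test pod; name/image are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: dns-test
spec:
  nodeSelector:
    machine: macmachine
  tolerations:
    - key: isMacMachine
      operator: Equal
      value: "true"
      effect: NoSchedule
  containers:
    - name: dns-test
      image: busybox:1.36
      command: ["sleep", "3600"]
```

Then: `kubectl exec -it dns-test -- nslookup google.com`.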

Steps To Reproduce: The docker-compose file I'm using is the following:

services:
  agent:
    image: "rancher/k3s:${K3S_VERSION:-latest}"
    command:
      - agent
      - --node-name
      - macmachine-docker1
      - --node-label
      - "machine=macmachine"
      - --node-taint
      - "isMacMachine=true:NoSchedule"
      - --resolv-conf
      - "/var/lib/rancher/k3s/agent/etc/resolv.conf"
    tmpfs:
    - /run
    - /var/run
    ulimits:
      nproc: 65535
      nofile:
        soft: 65535
        hard: 65535
    privileged: true
    restart: always
    environment:
    - K3S_URL=https://192.168.15.180:6443
    - K3S_TOKEN=${K3S_TOKEN:?err}
    - K3D_FIX_DNS=1
    volumes:
    - k3s-agent:/var/lib/rancher/k3s

The --resolv-conf value is probably wrong. I'm just playing with the parameter. I have seen other similar issues but I just can't make it work. Sorry for opening another similar issue.
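One hedged workaround, assuming the underlying problem is that the agent's resolv.conf points at a resolver pods can't reach, is to hand the agent a resolv.conf listing a reachable upstream. The file path and nameserver addresses below are only examples, not k3s defaults:

```shell
# Sketch: write a resolv.conf with publicly reachable upstreams and point
# the k3s agent at it via --resolv-conf. Path and nameservers are illustrative.
cat > /tmp/k3s-resolv.conf <<'EOF'
nameserver 1.1.1.1
nameserver 8.8.8.8
EOF

# In the docker-compose command list this would become:
#   - --resolv-conf
#   - /tmp/k3s-resolv.conf
# (with the file mounted into the agent container).
cat /tmp/k3s-resolv.conf
```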

When deploying the pod with hostNetwork: true, DNS resolution works as expected.
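That is a useful clue: for a hostNetwork pod with the default dnsPolicy (ClusterFirst), the kubelet hands the pod the node's own resolv.conf rather than cluster DNS, so lookups bypass CoreDNS entirely. A minimal sketch of such a pod (names are illustrative):

```yaml
# Illustrative pod; with hostNetwork and the default dnsPolicy, the pod
# inherits the node's resolv.conf instead of the cluster DNS service.
apiVersion: v1
kind: Pod
metadata:
  name: dns-test-hostnet
spec:
  hostNetwork: true
  # dnsPolicy: ClusterFirstWithHostNet  # would opt back in to cluster DNS
  containers:
    - name: dns-test
      image: busybox:1.36
      command: ["sleep", "3600"]
```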

Expected behavior: Pods should be able to resolve any name.

Actual behavior: nslookup inside the pod:

$ nslookup google.com
;; connection timed out; no servers could be reached

Additional context / logs: /etc/resolv.conf inside the pod:

search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.43.0.10
options ndots:5

/etc/resolv.conf on the agent:

nameserver 127.0.0.11
options ndots:0
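The agent's nameserver 127.0.0.11 is Docker's embedded DNS, which only exists inside the agent container's network namespace; a loopback nameserver passed along by the kubelet is unreachable from pod network namespaces, which matches the timeout above. A quick, hedged check for this condition (the script and its messages are illustrative):

```shell
# Detect loopback nameservers in a resolv.conf; these cannot usefully be
# handed to pods or to CoreDNS as an upstream.
check_resolv() {
  if grep -Eq '^[[:space:]]*nameserver[[:space:]]+127\.' "$1"; then
    echo "loopback nameserver in $1: pods will not be able to resolve DNS"
  else
    echo "$1 has no loopback nameservers"
  fi
}

# Recreate the agent's resolv.conf from this issue and check it.
printf 'nameserver 127.0.0.11\noptions ndots:0\n' > /tmp/agent-resolv.conf
check_resolv /tmp/agent-resolv.conf
```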

jkleinkauff avatar Apr 11 '24 00:04 jkleinkauff

I see you have K3D environment variables set in your docker-compose variables. Are you using k3d, or k3s directly?

I will also note that the k3s docker image is not meant to be used for multi-host clusters; it is meant for running a single-node docker container, or a multi-node cluster where all the nodes are just containers on the same host. If you want to distribute your cluster across multiple hosts, install K3s on the host directly (or on a VM if using mac/windows).

brandond avatar Apr 11 '24 15:04 brandond

@brandond yeah, while I was researching the problem I came across a similar thread in k3d and I was just testing it. But you are right, it would make more sense to deploy directly on the machine. The reasoning was just to test and play with different "nodes" in my cluster. Thank you!

jkleinkauff avatar Apr 11 '24 17:04 jkleinkauff

@brandond Sorry for the callout, just want to vent here since it's the same question. On my host machine (mac) I created a multipass VM (ubuntu). Then I installed k3s with the installation script (agent). The node is attached and listed, but the same problem occurs: I'm unable to access anything from inside the pods created on the new node. Is it the same problem you described? Could you give me any hint on what to look for?

The command I'm running inside the mp vm: curl -sfL https://get.k3s.io | K3S_URL=https://192.168.15.180:6443 K3S_TOKEN=<mytoken> sh -s - --node-name multi-ubuntu-node-0 --node-label "machine=macmachine" --node-taint "isMacMachine=true:NoSchedule" --resolv-conf ""

NAME                  STATUS   ROLES                  AGE    VERSION
rpiworker8a           Ready    <none>                 654d   v1.29.3+k3s1
rpi2gmaster           Ready    control-plane,master   654d   v1.29.3+k3s1
multi-ubuntu-node-0   Ready    <none>                 23m    v1.29.3+k3s1
orangepi5             Ready    <none>                 452d   v1.29.3+k3s1

jkleinkauff avatar Apr 14 '24 00:04 jkleinkauff

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 45 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

github-actions[bot] avatar May 29 '24 20:05 github-actions[bot]