
[Bug]: autoscaled nodes have more than 3 DNS entries

Open tobiasehlert opened this issue 7 months ago • 3 comments

Description

This is the error returned by multiple pods on my autoscaler nodes:

Message: Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 185.12.64.2 185.12.64.1 2a01:4ff:ff00::add:2
Reason: DNSConfigForming
Source: kubelet k3s-01-autoscaler-cax21-nbg1-302f5cae41733fd6
Type: Warning

Apparently Kubernetes only supports 3 nameserver entries (source: #689).
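
For anyone wanting to check their own cluster, a quick way to see which pods and nodes hit this limit is to filter events on the DNSConfigForming reason (the pod name in the second command is just a placeholder):

# List DNSConfigForming warnings across all namespaces; the affected
# autoscaler nodes show up in the event source.
kubectl get events -A --field-selector reason=DNSConfigForming

# Inspect the resolv.conf the kubelet actually hands to a pod scheduled
# on one of the affected nodes (replace some-pod with a real pod name).
kubectl exec -n default some-pod -- cat /etc/resolv.conf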

Looking at both a worker node and an autoscaler node, here is how /etc/resolv.conf looks on the two servers for comparison.

k3s-01-worker-cax21-nbg1-cet:/ # cat /etc/resolv.conf 
nameserver 185.12.64.1
nameserver 185.12.64.2
nameserver 2a01:4ff:ff00::add:1
k3s-01-autoscaler-cax21-nbg1-302f5cae41733fd6:/ # cat /etc/resolv.conf 
# Generated by NetworkManager
nameserver 185.12.64.2
nameserver 185.12.64.1
nameserver 2a01:4ff:ff00::add:2
# NOTE: the libc resolver may not support more than 3 nameservers.
# The nameservers listed below may not be recognized.
nameserver 2a01:4ff:ff00::add:1

So my initial thought is that something is missing when booting an autoscaler node, which gets bootstrapped with cloudinit_config, but I haven't figured out what yet.
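
To help reproduce, here is roughly how I have been poking at an affected autoscaler node. This is only a diagnostic sketch using standard cloud-init and NetworkManager locations; the workaround at the end is untested and not something the module does itself:

# User data the node was bootstrapped with (rendered cloudinit_config)
cat /var/lib/cloud/instance/user-data.txt

# DNS servers NetworkManager picked up for the primary interface
nmcli dev show eth0 | grep -i dns

# Check whether any drop-in already constrains NetworkManager's DNS handling
grep -ri "dns" /etc/NetworkManager/conf.d/ 2>/dev/null

# Possible workaround (untested): stop NetworkManager from rewriting
# /etc/resolv.conf and pin it to at most three nameservers, matching
# what the regular worker nodes end up with.
printf '[main]\ndns=none\n' > /etc/NetworkManager/conf.d/90-dns-none.conf
printf 'nameserver 185.12.64.1\nnameserver 185.12.64.2\nnameserver 2a01:4ff:ff00::add:1\n' > /etc/resolv.conf
systemctl restart NetworkManager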

Kube.tf file

module "kube-hetzner" {
  source  = "kube-hetzner/kube-hetzner/hcloud"
  version = "2.14.1"

  // provider and hcloud token config
  providers = {
    hcloud = hcloud
  }
  hcloud_token = var.hcloud_token

  // ssh key parameters
  ssh_public_key    = hcloud_ssh_key.tibiadata_ssh_key["tobias_ed25519"].public_key
  ssh_private_key   = null
  hcloud_ssh_key_id = hcloud_ssh_key.tibiadata_ssh_key["tobias_ed25519"].id

  // network parameters
  existing_network_id = [hcloud_network.net.id]
  network_ipv4_cidr   = module.net_k3s.base_cidr_block
  cluster_ipv4_cidr   = module.net_k3s.network_cidr_blocks.cluster
  service_ipv4_cidr   = module.net_k3s.network_cidr_blocks.service
  cluster_dns_ipv4    = cidrhost(module.net_k3s.network_cidr_blocks.service, 10)

  // control plane nodepools
  control_plane_nodepools = [
    for location in ["fsn1", "hel1", "nbg1", ] : {
      name        = "control-plane-${location}",
      server_type = "cax11",
      location    = location,
      labels      = [],
      taints      = [],
      count       = 1
    }
  ]

  agent_nodepools = concat(
    # egress nodepool
    [for location in [for dc in data.hcloud_datacenter.ds : dc.location.name] : {
      // [for location in ["fsn1", "hel1", "nbg1"] : {
      name        = "egress-cax11-${location}",
      server_type = "cax11",
      location    = location,
      labels = [
        "node.kubernetes.io/role=egress"
      ],
      taints = [
        "node.kubernetes.io/role=egress:NoSchedule"
      ],
      floating_ip = true
      count       = 1
    }],

    # worker nodepools (dynamically created)
    [for location in [for dc in data.hcloud_datacenter.ds : dc.location.name] : {
      // [for location in ["fsn1", "hel1", "nbg1"] : {
      name        = "worker-cax21-${location}",
      server_type = "cax21",
      location    = location,
      labels      = [],
      taints      = [],
      count       = 2
    }]
  )

  autoscaler_nodepools = concat(
    [for location in [for dc in data.hcloud_datacenter.ds : dc.location.name] : {
      // [for location in ["fsn1", "hel1", "nbg1"] : {
      name        = "autoscaler-cax21-${location}",
      server_type = "cax21",
      location    = location,
      min_nodes   = 1,
      max_nodes   = 2,
      labels = {
        "node.kubernetes.io/role" : "autoscaler",
      },
      taints = [{
        key : "node.kubernetes.io/role",
        value : "autoscaler",
        effect : "NoSchedule",
      }],
    }]
  )

  # firewall whitelisting (for Kube API and SSH)
  firewall_kube_api_source = [for ip in tolist(var.firewall_whitelisting.kube) : "${ip}/32"]
  firewall_ssh_source      = [for ip in tolist(var.firewall_whitelisting.ssh) : "${ip}/32"]

  # cluster generic
  cluster_name        = "k3s-01"
  additional_tls_sans = ["k3s-01.${var.fqdn_domain}"]
  base_domain         = "k3s-01.${var.fqdn_domain}"
  cni_plugin          = "cilium"
  disable_kube_proxy  = true # kube-proxy is replaced by cilium (set in cilium_values)

  # cilium parameters
  cilium_version = "v1.15.1"
  cilium_values  = <<EOT
ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
kubeProxyReplacementHealthzBindAddr: "0.0.0.0:10256"
k8sServiceHost: "127.0.0.1"
k8sServicePort: "6444"
routingMode: "native"
ipv4NativeRoutingCIDR: "${module.net_k3s.network_cidr_blocks.cluster}"
installNoConntrackIptablesRules: true
endpointRoutes:
  enabled: true
loadBalancer:
  acceleration: native
bpf:
  masquerade: true
encryption:
  enabled: true
  nodeEncryption: true
  type: wireguard
egressGateway:
  enabled: true
MTU: 1450
  EOT

  # Hetzner delete protection
  enable_delete_protection = {
    floating_ip = true
  }

  # various parameters
  ingress_controller   = "none"
  enable_cert_manager  = false
  block_icmp_ping_in   = true
  create_kubeconfig    = false
  create_kustomization = false
}

Screenshots

No response

Platform

Linux

tobiasehlert · Jul 24 '24 13:07