Node-local-dns doesn't work with cilium CNI on kops 1.29.0
/kind bug
1. What kops version are you running? The command kops version will display this information.
Client version: 1.29.0 (git-v1.29.0)
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
1.28.7
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue? Update Kops from 1.28.4 to 1.29.0, or create a new cluster using Kops 1.29.0 with Node Local DNS and Cilium CNI.
5. What happened after the commands executed?
Pods on updated nodes cannot access node-local-dns pods
6. What did you expect to happen? Pods can access node-local-dns pods.
7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2024-05-31T07:47:47Z"
  name: k8s.tmp-test.example
spec:
  additionalSans:
  - api-internal.k8s.tmp-test.example
  - api.internal.k8s.tmp-test.example
  api:
    loadBalancer:
      type: Internal
      useForInternalApi: true
  authentication: {}
  authorization:
    rbac: {}
  certManager:
    enabled: true
    managed: false
  cloudConfig:
    awsEBSCSIDriver:
      enabled: true
  cloudProvider: aws
  clusterAutoscaler:
    enabled: true
  configBase: s3://example/k8s.tmp-test.example
  containerd:
    registryMirrors:
      '*':
      - https://nexus-proxy.example.io
      docker.io:
      - https://nexus-proxy.example.io
      k8s.gcr.io:
      - https://nexus-proxy.example.io
      public.ecr.aws:
      - https://nexus-proxy.example.io
      quay.io:
      - https://nexus-proxy.example.io
      registry.example.io:
      - https://registry.example.io
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: master-1a
      name: a
    - instanceGroup: master-1b
      name: b
    - instanceGroup: master-1c
      name: c
    manager:
      backupRetentionDays: 90
      env:
      - name: ETCD_LISTEN_METRICS_URLS
        value: http://0.0.0.0:2379
      - name: ETCD_METRICS
        value: basic
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MAX_REQUEST_BYTES
        value: "1572864"
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: master-1a
      name: a
    - instanceGroup: master-1b
      name: b
    - instanceGroup: master-1c
      name: c
    manager:
      backupRetentionDays: 90
      env:
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MAX_REQUEST_BYTES
        value: "1572864"
    memoryRequest: 100Mi
    name: events
  fileAssets:
  - content: |
      apiVersion: audit.k8s.io/v1
      kind: Policy
      rules:
      - level: RequestResponse
        userGroups:
        - "/devops"
        - "/developers"
        - "/teamleads"
        - "/k8s-full"
        - "/sre"
        - "/support"
        - "/qa"
        - "system:serviceaccounts"
    name: audit-policy-config
    path: /etc/kubernetes/audit/policy-config.yaml
    roles:
    - ControlPlane
  iam:
    legacy: false
  kubeAPIServer:
    auditLogMaxAge: 10
    auditLogMaxBackups: 1
    auditLogMaxSize: 100
    auditLogPath: /var/log/kube-apiserver-audit.log
    auditPolicyFile: /etc/kubernetes/audit/policy-config.yaml
    oidcClientID: kubernetes
    oidcGroupsClaim: groups
    oidcIssuerURL: https://sso.example.io/auth/realms/example
    serviceAccountIssuer: https://api-internal.k8s.tmp-test.example
    serviceAccountJWKSURI: https://api-internal.k8s.tmp-test.example/openid/v1/jwks
  kubeDNS:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kops.k8s.io/instancegroup
              operator: In
              values:
              - infra-nodes
    nodeLocalDNS:
      cpuRequest: 25m
      enabled: true
      memoryRequest: 5Mi
    provider: CoreDNS
    tolerations:
    - effect: NoSchedule
      key: dedicated/infra
      operator: Exists
  kubeProxy:
    enabled: false
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
    evictionHard: memory.available<7%,nodefs.available<3%,nodefs.inodesFree<5%,imagefs.available<10%,imagefs.inodesFree<5%
    evictionMaxPodGracePeriod: 30
    evictionSoft: memory.available<12%
    evictionSoftGracePeriod: memory.available=200s
  kubernetesApiAccess:
  - 10.170.0.0/16
  kubernetesVersion: 1.28.7
  masterPublicName: api.k8s.tmp-test.example
  metricsServer:
    enabled: false
    insecure: true
  networkCIDR: 10.170.0.0/16
  networkID: vpc-xxxx
  networking:
    cilium:
      enableNodePort: true
      enablePrometheusMetrics: true
      ipam: eni
  nodeProblemDetector:
    enabled: true
  nodeTerminationHandler:
    enableSQSTerminationDraining: false
    enabled: true
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 10.170.0.0/16
  subnets:
  - cidr: 10.170.140.0/24
    name: kops-k8s-1a
    type: Private
    zone: eu-central-1a
  - cidr: 10.170.142.0/24
    name: kops-k8s-eni-1a
    type: Private
    zone: eu-central-1a
  - cidr: 10.170.141.0/24
    name: kops-k8s-utility-1a
    type: Utility
    zone: eu-central-1a
  - cidr: 10.170.143.0/24
    name: kops-k8s-1b
    type: Private
    zone: eu-central-1b
  - cidr: 10.170.145.0/24
    name: kops-k8s-eni-1b
    type: Private
    zone: eu-central-1b
  - cidr: 10.170.144.0/24
    name: kops-k8s-utility-1b
    type: Utility
    zone: eu-central-1b
  - cidr: 10.170.146.0/24
    name: kops-k8s-1c
    type: Private
    zone: eu-central-1c
  - cidr: 10.170.148.0/24
    name: kops-k8s-eni-1c
    type: Private
    zone: eu-central-1c
  - cidr: 10.170.147.0/24
    name: kops-k8s-utility-1c
    type: Utility
    zone: eu-central-1c
  topology:
    dns:
      type: Private
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:48Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: graviton-nodes
spec:
  autoscale: false
  image: ami-0192de4261c8ff06a
  machineType: t4g.small
  maxSize: 1
  minSize: 1
  mixedInstancesPolicy:
    instances:
    - t4g.small
    onDemandAboveBase: 0
    onDemandBase: 0
    spotAllocationStrategy: lowest-price
    spotInstancePools: 3
  nodeLabels:
    kops.k8s.io/instancegroup: graviton-nodes
  role: Node
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1a
  - kops-k8s-eni-1a
  sysctlParameters:
  - net.netfilter.nf_conntrack_max = 1048576
  - net.core.netdev_max_backlog = 30000
  - net.core.rmem_max = 134217728
  - net.core.wmem_max = 134217728
  - net.ipv4.tcp_wmem = 4096 87380 67108864
  - net.ipv4.tcp_rmem = 4096 87380 67108864
  - net.ipv4.tcp_mem = 187143 249527 1874286
  - net.ipv4.tcp_max_syn_backlog = 8192
  - net.ipv4.ip_local_port_range = 10240 65535
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:48Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: infra-nodes
spec:
  autoscale: false
  image: ami-035f7f826413ac489
  machineType: t3.small
  maxSize: 1
  minSize: 1
  mixedInstancesPolicy:
    instances:
    - t3.small
    onDemandAboveBase: 0
    onDemandBase: 0
    spotAllocationStrategy: lowest-price
    spotInstancePools: 3
  nodeLabels:
    kops.k8s.io/instancegroup: infra-nodes
  role: Node
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1a
  - kops-k8s-eni-1a
  sysctlParameters:
  - net.netfilter.nf_conntrack_max = 1048576
  - net.core.netdev_max_backlog = 30000
  - net.core.rmem_max = 134217728
  - net.core.wmem_max = 134217728
  - net.ipv4.tcp_wmem = 4096 12582912 16777216
  - net.ipv4.tcp_rmem = 4096 12582912 16777216
  - net.ipv4.tcp_mem = 187143 249527 1874286
  - net.ipv4.tcp_max_syn_backlog = 8192
  - net.ipv4.ip_local_port_range = 10240 65535
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:47Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: master-1a
spec:
  image: ami-035f7f826413ac489
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-1a
  role: Master
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1a
  - kops-k8s-eni-1a
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:47Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: master-1b
spec:
  image: ami-035f7f826413ac489
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-1b
  role: Master
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1b
  - kops-k8s-eni-1b
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:48Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: master-1c
spec:
  image: ami-035f7f826413ac489
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-1c
  role: Master
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1c
  - kops-k8s-eni-1c
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:48Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: nodes
spec:
  image: ami-035f7f826413ac489
  machineType: t3.small
  maxSize: 1
  minSize: 1
  mixedInstancesPolicy:
    instances:
    - t3.small
    onDemandAboveBase: 0
    onDemandBase: 0
    spotAllocationStrategy: lowest-price
    spotInstancePools: 3
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1a
  - kops-k8s-eni-1a
  sysctlParameters:
  - net.netfilter.nf_conntrack_max = 1048576
  - net.core.netdev_max_backlog = 30000
  - net.core.rmem_max = 134217728
  - net.core.wmem_max = 134217728
  - net.ipv4.tcp_wmem = 4096 87380 67108864
  - net.ipv4.tcp_rmem = 4096 87380 67108864
  - net.ipv4.tcp_mem = 187143 249527 1874286
  - net.ipv4.tcp_max_syn_backlog = 8192
  - net.ipv4.ip_local_port_range = 10240 65535
8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else do we need to know?
We found a workaround that fixes this issue on a single node.
We noticed that the nodelocaldns interface is in a DOWN state on the nodes (although we observe the same on older kops versions, where node-local-dns works fine).
After executing
ip link set dev nodelocaldns up
the nodelocaldns interface comes up, and in the cilium-agent logs on that node we can see:
time="2024-05-31T09:23:08Z" level=info msg="Node addresses updated" device=nodelocaldns node-addresses="169.254.20.10 (nodelocaldns)" subsys=node-address
After these actions, all pods on this node can access node-local-dns without any problems.
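To apply the same workaround cluster-wide until the underlying issue is fixed, something like the following untested sketch could force the interface up on every node. The name, image, and tolerations are placeholders of ours, not anything kops ships:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nodelocaldns-link-up        # hypothetical name, not part of kops
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: nodelocaldns-link-up
  template:
    metadata:
      labels:
        app: nodelocaldns-link-up
    spec:
      hostNetwork: true             # operate on the host's nodelocaldns interface
      tolerations:
      - operator: Exists            # run on every node, including tainted ones
      containers:
      - name: link-up
        image: busybox:1.36         # any image with the busybox ip applet works
        securityContext:
          capabilities:
            add: ["NET_ADMIN"]      # required to change link state in the host netns
        command:
        - /bin/sh
        - -c
        - |
          # Keep forcing the interface up; it may be recreated when node-local-dns restarts.
          while true; do
            ip link set dev nodelocaldns up 2>/dev/null || true
            sleep 60
          done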
Hi, I have the same issue while upgrading kops from 1.28 to 1.29. This is quite a critical bug/regression in kops 1.29 and it blocks us from upgrading. Are there any known workarounds for this?
/reopen
I wasn't able to repro the issue but I did upgrade Cilium to the latest 1.15 patch version. If you're able to build kops from source, can you build the kops CLI from this branch, run kops update cluster --yes and see if the issue is fixed?
@rifelpet: Reopened this issue.
In response to this:
/reopen
I wasn't able to repro the issue but I did upgrade Cilium to the latest 1.15 patch version. If you're able to build kops from source, can you build the kops CLI from this branch, run kops update cluster --yes and see if the issue is fixed?
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
I recreated the cluster using kops from a branch, but it didn't solve the issue.
I'm not sure whether it's connected, but besides nodelocaldns not working, I also have an experimental IPv6-only cluster with Cilium.
I've tried upgrading it from kops v1.28 to v1.29, but the Cilium endpoints are unreachable on the nodes.
I looked at what changed in the Cilium setup and found that hostNetwork: true was added to both the cilium-operator and the cilium DaemonSet. I suspect this is somehow connected to both issues, but I couldn't pinpoint the exact cause.
Is the kube-dns service created in kube-system?
Hi, yes, the kube-dns service is created in kube-system. Also, there is a Cilium doc on how to configure node-local-dns with Cilium: https://docs.cilium.io/en/v1.10/gettingstarted/local-redirect-policy/#node-local-dns-cache. One interesting part is that node-local-dns must run as a regular pod with hostNetwork: false, which is not the case in the current kops deployment. Also, a CiliumLocalRedirectPolicy must be added. Took this from this issue: https://github.com/cilium/cilium/issues/16906
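For reference, the policy from that guide looks roughly like the sketch below. The k8s-app: node-local-dns selector is an assumption about how the kops-managed DaemonSet is labelled, so verify it against the deployed manifest, and note that the guide also says Local Redirect Policy support has to be enabled in the Cilium agent configuration first:
apiVersion: cilium.io/v2
kind: CiliumLocalRedirectPolicy
metadata:
  name: nodelocaldns
  namespace: kube-system
spec:
  redirectFrontend:
    serviceMatcher:
      serviceName: kube-dns         # redirect traffic addressed to the kube-dns ClusterIP
      namespace: kube-system
  redirectBackend:
    localEndpointSelector:
      matchLabels:
        k8s-app: node-local-dns     # assumed label of the node-local-dns pods
    toPorts:
    - port: "53"
      name: dns
      protocol: UDP
    - port: "53"
      name: dns-tcp
      protocol: TCP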
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Can we re-open this? It seems to be caused by https://github.com/cilium/cilium/pull/35098. To repro this scenario, just enable host BPF routing and tunnel on the cluster. The problem was introduced in Cilium 1.16.5+. cc @rifelpet
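For anyone trying to reproduce: as far as I understand, "host BPF routing and tunnel" corresponds roughly to the following Cilium agent settings. These are upstream cilium-config keys shown for illustration only; the kops spec fields that generate them may be named differently:
routing-mode: tunnel                  # encapsulation/tunnel routing
tunnel-protocol: vxlan
enable-host-legacy-routing: "false"   # i.e. BPF host routing enabled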
/reopen
/remove-lifecycle rotten
This should be fixed by https://github.com/kubernetes/kops/pull/17266
@rifelpet: Reopened this issue.
In response to this:
/reopen
/remove-lifecycle rotten
This should be fixed by https://github.com/kubernetes/kops/pull/17266
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Possibly related: https://docs.cilium.io/en/latest/network/kubernetes/local-redirect-policy/#node-local-dns-cache
I've run into the same issue with a new, from-scratch cluster using kops 1.31.1 and k8s 1.31.8.
The above article clearly states that node-local-dns isn't going to work when it runs in hostNetwork mode. Special steps are required to make it work when Cilium is (as far as I understand) acting as a kube-proxy replacement.
Simply updating Cilium from 1.16.5 to 1.16.9 did not solve it. I will post an update if/when I try the steps from the article linked above.
If you can confirm the changes needed for cilium and NLD to work together, we can update their manifests here:
https://github.com/kubernetes/kops/blob/3aa4bc3c714a3a972c70677f79d75d07c3b6e9c2/upup/models/cloudup/resources/addons/networking.cilium.io/k8s-1.16-v1.15.yaml.template
https://github.com/kubernetes/kops/blob/3aa4bc3c714a3a972c70677f79d75d07c3b6e9c2/upup/models/cloudup/resources/addons/nodelocaldns.addons.k8s.io/k8s-1.12.yaml.template
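Not a confirmed fix, but going by the Cilium guide linked above, the node-local-dns template would need roughly the deltas sketched below (a pod-spec fragment only, not the full template; the flag names come from the guide and from k8s-dns-node-cache, and the placeholder value would have to be filled in from the actual template), plus a CiliumLocalRedirectPolicy like the one sketched earlier in the thread:
spec:
  template:
    spec:
      hostNetwork: false                                # run as a regular pod so Cilium can redirect to it
      containers:
      - name: node-cache
        args:
        - -localip=169.254.20.10,<kube-dns-cluster-ip>  # placeholder for the real ClusterIP
        - -conf=/etc/Corefile
        - -upstreamsvc=kube-dns-upstream
        - -skipteardown=true                            # per the guide: leave the interface and iptables alone
        - -setupinterface=false
        - -setupiptables=false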
I've opened a fix PR for this issue, @rifelpet; it should address the issue with Cilium without changing the behaviour when other CNIs are used.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
This is apparently working fine in kops 1.34.0.
Does Cilium network policy with node-local-dns cache work properly in kops 1.34.0?