[Feature] Overriding the cache option in coredns configuration in AKS

Open mblaschke-daimlertruck opened this issue 2 years ago • 30 comments

Is your feature request related to a problem? Please describe. Similar to overriding the forward option, we would like to override the cache option (or the whole default configuration) in AKS.

We would like to extend CoreDNS caching because the Azure DNS resolver sometimes fails and causes service outages. CoreDNS should cache entries longer and also enable prefetch, which seems to be completely disabled; this reduces DNS lookup performance for all managed AKS clusters.
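
A minimal sketch of the kind of setting this request is about, shown as a fragment of the `coredns` ConfigMap. The numbers are assumptions picked for illustration, not tested recommendations; `success`, `denial`, and `prefetch` are options of the CoreDNS `cache` plugin.

```yaml
# Illustrative fragment only; capacities and TTLs are assumed values.
data:
  Corefile: |
    .:53 {
        # other plugins as in the default AKS configuration
        cache {
          # keep up to 9984 positive answers for at most 300 seconds
          success 9984 300
          # keep up to 9984 negative answers for at most 60 seconds
          denial 9984 60
          # refresh names queried at least 10 times with no gap of 1m or more,
          # once the TTL drops below 20% of its original value
          prefetch 10 1m 20%
        }
    }
```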

Current config:

    .:53 {
        errors
        ready
        health {
          lameduck 5s
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import custom/*.override
    }

You cannot override already existing statements through import custom/*.override.
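
To illustrate the limitation, here is a sketch of an override attempt through the `coredns-custom` ConfigMap that AKS mounts as the `custom/` directory. The key name `cache.override` and the value 300 are hypothetical.

```yaml
# Sketch of an override attempt; the key name cache.override is hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  # Keys ending in .override are pulled into the default .:53 block by
  # "import custom/*.override", after the built-in "cache 30" line.
  cache.override: |
    cache 300
```

Because the import only appends to the server block, the result contains two `cache` statements instead of one changed statement, so the built-in `cache 30` is not replaced.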

see also #3232

Describe the solution you'd like Make it possible to change CoreDNS configuration.

Describe alternatives you've considered This is hardcoded; there is no alternative.

mblaschke-daimlertruck avatar May 16 '23 16:05 mblaschke-daimlertruck

I would also vote for this.

sbickmann avatar Jun 27 '23 13:06 sbickmann

I'm using a mutating webhook on the coredns ConfigMap to achieve this. (Validating webhooks on this ConfigMap seem to be blocked by Azure.) This is a last resort, but it works.

The webhook needs to have an annotation admissions.enforcer/disabled: "true".

Example with Kyverno:

    # Namespaced Kyverno policy that patches the coredns ConfigMap in kube-system.
    apiVersion: kyverno.io/v1
    kind: Policy
    metadata:
      name: coredns-mutate
      namespace: kube-system
    spec:
      rules:
      - name: coredns-configmap
        match:
          all:
          - resources:
              kinds:
              - ConfigMap
              names:
              - coredns
        # Overwrite the Corefile; the default (.) block gets the extended cache settings.
        mutate:
          patchStrategicMerge:
            data:
              Corefile: |
                .:53 {
                    errors
                    ready
                    health {
                      lameduck 5s
                    }
                    kubernetes cluster.local in-addr.arpa ip6.arpa {
                      pods insecure
                      fallthrough in-addr.arpa ip6.arpa
                      ttl 30
                    }
                    prometheus :9153
                    forward . /etc/resolv.conf
                    cache {
                      success 9984 3600
                      denial 9984 1800
                    }
                    loop
                    reload
                    loadbalance
                    import custom/*.override
                }
                cluster.local:53 {
                    errors
                    ready
                    health {
                      lameduck 5s
                    }
                    kubernetes cluster.local in-addr.arpa ip6.arpa {
                      pods insecure
                      fallthrough in-addr.arpa ip6.arpa
                      ttl 30
                    }
                    prometheus :9153
                    forward . /etc/resolv.conf
                    cache 30
                    loop
                    reload
                    loadbalance
                }
                import custom/*.server

lennartack avatar Feb 07 '24 11:02 lennartack

.

mblaschke-daimlertruck avatar Dec 27 '24 12:12 mblaschke-daimlertruck

When will this be fixed? I am facing an issue with a RabbitMQ cluster deployment and really need to lower the cache setting to 5-10.
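
For reference, a sketch of what lowering these values could look like in a Corefile that can be edited freely (on AKS this currently means a workaround such as the webhook approach above). The values 5 and 10 are assumptions; `ttl` in the `kubernetes` plugin sets the TTL of cluster-internal records, and `cache` caps how long CoreDNS keeps answers. Whether this resolves the RabbitMQ problem is not established here.

```yaml
# Illustrative fragment only; 5 and 10 are assumed values in seconds.
data:
  Corefile: |
    .:53 {
        # other plugins as in the default AKS configuration
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 5
        }
        forward . /etc/resolv.conf
        cache 10
    }
```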

andreistavarache avatar Jun 05 '25 08:06 andreistavarache