external-dns

Question: Gateway-api target on HTTPRoute

Open Rez0k opened this issue 2 years ago • 7 comments

Hi,

I am trying to create a Cloudflare record using the external-dns.alpha.kubernetes.io/target annotation. I followed this link: https://kubernetes-sigs.github.io/external-dns/v0.14.0/annotations/annotations/, used the annotation on my Gateway, and external-dns works like a charm.

But I have an issue: I want one host to target a different DNS name than the one specified on the Gateway. I tried:

kind: Gateway
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: gateway
  annotations:
    external-dns.alpha.kubernetes.io/target: <default-dns>
spec:
...
---
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: appbackend
  annotations:
    external-dns.alpha.kubernetes.io/target: <different-dns>
spec:
...

But according to https://kubernetes-sigs.github.io/external-dns/v0.14.0/annotations/annotations/, the annotation must be on the Gateway, so it won't work on an HTTPRoute.

How can I achieve my goal with gateway-api, and override one specific host?

And if I can't, how can I exclude this host so that no Cloudflare DNS records are created for specific hostnames?
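One possible workaround, not confirmed anywhere in this thread: since the target annotation is read from the Gateway, a second Gateway carrying the other target could serve as the parent of the one route that needs a different record. The sketch below assumes the Gateway controller accepts a second Gateway; `gateway-alt`, `example-class`, `alt.example.com`, `app.example.com` and `appbackend-svc` are placeholders, not values from the issue.

```yaml
# Sketch only: every name and hostname here is a placeholder.
# The second Gateway carries the alternative target annotation.
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: gateway-alt                 # hypothetical name
  annotations:
    external-dns.alpha.kubernetes.io/target: alt.example.com   # placeholder target
spec:
  gatewayClassName: example-class   # placeholder; use your controller's class
  listeners:
  - name: http
    protocol: HTTP
    port: 80
---
# The route that needs the different record attaches to gateway-alt
# instead of the default Gateway.
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: appbackend
spec:
  parentRefs:
  - name: gateway-alt
  hostnames:
  - app.example.com                 # placeholder hostname
  rules:
  - backendRefs:
    - name: appbackend-svc          # hypothetical Service
      port: 80
```

For the exclusion case, external-dns has an `--exclude-domains` flag; whether it is fine-grained enough for a single hostname in a given setup is an assumption to verify before relying on it.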

Rez0k avatar Nov 19 '23 16:11 Rez0k

When I added - --source=gateway-httproute to the argument list of external-dns:v0.14.0, it crashes:

time="2023-11-22T20:49:42Z" level=fatal msg="failed to sync *v1beta1.Gateway: context deadline exceeded"

I am running Gateway API v1:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute

It works like a champ with just istio-virtualservice, without istio-gateway!

UPDATE: My issue might be an RBAC issue. I only have this:

- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - get
  - watch
  - list

Docs say: https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/gateway-api.md

- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get","watch","list"]
- apiGroups: ["gateway.networking.k8s.io"]
  resources: ["gateways","httproutes","grpcroutes","tlsroutes","tcproutes","udproutes"] 
  verbs: ["get","watch","list"]

So I modified my development environment's ClusterRole for external-dns to add the additional permissions from the tutorial page. Now it works for gateway-httproute!

When I added my prior configuration, - --source=istio-virtualservice, back in so that both sources are present, external-dns crashes, which it did not do before:

time="2023-11-22T21:30:09Z" level=fatal msg="failed to sync *v1alpha3.VirtualService: context deadline exceeded"

After adding these rules per https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/istio.md:

- apiGroups: ["networking.istio.io"]
  resources: ["gateways", "virtualservices"]
  verbs: ["get","watch","list"]

The full configuration is working correctly.
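For reference, the three rule sets quoted in this comment can be collected into a single ClusterRole. This is a sketch assembled from those snippets only; the `metadata.name` is an assumption, and a real deployment may need further rules (for example for Services) that are not quoted here.

```yaml
# Sketch: combines the Ingress, Gateway API and Istio rules quoted above.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-dns              # assumed name
rules:
# Pre-existing rule for the Ingress source.
- apiGroups: ["extensions", "networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get", "watch", "list"]
# From the Gateway API tutorial: Namespaces plus the route kinds.
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["gateway.networking.k8s.io"]
  resources: ["gateways", "httproutes", "grpcroutes", "tlsroutes", "tcproutes", "udproutes"]
  verbs: ["get", "watch", "list"]
# From the Istio tutorial: needed for --source=istio-virtualservice.
- apiGroups: ["networking.istio.io"]
  resources: ["gateways", "virtualservices"]
  verbs: ["get", "watch", "list"]
```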

Will these changes be in the ClusterRole in v0.15.0?

wiceywkus avatar Nov 22 '23 20:11 wiceywkus

I am facing a similar issue but have not been able to resolve it yet. Using external-dns with a standard Ingress works without issues, but switching to Gateway API produces the same error as described above.

I have provisioned a Gateway and an HTTPRoute. My CNI is Cilium. I am not 100% sure which API version it is, but I used v1beta1 before, and when I saw that the previous poster used v1 I changed my files accordingly:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: main-gateway
  labels:
    specialPurposeService: "true"
    mainGateway: "true"
spec:
  gatewayClassName: cilium
  listeners:
  - protocol: HTTP
    port: 80
    name: main-gateway
    allowedRoutes:
      namespaces:
        from: All
  - protocol: TLS
    tls:
      mode: Terminate
      certificateRefs:
      - kind: secret
        name: wildcard-tls
    port: 443
    name: main-gateway-tls
    allowedRoutes:
      namespaces:
        from: All
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: cyberchef-https-route
spec:
  parentRefs:
  - name: main-gateway
  hostnames:
  - "gw-cyberchef.k8s.home.xxxx.xxxx"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: cyberchef-svc
      port: 80

and I have configured my external-dns as follows:

apiVersion: v1
kind: Namespace
metadata:
  name: external-dns
  labels:
    name: external-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-dns
  namespace: external-dns
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  - nodes
  - namespaces
  verbs:
  - get
  - watch
  - list
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - gateway.networking.k8s.io
  resources:
  - httproute
  - gateway
  verbs:
  - get
  - list
  - watch
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  namespace: external-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
  namespace: external-dns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns
  namespace: external-dns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: external-dns
spec:
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
      - name: external-dns
        image: registry.k8s.io/external-dns/external-dns:v0.14.0
        args:
        - --registry=txt
        - --txt-prefix=external-dns-
        - --txt-owner-id=k8s
        - --provider=rfc2136
        - --rfc2136-host=192.168.0.30
        - --rfc2136-port=53
        - --rfc2136-zone=k8s.home.xxxx.xxxxx
        - --rfc2136-tsig-axfr
        - --rfc2136-insecure
        - --source=gateway-httproute
        - --domain-filter=k8s.home.xxxx.xxxxxx

Launching external-dns I get:

time="2023-11-26T14:28:05Z" level=info msg="config: {APIServerURL: KubeConfig: RequestTimeout:30s DefaultTargets:[] GlooNamespaces:[gloo-system] SkipperRouteGroupVersion:zalando.org/v1 Sources:[gateway-httproute] Namespace: AnnotationFilter: LabelFilter: IngressClassNames:[] FQDNTemplate: CombineFQDNAndAnnotation:false IgnoreHostnameAnnotation:false IgnoreIngressTLSSpec:false IgnoreIngressRulesSpec:false GatewayNamespace: GatewayLabelFilter: Compatibility: PublishInternal:false PublishHostIP:false AlwaysPublishNotReadyAddresses:false ConnectorSourceServer:localhost:8080 Provider:rfc2136 GoogleProject: GoogleBatchChangeSize:1000 GoogleBatchChangeInterval:1s GoogleZoneVisibility: DomainFilter:[k8s.home.xxxx.xxxxx] ExcludeDomains:[] RegexDomainFilter: RegexDomainExclusion: ZoneNameFilter:[] ZoneIDFilter:[] TargetNetFilter:[] ExcludeTargetNets:[] AlibabaCloudConfigFile:/etc/kubernetes/alibaba-cloud.json AlibabaCloudZoneType: AWSZoneType: AWSZoneTagFilter:[] AWSAssumeRole: AWSAssumeRoleExternalID: AWSBatchChangeSize:1000 AWSBatchChangeInterval:1s AWSEvaluateTargetHealth:true AWSAPIRetries:3 AWSPreferCNAME:false AWSZoneCacheDuration:0s AWSSDServiceCleanup:false AWSDynamoDBRegion: AWSDynamoDBTable:external-dns AzureConfigFile:/etc/kubernetes/azure.json AzureResourceGroup: AzureSubscriptionID: AzureUserAssignedIdentityClientID: BluecatDNSConfiguration: BluecatConfigFile:/etc/kubernetes/bluecat.json BluecatDNSView: BluecatGatewayHost: BluecatRootZone: BluecatDNSServerName: BluecatDNSDeployType:no-deploy BluecatSkipTLSVerify:false CloudflareProxied:false CloudflareDNSRecordsPerPage:100 CoreDNSPrefix:/skydns/ RcodezeroTXTEncrypt:false AkamaiServiceConsumerDomain: AkamaiClientToken: AkamaiClientSecret: AkamaiAccessToken: AkamaiEdgercPath: AkamaiEdgercSection: InfobloxGridHost: InfobloxWapiPort:443 InfobloxWapiUsername:admin InfobloxWapiPassword: InfobloxWapiVersion:2.3.1 InfobloxSSLVerify:true InfobloxView: InfobloxMaxResults:0 InfobloxFQDNRegEx: InfobloxNameRegEx: 
InfobloxCreatePTR:false InfobloxCacheDuration:0 DynCustomerName: DynUsername: DynPassword: DynMinTTLSeconds:0 OCIConfigFile:/etc/kubernetes/oci.yaml OCICompartmentOCID: OCIAuthInstancePrincipal:false InMemoryZones:[] OVHEndpoint:ovh-eu OVHApiRateLimit:20 PDNSServer:http://localhost:8081 PDNSAPIKey: PDNSSkipTLSVerify:false TLSCA: TLSClientCert: TLSClientCertKey: Policy:sync Registry:txt TXTOwnerID:k8s TXTPrefix:external-dns- TXTSuffix: TXTEncryptEnabled:false TXTEncryptAESKey: Interval:1m0s MinEventSyncInterval:5s Once:false DryRun:false UpdateEvents:false LogFormat:text MetricsAddress::7979 LogLevel:info TXTCacheInterval:0s TXTWildcardReplacement: ExoscaleEndpoint: ExoscaleAPIKey: ExoscaleAPISecret: ExoscaleAPIEnvironment:api ExoscaleAPIZone:ch-gva-2 CRDSourceAPIVersion:externaldns.k8s.io/v1alpha1 CRDSourceKind:DNSEndpoint ServiceTypeFilter:[] CFAPIEndpoint: CFUsername: CFPassword: ResolveServiceLoadBalancerHostname:false RFC2136Host:192.168.0.30 RFC2136Port:53 RFC2136Zone:k8s.home.xxxx.xxxxx RFC2136Insecure:true RFC2136GSSTSIG:false RFC2136KerberosRealm: RFC2136KerberosUsername: RFC2136KerberosPassword: RFC2136TSIGKeyName: RFC2136TSIGSecret: RFC2136TSIGSecretAlg: RFC2136TAXFR:true RFC2136MinTTL:0s RFC2136BatchChangeSize:50 NS1Endpoint: NS1IgnoreSSL:false NS1MinTTLSeconds:0 TransIPAccountName: TransIPPrivateKeyFile: DigitalOceanAPIPageSize:50 ManagedDNSRecordTypes:[A AAAA CNAME] ExcludeDNSRecordTypes:[] GoDaddyAPIKey: GoDaddySecretKey: GoDaddyTTL:0 GoDaddyOTE:false OCPRouterName: IBMCloudProxied:false IBMCloudConfigFile:/etc/kubernetes/ibmcloud.json TencentCloudConfigFile:/etc/kubernetes/tencent-cloud.json TencentCloudZoneType: PiholeServer: PiholePassword: PiholeTLSInsecureSkipVerify:false PluralCluster: PluralProvider: WebhookProviderURL:http://localhost:8888 WebhookProviderReadTimeout:5s WebhookProviderWriteTimeout:10s WebhookServer:false}"
time="2023-11-26T14:28:05Z" level=info msg="Using inCluster-config based on serviceaccount-token"
time="2023-11-26T14:28:05Z" level=info msg="Created GatewayAPI client https://10.96.0.1:443"
time="2023-11-26T14:28:05Z" level=info msg="Instantiating new Kubernetes client"
time="2023-11-26T14:28:05Z" level=info msg="Using inCluster-config based on serviceaccount-token"
time="2023-11-26T14:28:05Z" level=info msg="Created Kubernetes client https://10.96.0.1:443"
time="2023-11-26T14:29:05Z" level=fatal msg="failed to sync *v1beta1.Gateway: context deadline exceeded"

As far as I understand external-dns seems to be expecting a v1beta1 API? I got the same error before my changes to v1. What could be the reason it cannot query my Gateway and HTTPRoute resources? How can I further troubleshoot this? Thanks!

OevreFlataeker avatar Nov 26 '23 14:11 OevreFlataeker

Here are the permissions:

kubectl auth can-i --list --as=system:serviceaccount:external-dns:external-dns
Resources                                       Non-Resource URLs                      Resource Names   Verbs
selfsubjectreviews.authentication.k8s.io        []                                     []               [create]
selfsubjectaccessreviews.authorization.k8s.io   []                                     []               [create]
selfsubjectrulesreviews.authorization.k8s.io    []                                     []               [create]
ingresses.extensions                            []                                     []               [get list watch]
gateway.gateway.networking.k8s.io               []                                     []               [get list watch]
httproute.gateway.networking.k8s.io             []                                     []               [get list watch]
ingresses.networking.k8s.io                     []                                     []               [get list watch]
endpoints                                       []                                     []               [get watch list]
namespaces                                      []                                     []               [get watch list]
nodes                                           []                                     []               [get watch list]
pods                                            []                                     []               [get watch list]
services                                        []                                     []               [get watch list]
                                                [/.well-known/openid-configuration/]   []               [get]
                                                [/.well-known/openid-configuration]    []               [get]
                                                [/api/*]                               []               [get]
                                                [/api]                                 []               [get]
                                                [/apis/*]                              []               [get]
                                                [/apis]                                []               [get]
                                                [/healthz]                             []               [get]
                                                [/healthz]                             []               [get]
                                                [/livez]                               []               [get]
                                                [/livez]                               []               [get]
                                                [/openapi/*]                           []               [get]
                                                [/openapi]                             []               [get]
                                                [/openid/v1/jwks/]                     []               [get]
                                                [/openid/v1/jwks]                      []               [get]
                                                [/readyz]                              []               [get]
                                                [/readyz]                              []               [get]
                                                [/version/]                            []               [get]
                                                [/version/]                            []               [get]
                                                [/version]                             []               [get]
                                                [/version]                             []               [get]

OevreFlataeker avatar Nov 26 '23 14:11 OevreFlataeker

failed to sync *v1beta1.Gateway: context deadline exceeded

I fixed this with my ClusterRole changes. I think the timeout you are seeing is actually the Kubernetes API request failing, with the client side (External DNS) misreporting the request failure as a timeout. The API version in the message may just be External DNS guessing wrong about the available apiVersions. See https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/gateway-api.md. Notice that the Gateway API tutorial page also includes the rule to get/watch/list Namespaces.
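One way to test the RBAC theory directly is to ask the API server whether the service account may list the exact resource that external-dns is trying to sync. The manifest below is a minimal sketch, assuming the service account `external-dns` in namespace `external-dns` as used in the configuration above; submitting it with `kubectl create -f <file> -o yaml` returns the object with a `status.allowed` field.

```yaml
# Sketch: asks whether the external-dns service account may list
# Gateways cluster-wide. RBAC matches resource names literally, so the
# plural form "gateways" is what must be granted in the ClusterRole.
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: system:serviceaccount:external-dns:external-dns
  resourceAttributes:
    group: gateway.networking.k8s.io
    resource: gateways
    verb: list
```

Repeating the check with `resource: httproutes` and with `resource: namespaces` (empty `group`) covers the other resources named in the tutorial.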

wiceywkus avatar Nov 27 '23 14:11 wiceywkus

Well, are my ClusterRole permissions not correct? I also set the permissions for the namespace resource. Do you see anything missing? I also assumed it might be about missing permissions, but wouldn't I then see a 401 somewhere? The Kubernetes audit logs should show this, right? I had a look at the external-dns sources as well to see if I could narrow down the cause of the error, but I have to admit Go is not a language I am very fluent in ;-)

OevreFlataeker avatar Nov 27 '23 14:11 OevreFlataeker

Oh!!! My bad!!!! I checked my config again and only then noticed that I had the singular variant of the resource in my yaml and not the plural version:

gateway <-> gateways, httproute <-> httproutes

I can confirm it started working right away after that change! Tested with K8S 1.28, Cilium 1.14 and MS Windows 2022 DNS

apiVersion: v1
kind: Namespace
metadata:
  name: external-dns
  labels:
    name: external-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-dns
  namespace: external-dns
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  - nodes
  - namespaces
  verbs:
  - get
  - watch
  - list
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - gateway.networking.k8s.io
  resources:
  - httproutes
  - gateways
  - grpcroutes
  - tlsroutes
  - tcproutes
  - udproutes
  verbs:
  - get
  - list
  - watch
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  namespace: external-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
  namespace: external-dns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns
  namespace: external-dns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: external-dns
spec:
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
      - name: external-dns
        image: registry.k8s.io/external-dns/external-dns:v0.14.0
        args:
        - --registry=txt
        - --txt-prefix=external-dns-
        - --txt-owner-id=k8s
        - --provider=rfc2136
        - --rfc2136-host=192.168.0.30
        - --rfc2136-port=53
        - --rfc2136-zone=k8s.home.xxx.xxx
        - --rfc2136-tsig-axfr
        - --rfc2136-insecure
        - --source=gateway-httproute
        - --domain-filter=k8s.home.xxx.xxx

OevreFlataeker avatar Nov 27 '23 15:11 OevreFlataeker

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Feb 25 '24 15:02 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar Mar 26 '24 16:03 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Apr 25 '24 17:04 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Apr 25 '24 17:04 k8s-ci-robot