IngressRouteUDP doesn't seem to work

Open blanco750 opened this issue 6 months ago • 1 comments

Welcome!

  • [X] Yes, I've searched similar issues on GitHub and didn't find any.
  • [X] Yes, I've searched similar issues on the Traefik community forum and didn't find any.

What did you do?

We have Traefik deployed using Helm and Flux. The Traefik app version is v3.0.2.

Scenario: an edge router needs to sync time with a public NTP service, and we are trying to use a Traefik IngressRouteUDP to route the traffic to a public time server. The path looks like this:

EC2 A (mimicking the edge router for testing) -> NLB (Traefik) -> IngressRouteUDP -> ExternalName -> public time server

[Image: traefikIngressrouteUDP]

The Traefik controllers are registered as healthy with the AWS NLB on all defined ports, including port 123. I am using port 123 in privileged mode so that I don't have to do port mapping for this POC.

What did you see instead?

The EC2 instance (edge router) can't sync its time with the public time server through the IngressRouteUDP using chrony. I tested another scenario where I added a chrony container in between to take a tcpdump capture, but I don't see Traefik's IngressRouteUDP forwarding any NTP (port 123) traffic to the chrony container, except for some broadcast messages related to NTPv1. The path in that test:

EC2 A -> NLB (Traefik) -> IngressRouteUDP -> chrony server container (service) -> public time server
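
To take chrony out of the picture when testing the path, a single NTP request can be sent by hand. The following is a minimal sketch, assuming Python 3 is available on the EC2 instance; the function names are made up for illustration and are not part of chrony or Traefik. It sends one 48-byte client request (version 4, mode 3) and reads the server's transmit timestamp from the reply.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_ntp_request(version: int = 4) -> bytes:
    """Return a 48-byte NTP client request (LI=0, VN=version, Mode=3)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_transmit_time(packet: bytes) -> float:
    """Extract the transmit timestamp (bytes 40..47) as Unix seconds."""
    if len(packet) < 48:
        raise ValueError(f"short NTP packet: {len(packet)} bytes")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def probe(host: str, port: int = 123, timeout: float = 3.0) -> float:
    """Send one request and return the server's transmit time.

    Raises socket.timeout (TimeoutError) if no reply arrives in time.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_ntp_request(), (host, port))
        reply, _ = sock.recvfrom(512)
    return parse_transmit_time(reply)
```

Calling `probe("time.google.com")` directly and then `probe(...)` against the NLB hostname from the same instance would show whether a reply comes back on each path; a timeout on only the second call points at the NLB/Traefik leg rather than at chrony.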

What version of Traefik are you using?

App Version: v3.0.2
Chart Name: traefik
Chart Version: 28.3.0

What is your environment & configuration?

# ingressrouteudp.yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRouteUDP
metadata:
  name: ntp-route
  namespace: platform-ingress
  annotations:
    kubernetes.io/ingress.class: "platform"
spec:
  entryPoints:
    - ntp1
  routes:
    - services:
        - name: external-ntp
          port: 123

ExternalName service

apiVersion: v1
kind: Service
metadata:
  name: external-ntp
  namespace: platform-ingress
  annotations:
    kubernetes.io/ingress.class: "platform"
spec:
  type: ExternalName
  externalName: time.google.com
  ports:
    - port: 123
      protocol: UDP
      targetPort: 123
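
Because the backend is an ExternalName service, the name `time.google.com` has to resolve from inside the cluster. A minimal sketch for checking that from a debug pod, assuming Python 3 is available there (the helper name `resolve_udp_targets` is hypothetical):

```python
import socket


def resolve_udp_targets(host: str, port: int = 123) -> list[str]:
    """Return the sorted, de-duplicated addresses a UDP client would dial."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_DGRAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Running `resolve_udp_targets("time.google.com")` in a pod in the same namespace as Traefik would show whether the name resolves there at all; it says nothing about whether UDP traffic to those addresses is allowed out.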

The base helm-release file is below

---
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: traefik
  namespace: flux-system
spec:
  interval: 30m
  chart:
    spec:
      chart: traefik
      version: "${traefik_chart_version:=v28.3.0}"
      sourceRef:
        kind: HelmRepository
        name: traefik
        namespace: flux-system
      interval: 12h
  releaseName: traefik
  targetNamespace: ${namespace_name:=default}
  install:
    crds: Create
  upgrade:
    crds: CreateReplace
  values:
    tolerations:
    - key: system-component
      operator: Equal
      value: "true"
      effect: NoSchedule

    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: system-component
              operator: In
              values:
              - "true"

    podDisruptionBudget:
      enabled: true
      maxUnavailable: 1

    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 1
        maxUnavailable: 0

    topologySpreadConstraints:
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: '{{ template "traefik.name" . }}'
          app.kubernetes.io/instance: '{{ .Release.Name }}-{{ .Release.Namespace }}'
      topologyKey: topology.kubernetes.io/zone
      maxSkew: 1
      whenUnsatisfiable: ScheduleAnyway
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: '{{ template "traefik.name" . }}'
          app.kubernetes.io/instance: '{{ .Release.Name }}-{{ .Release.Namespace }}'
      topologyKey: kubernetes.io/hostname
      maxSkew: 1
      whenUnsatisfiable: ScheduleAnyway

    deployment:
      replicas: 2

    providers:
      kubernetesIngress:
        ingressClass: platform
        publishedService:
          enabled: true

    # Create an IngressRoute for the dashboard
    ingressRoute:
      dashboard:
        annotations:
          external-dns.alpha.kubernetes.io/target: platform-ingress.${domain_name:=example.com}
        enabled: true
        # Custom match rule with host domain
        matchRule: Host(`traefik.${domain_name:=example.com}`)
        entryPoints: ["websecure"]
        # Add custom middlewares : authentication and redirection
        middlewares:
        - name: traefik-dashboard-auth

    extraObjects:
    - apiVersion: traefik.io/v1alpha1
      kind: Middleware
      metadata:
        name: traefik-dashboard-auth
      spec:
        basicAuth:
          secret: traefik-dashboard-auth-secret

    ingressClass:
      enabled: true
      name: platform
      isDefaultClass: false

    logs:
      access:
        enabled: true
        format: json
        fields:
          headers:
            defaultMode: keep
            names:
              "X-Forwarded-For": keep
              "X-Real-IP": keep

Additional values are under patches/helm-release.yaml

---
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: traefik
  namespace: flux-system
spec:
  values:
    securityContext:
      allowPrivilegeEscalation: true
      capabilities:
        add: [NET_BIND_SERVICE]
      runAsGroup: 0
      runAsNonRoot: false
      runAsUser: 0

    podSecurityContext:
      runAsUser: 0
      runAsNonRoot: false
      runAsGroup: 0
    #######
    ports:
      web:
        redirectTo:
          port: websecure
        proxyProtocol:
          trustedIPs:
            - <ip range >
        forwardedHeaders:
          trustedIPs:
            - <ip range >

      dns:
        protocol: UDP
        port: 53 # External port exposed by NLB
        expose:
          default: true
        exposedPort: 53 # Internal port used by Traefik
        containerPort: 53 # Port inside the container where Traefik listens
      ntp1:
        protocol: UDP
        port: 123 # External port exposed by NLB
        expose:
          default: true
        exposedPort: 123 # Internal port used by Traefik
        containerPort: 123 # Port inside the container where Traefik listens
      tcp1:
        protocol: TCP
        port: 9010 # External port exposed by NLB
        expose:
          default: true
        exposedPort: 9010 # Internal port used by Traefik
        containerPort: 9010 # Port inside the container where Traefik listens
      websecure:
        # Disable TLS termination at Traefik - this is terminated at the NLB
        tls:
          enabled: false

        proxyProtocol:
          trustedIPs:
            - <ip range >
        forwardedHeaders:
          trustedIPs:
            - <ip range >
    deployment:
      additionalContainers:
        - name: tcp-health-check
          image: busybox:latest
          args:
            [
              "sh",
              "-c",
              "while true; do echo -e 'HTTP/1.1 200 OK\r\n\r\n' | nc -l -p 9000; done",
            ]
          ports:
            - containerPort: 9000
              name: tcp-health
              protocol: TCP
          livenessProbe:
            tcpSocket:
              port: 9000
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            tcpSocket:
              port: 9000
            initialDelaySeconds: 5
            periodSeconds: 10
    providers:
      kubernetesCRD:
        # -- Load Kubernetes IngressRoute provider
        enabled: true
        # -- Allows IngressRoute to reference resources in namespace other than theirs
        allowCrossNamespace: true
        # -- Allows to reference ExternalName services in IngressRoute
        allowExternalNameServices: true
        ingressClass: platform
    logs:
      general:
        # -- Set [logs format](https://doc.traefik.io/traefik/observability/logs/#format)
        # @default common
        format:
        # By default, the level is set to INFO.
        # -- Alternative logging levels are DEBUG, PANIC, FATAL, ERROR, WARN, and INFO.
        level: DEBUG
    service:
      annotations:
        external-dns.alpha.kubernetes.io/hostname: platform-ingress.${domain_name:=example.com}

        # service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
        service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
        service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"

        # Use ACM certificate for the NLB - this terminates SSL at the NLB on port 443
        service.beta.kubernetes.io/aws-load-balancer-ssl-cert: ${ingress_ssl_certificate_arn}
        service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"

        # Use NLB with IP target type - this routes traffic directly to the pod IP
        service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip
        service.beta.kubernetes.io/aws-load-balancer-internal: "true"
        service.beta.kubernetes.io/aws-load-balancer-name: ${cluster_name}

        # Enable proxy protocol on the NLB
        service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"

        service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "9000" # Port for health checks
        # service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "TCP"

      externalTrafficPolicy: Local
      loadBalancerSourceRanges:
        - <ip range >
        - <ip range >

If applicable, please paste the log output in DEBUG level

Controller logs. Here are some logs related to the ExternalName service and the IngressRouteUDP:

2024-08-08T09:17:15Z DBG github.com/traefik/traefik/v3/pkg/provider/aggregator/aggregator.go:203 > *crd.Provider provider configuration config={"allowExternalNameServices":true}

2024-08-08T09:17:15Z WRN github.com/traefik/traefik/v3/pkg/provider/kubernetes/crd/kubernetes.go:139 > ExternalName service loading is enabled, please ensure that this is expected (see AllowExternalNameServices option) providerName=kubernetescrd

2024-08-08T21:19:11Z DBG github.com/traefik/traefik/v3/pkg/server/service/udp/service.go:71 > Creating UDP server entryPointName=ntp1 routerName=platform-ingress-ntp-route-0@kubernetescrd serverAddress=time.google.com:123 serverIndex=0 serviceName=platform-ingress-ntp-route-0@kubernetescrd

The controller logs show that it is streaming UDP traffic from the EC2 instance to the public time server, but as mentioned, the time doesn't get synced on the EC2 instance. 172.16.171.179 is the EC2 IP (acting as the edge router).

2024-08-08T21:20:04Z DBG github.com/traefik/traefik/v3/pkg/udp/proxy.go:23 > Handling UDP stream from 172.16.171.179:46870 to time.google.com:123

2024-08-08T21:24:45Z DBG github.com/traefik/traefik/v3/pkg/udp/proxy.go:23 > Handling UDP stream from 172.16.171.179:60278 to time.google.com:123

But the problem is that the EC2 instance can't sync time with the public NTP server, and when another chrony container (chrony server) is placed after Traefik/IngressRouteUDP to take a tcpdump capture, I don't see the expected traffic, only this:

6:40.500535 eth0 In IP 172-16-169-145.traefik.platform-ingress.svc.cluster.local.40195 > chrony-server-95bfb4c69-dwcpt.123: NTPv1, Broadcast, length 148

172-16-169-145 is the Traefik controller.
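
The capture line reports "NTPv1, Broadcast, length 148", whereas a plain NTP client request is 48 bytes with version 3 or 4 and mode 3. The version and mode tcpdump prints come from the first byte of the UDP payload, so it can help to see which byte values decode that way. The sketch below is only an aid for reading a hex dump of the payload (for example from `tcpdump -X`); the helper names are hypothetical. As one possible reading, which nothing in this report confirms, the first byte of the PROXY protocol v2 signature (0x0D) decodes as version 1, broadcast mode, and the service annotations above do enable proxy protocol on the NLB for all ports.

```python
# 12-byte signature that starts every PROXY protocol v2 header.
PROXY_V2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"

NTP_MODES = {
    0: "reserved",
    1: "symmetric active",
    2: "symmetric passive",
    3: "client",
    4: "server",
    5: "broadcast",
    6: "control",
    7: "private",
}


def decode_ntp_first_byte(byte: int) -> tuple[int, int, str]:
    """Split the first NTP header byte into (leap indicator, version, mode)."""
    leap = (byte >> 6) & 0b11
    version = (byte >> 3) & 0b111
    mode = byte & 0b111
    return leap, version, NTP_MODES[mode]


def looks_like_proxy_v2(payload: bytes) -> bool:
    """True if the payload begins with the PROXY protocol v2 signature."""
    return payload.startswith(PROXY_V2_SIGNATURE)
```

Feeding the first bytes of the captured payload to `looks_like_proxy_v2` would confirm or rule out this reading; if it returns False, the 148-byte datagram is something else and this hypothesis can be dropped.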

blanco750, Aug 12 '24 09:08