DHCP replies are sent from the primary IP of the interface

Open cmdrrobin opened this issue 1 year ago • 2 comments

When smee and the stack are configured to use a secondary IP address for load balancing, reply traffic is sent from the primary IP rather than from the secondary (load-balancing) IP.

Expected Behaviour

I would expect kube-vip to send the reply traffic from the secondary IP:

1719998545.533120 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 382: 10.128.112.161.67 > 10.128.161.133.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 340
1719998545.534029 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 406: 10.128.161.133.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 364
1719998545.534114 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 442: 10.128.161.133.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 400
1719998545.534205 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 424: 10.128.161.133.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 382

Current Behaviour

1719998545.533120 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 382: 10.128.112.161.67 > 10.128.161.133.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 340
1719998545.534029 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 406: 10.128.161.132.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 364
1719998545.534114 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 442: 10.128.161.132.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 400
1719998545.534205 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 424: 10.128.161.132.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 382

Possible Solution

I have no idea

Steps to Reproduce (for bugs)

  1. Deploy Tinkerbell

    trusted_proxies=$(kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' | tr ' ' ',')
    LB_IP=10.128.161.133
    helm install tink-stack charts/tinkerbell/stack --create-namespace --namespace tink-system --wait --set "smee.trustedProxies={${trusted_proxies}}" --set "hegel.trustedProxies={${trusted_proxies}}" --set "stack.loadBalancerIP=$LB_IP" --set "smee.publicIP=$LB_IP"

  2. Request DHCP from a node

  3. Watch the traffic

Context

Cannot use Tinkerbell service

Your Environment

  • Operating System and version (e.g. Linux, Windows, MacOS): Ubuntu 22.04.4 LTS with K3s version 1.30.0+k3s1

  • How are you running Tinkerbell? Using Vagrant & VirtualBox, Vagrant & Libvirt, on Packet using Terraform, or give details: KVM

  • Link to your project or a code example to reproduce issue: K3s is deployed with default settings

cmdrrobin • Jul 03 '24 09:07

It looks like this issue is only with UDP traffic, not TCP.

cmdrrobin • Jul 05 '24 07:07

Hey @rgruyters, thanks for posting this. Mind clarifying a bit? What are the IPs involved? Also, I'm not understanding the effect this is having. Mind expanding on "Cannot use Tinkerbell service"?

IP              Description
10.128.112.161  DHCP client?
10.128.161.133  ?
10.128.161.132  ?

jacobweinstock • Jul 09 '24 21:07

Sure!

IP              Description
10.128.112.161  DHCP client
10.128.161.133  Secondary IP for the LoadBalancer to use with Tinkerbell
10.128.161.132  Host IP (where Kubernetes is running)

cmdrrobin • Aug 23 '24 15:08

It is normal Kubernetes behavior for traffic originating from within a pod to be sent out via the host's IP. As DHCP traffic is UDP and connectionless, all DHCP packets sent by Smee can be classified as originating from within the Smee pod. Furthermore, kube-vip doesn't create routing rules. If you look at the interface that has the IP configured by kube-vip, you'll see that it creates the IP with a /32. This means this IP will not be used for routing when the host's routing table is used.
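
A minimal sketch for checking this on the Kubernetes host, assuming iproute2 is installed. `route_src` is a hypothetical helper (not part of kube-vip or Tinkerbell) that pulls the `src` field out of `ip route get` output, i.e. the address the host's routing table prefers for locally originated traffic to a given destination. Masqueraded pod egress typically ends up with the same address, but that depends on how the cluster's CNI is set up.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `ip route get` output on stdin and print the
# value that follows the "src" keyword (the preferred source address).
route_src() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "src") { print $(i + 1); exit }
    }
  }'
}

# Usage on the host (10.128.112.161 is the relay address from this issue):
#   ip -4 -o addr show                       # the kube-vip address should be listed with /32
#   ip route get 10.128.112.161 | route_src  # address the routing table would pick
```

If the second command prints the host IP rather than the load balancer IP, that matches the behavior described above.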

Is this traffic pattern causing issues of some kind?

jacobweinstock • Sep 02 '24 16:09

Yeah, when DHCP traffic is passed through a relay address, in this case 10.128.112.161, it won't work, because the reply traffic comes from a different IP, 10.128.161.132, rather than the expected 10.128.161.133.

cmdrrobin • Sep 06 '24 18:09

Yeah, when DHCP traffic is passed through a relay address, in this case 10.128.112.161, it won't work, because the reply traffic comes from a different IP, 10.128.161.132, rather than the expected 10.128.161.133.

Hey @rgruyters, what do you mean by "it won't work"? What exactly isn't working? Is there a DHCP relay in use in your environment?

jacobweinstock • Sep 12 '24 00:09

Yes, we use DHCP relays to pass DHCP requests to our Tinkerbell service.

It won't work in the sense that the reply traffic from 10.128.161.132 is not accepted by the relay process, because the initial traffic was sent to .133.
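
One way to confirm the mismatch from a capture like the one under Current Behaviour is to list every DHCP reply whose source address differs from the address the relay sent the request to. The function below is a hypothetical sketch, not a Tinkerbell or tcpdump feature, and it assumes tcpdump text output in the same format as the capture in this issue.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read tcpdump text lines on stdin and print each distinct
# source IP of a BOOTP/DHCP reply that is not the expected server IP ($1).
# Exit status is 1 if any unexpected source was seen, 0 otherwise.
unexpected_reply_sources() {
  awk -v want="$1" '
    {
      for (i = 4; i < NF; i++) {
        if ($i == "BOOTP/DHCP," && $(i + 1) == "Reply,") {
          src = $(i - 3)             # "a.b.c.d.port", three fields before the marker
          sub(/\.[0-9]+$/, "", src)  # strip the trailing ".port"
          if (src != want) {
            bad = 1
            if (!(src in seen)) { seen[src] = 1; print src }
          }
        }
      }
    }
    END { exit bad + 0 }
  '
}

# Usage (capture.txt is a hypothetical file holding the tcpdump text output):
#   unexpected_reply_sources 10.128.161.133 < capture.txt
```

Run against the Current Behaviour capture with 10.128.161.133 as the expected server, this reports 10.128.161.132; run against the Expected Behaviour capture, it reports nothing.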

cmdrrobin • Sep 12 '24 05:09

Mind sharing more info about the DHCP relay you're using? I'm not familiar with this kind of IP filtering. Also, have you tried deploying the stack with stack.relay.presentGiaddrAction: forward?

jacobweinstock • Sep 12 '24 15:09

Sorry for the late response. We use Cumulus switches with DHCP relay running on them. I think they use the ISC DHCP service.

Also, have you tried deploying the stack with stack.relay.presentGiaddrAction: forward?

No I haven't. Will look into it. Thanks!

cmdrrobin • Sep 20 '24 06:09

Hey @rgruyters, thanks for sharing some details on your switches. I see you closed the issue. Was this on purpose? Maybe you were able to resolve the issue?

jacobweinstock • Sep 23 '24 20:09

I closed it because setting stack.relay.presentGiaddrAction: forward would work for us. (For dhcrelay, the equivalent would be the -m forward option.)
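
For anyone landing here with the same setup, a sketch of how that value could be applied to an existing install. It assumes the release name, chart path and namespace from the reproduction steps above, and uses `--reuse-values` so the previously set values are kept; adjust these to your own deployment.

```shell
# Assumption: release "tink-stack" in namespace "tink-system", installed from
# charts/tinkerbell/stack as in the reproduction steps of this issue.
helm upgrade tink-stack charts/tinkerbell/stack \
  --namespace tink-system \
  --reuse-values \
  --set "stack.relay.presentGiaddrAction=forward"
```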

cmdrrobin • Sep 26 '24 13:09

Thanks for the update. Glad to hear that works.

jacobweinstock • Sep 26 '24 15:09