
Kilo ignores --port argument?

Open absolutejam opened this issue 1 year ago • 3 comments

Hey!

I'm looking to use Kilo to connect a couple of clusters, and I'm currently testing it in a Proxmox environment running Talos Linux.

I've deployed it to two separate clusters, but on both I see the error "address already in use" in the logs.

I've also tried exec'ing into a node debug container and bringing up the kilo0 interface by hand, and I get the same error (as I expected, since the error in the logs points to netlink doing the same thing).

Originally I thought this was because the nodes were all seemingly getting the same IP address in the Kilo subnet, but I've altered the DaemonSet so that it just schedules onto a single node now.

# ip addr | grep kilo0
38: kilo0: <POINTOPOINT,NOARP> mtu 1420 qdisc noop state DOWN qlen 1000 
    inet 172.35.254.1/24 brd 172.35.254.255 scope global kilo0
# wg show (excerpt)
interface: kilo0
  public key: iSbFLicHr0iTRs3rr/SbTRArGrkr2gnNJYuVd0AiIyI=
  private key: (hidden)
  listening port: 51820

peer: FIjoCB8sdPnsC3RI8tDBcRFucKsPF+3IX3J27h3OzHQ=
  endpoint: 192.168.0.111:51820
  allowed ips: 10.1.2.21/32, 192.168.0.0/24, 10.1.2.0/24, 10.102.0.0/17, 10.102.128.0/17, 172.35.254.0/24

For reference, I'm running Talos Linux (with KubeSpan enabled) and Cilium.

Node public IP range (both clusters): 192.168.0.1/24
This cluster's internal IP range: 10.1.5.0/24
This cluster's Pod range: 10.105.128.0/17
This cluster's Service range: 10.105.0.0/17

Here's how I'm running Kilo in the DaemonSet:

  - args:
    - --kubeconfig=/var/kubeconfig/kubeconfig
    - --hostname=$(NODE_NAME)
    - --cni=false
    - --compatibility=cilium
    - --local=false
    - --encapsulate=crosssubnet
    - --clean-up-interface=true
    - --subnet=172.35.254.0/24
    - --log-level=all
    env:
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    image: squat/kilo:0.6.0

Here's a larger chunk of the logs from Kilo:

kilo {"caller":"mesh.go:299","component":"kilo","event":"update","level":"debug","msg":"syncing nodes","ts":"2024-12-11T09:21:49.439356661Z"}
kilo {"caller":"mesh.go:301","component":"kilo","event":"update","level":"debug","msg":"processing local node","node":{"Endpoint":{},"Key":[137,38,197,46,39,7,175,72,147,70,205,235,175,244,155,77,16,43,26,185,43,218,9,205,37,139,149,119,64,34,35,34],"NoInternalIP":false,"InternalIP":{"IP":"192.168.0.111","Mask":"////AA=="},"LastSeen":1733908909,"Leader":false,"Location":"mgt","Name":"mgt-app-01","PersistentKeepalive":0,"Subnet":{"IP":"10.105.130.0","Mask":"////AA=="},"WireGuardIP":{"IP":"172.35.254.1","Mask":"////AA=="},"DiscoveredEndpoints":{},"AllowedLocationIPs":null,"Granularity":"location"},"ts":"2024-12-11T09:21:49.439434741Z"}
kilo {"caller":"mesh.go:299","component":"kilo","event":"update","level":"debug","msg":"syncing nodes","ts":"2024-12-11T09:21:49.664372525Z"}
kilo {"caller":"mesh.go:301","component":"kilo","event":"update","level":"debug","msg":"processing local node","node":{"Endpoint":{},"Key":[137,38,197,46,39,7,175,72,147,70,205,235,175,244,155,77,16,43,26,185,43,218,9,205,37,139,149,119,64,34,35,34],"NoInternalIP":false,"InternalIP":{"IP":"192.168.0.111","Mask":"////AA=="},"LastSeen":1733908909,"Leader":false,"Location":"mgt","Name":"mgt-app-01","PersistentKeepalive":0,"Subnet":{"IP":"10.105.130.0","Mask":"////AA=="},"WireGuardIP":{"IP":"172.35.254.1","Mask":"////AA=="},"DiscoveredEndpoints":{},"AllowedLocationIPs":null,"Granularity":"location"},"ts":"2024-12-11T09:21:49.66530894Z"}
kilo {"caller":"mesh.go:299","component":"kilo","event":"update","level":"debug","msg":"syncing nodes","ts":"2024-12-11T09:21:51.248264788Z"}
kilo {"caller":"mesh.go:310","component":"kilo","event":"update","in-mesh":true,"level":"debug","msg":"received non ready node","node":{"Endpoint":null,"Key":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"mgt","Name":"mgt-app-03","PersistentKeepalive":0,"Subnet":{"IP":"10.105.128.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2024-12-11T09:21:51.248476592Z"}
kilo {"caller":"mesh.go:328","component":"kilo","event":"update","level":"info","node":{"Endpoint":null,"Key":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"mgt","Name":"mgt-app-03","PersistentKeepalive":0,"Subnet":{"IP":"10.105.128.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2024-12-11T09:21:51.248860346Z"}
kilo {"DiscoveredEndpoints":{},"caller":"mesh.go:830","component":"kilo","level":"debug","msg":"Discovered WireGuard NAT Endpoints","ts":"2024-12-11T09:21:51.24960647Z"}
kilo {"allowedIPs":[{"IP":"10.105.130.0","Mask":"////AA=="},{"IP":"192.168.0.111","Mask":"/////w=="}],"allowedLocationIPs":null,"caller":"topology.go:196","cidrs":[{"IP":"10.105.130.0","Mask":"////AA=="}],"component":"kilo","endpoint":"192.168.0.111:51820","hostnames":["mgt-app-01"],"leader":0,"level":"debug","location":"location:mgt","msg":"generated segment","privateIPs":["192.168.0.111"],"ts":"2024-12-11T09:21:51.249831188Z"}
kilo {"caller":"topology.go:244","component":"kilo","hostname":"mgt-app-01","leader":true,"level":"debug","location":"location:mgt","msg":"generated topology","privateIP":"192.168.0.111/24","subnet":"10.105.130.0/24","ts":"2024-12-11T09:21:51.250158391Z","wireGuardIP":"172.35.254.1/24"}
kilo {"caller":"mesh.go:575","component":"kilo","error":"address already in use","level":"error","ts":"2024-12-11T09:21:51.256121968Z"}
kilo {"DiscoveredEndpoints":{},"caller":"mesh.go:830","component":"kilo","level":"debug","msg":"Discovered WireGuard NAT Endpoints","ts":"2024-12-11T09:22:19.336998644Z"}
kilo {"allowedIPs":[{"IP":"10.105.130.0","Mask":"////AA=="},{"IP":"192.168.0.111","Mask":"/////w=="}],"allowedLocationIPs":null,"caller":"topology.go:196","cidrs":[{"IP":"10.105.130.0","Mask":"////AA=="}],"component":"kilo","endpoint":"192.168.0.111:51820","hostnames":["mgt-app-01"],"leader":0,"level":"debug","location":"location:mgt","msg":"generated segment","privateIPs":["192.168.0.111"],"ts":"2024-12-11T09:22:19.337466339Z"}
kilo {"caller":"topology.go:244","component":"kilo","hostname":"mgt-app-01","leader":true,"level":"debug","location":"location:mgt","msg":"generated topology","privateIP":"192.168.0.111/24","subnet":"10.105.130.0/24","ts":"2024-12-11T09:22:19.340805821Z","wireGuardIP":"172.35.254.1/24"}
kilo {"caller":"mesh.go:575","component":"kilo","error":"address already in use","level":"error","ts":"2024-12-11T09:22:19.385963133Z"}
kilo {"caller":"mesh.go:387","component":"kilo","level":"debug","msg":"successfully checked in local node in backend","ts":"2024-12-11T09:22:19.509099938Z"}
kilo {"caller":"mesh.go:299","component":"kilo","event":"update","level":"debug","msg":"syncing nodes","ts":"2024-12-11T09:22:19.512658629Z"}
kilo {"caller":"mesh.go:301","component":"kilo","event":"update","level":"debug","msg":"processing local node","node":{"Endpoint":{},"Key":[137,38,197,46,39,7,175,72,147,70,205,235,175,244,155,77,16,43,26,185,43,218,9,205,37,139,149,119,64,34,35,34],"NoInternalIP":false,"InternalIP":{"IP":"192.168.0.111","Mask":"////AA=="},"LastSeen":1733908939,"Leader":false,"Location":"mgt","Name":"mgt-app-01","PersistentKeepalive":0,"Subnet":{"IP":"10.105.130.0","Mask":"////AA=="},"WireGuardIP":{"IP":"172.35.254.1","Mask":"////AA=="},"DiscoveredEndpoints":{},"AllowedLocationIPs":null,"Granularity":"location"},"ts":"2024-12-11T09:22:19.513267564Z"}
kilo {"caller":"mesh.go:299","component":"kilo","event":"update","level":"debug","msg":"syncing nodes","ts":"2024-12-11T09:22:22.774649619Z"}
kilo {"caller":"mesh.go:310","component":"kilo","event":"update","in-mesh":true,"level":"debug","msg":"received non ready node","node":{"Endpoint":null,"Key":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"mgt","Name":"mgt-ctl-01","PersistentKeepalive":0,"Subnet":{"IP":"10.105.131.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2024-12-11T09:22:22.774741068Z"}
kilo {"caller":"mesh.go:328","component":"kilo","event":"update","level":"info","node":{"Endpoint":null,"Key":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"mgt","Name":"mgt-ctl-01","PersistentKeepalive":0,"Subnet":{"IP":"10.105.131.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2024-12-11T09:22:22.774813968Z"}
kilo {"DiscoveredEndpoints":{},"caller":"mesh.go:830","component":"kilo","level":"debug","msg":"Discovered WireGuard NAT Endpoints","ts":"2024-12-11T09:22:22.775455081Z"}
kilo {"allowedIPs":[{"IP":"10.105.130.0","Mask":"////AA=="},{"IP":"192.168.0.111","Mask":"/////w=="}],"allowedLocationIPs":null,"caller":"topology.go:196","cidrs":[{"IP":"10.105.130.0","Mask":"////AA=="}],"component":"kilo","endpoint":"192.168.0.111:51820","hostnames":["mgt-app-01"],"leader":0,"level":"debug","location":"location:mgt","msg":"generated segment","privateIPs":["192.168.0.111"],"ts":"2024-12-11T09:22:22.779806727Z"}
kilo {"caller":"topology.go:244","component":"kilo","hostname":"mgt-app-01","leader":true,"level":"debug","location":"location:mgt","msg":"generated topology","privateIP":"192.168.0.111/24","subnet":"10.105.130.0/24","ts":"2024-12-11T09:22:22.779979362Z","wireGuardIP":"172.35.254.1/24"}
kilo {"caller":"mesh.go:575","component":"kilo","error":"address already in use","level":"error","ts":"2024-12-11T09:22:22.794821067Z"}
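
The node objects in these debug lines are easier to compare with wg output once decoded. Below is a minimal Python sketch, assuming the usual Go JSON conventions apply here: a fixed-size key array is emitted as a list of integers, and a byte-slice netmask is emitted as base64. The helper names decode_key and decode_cidr are made up for illustration and are not part of Kilo.

```python
import base64
import json


def decode_key(key_bytes):
    """Turn the 32-integer "Key" array into the base64 form that wg prints."""
    return base64.b64encode(bytes(key_bytes)).decode("ascii")


def decode_cidr(ipnet):
    """Turn {"IP": ..., "Mask": <base64 bytes>} into CIDR notation."""
    mask = base64.b64decode(ipnet["Mask"])
    # Count the set bits in the netmask to get the prefix length.
    prefix_len = sum(bin(byte).count("1") for byte in mask)
    return f'{ipnet["IP"]}/{prefix_len}'


# Fields copied from the "processing local node" entry in the logs above.
node = json.loads(
    '{"Key":[137,38,197,46,39,7,175,72,147,70,205,235,175,244,155,77,'
    '16,43,26,185,43,218,9,205,37,139,149,119,64,34,35,34],'
    '"InternalIP":{"IP":"192.168.0.111","Mask":"////AA=="},'
    '"WireGuardIP":{"IP":"172.35.254.1","Mask":"////AA=="}}'
)

print(decode_key(node["Key"]))           # iSbFLicHr0iTRs3rr/SbTRArGrkr2gnNJYuVd0AiIyI=
print(decode_cidr(node["InternalIP"]))   # 192.168.0.111/24
print(decode_cidr(node["WireGuardIP"]))  # 172.35.254.1/24
```

The decoded key is the same public key that wg show reports for kilo0, and both /24 addresses match the "generated topology" log line.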

absolutejam, Dec 11 '24 09:12

So I think this is actually due to both Kilo and KubeSpan trying to use port 51820.

I've set the --port argument to 51821 in the Kilo DaemonSet and manually removed the existing interface with ip link del kilo0, but Kilo still seems to be trying to use 51820 no matter what 🤔 I've checked with netstat, and nothing is listening on 51821.

I'm going to try disabling KubeSpan to verify that this is the issue, but I'm not sure why Kilo isn't respecting the --port argument.

EDIT: Confirmed by removing KubeSpan.
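
The suspected conflict can be checked independently of Kilo by trying to bind the UDP ports directly. Below is a minimal Python sketch, assuming it runs in the node's host network namespace (for example from a privileged debug container with host networking) and that a Python interpreter is available there. udp_port_in_use is a hypothetical helper name, and only IPv4 is probed.

```python
import errno
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding a UDP socket to host:port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Any other failure (e.g. permissions) is not a port conflict.
            raise
    return False


if __name__ == "__main__":
    # 51820 is the port shown by wg; 51821 is the alternative tried above.
    for port in (51820, 51821):
        state = "in use" if udp_port_in_use(port) else "free"
        print(f"UDP {port}: {state}")
```

If 51820 reports as in use before Kilo has created its interface, something else on the node already holds that port.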

absolutejam, Dec 12 '24 07:12

I have tested this again today and it seems to have worked without issue now that KubeSpan is disabled.

I will try it with KubeSpan again to see if there's something strange going on.

absolutejam, Dec 16 '24 09:12

@absolutejam did you get confirmation? How did you try to set the --port argument? I don't see it in your snippet for the DaemonSet config.

squat, Feb 14 '25 20:02