sig-windows-tools
No connection to service network in Windows pod when using flannel VXLAN (overlay) network
Describe the bug I'm having a problem installing a Windows node.
- Is there a typo in the documentation? The flannel guide guides/flannel.md contains a reference for installing flannel on Windows:
controlPlaneEndpoint=$(kubectl get configmap -n kube-system kube-proxy -o jsonpath="{.data['kubeconfig\.conf']}" | grep server: | sed 's/.*\:\/\///g')
kubernetesServiceHost=$(echo $controlPlaneEndpoint | cut -d ":" -f 1)
kubernetesServicePort=$(echo $controlPlaneEndpoint | cut -d ":" -f 2)
curl -L https://raw.githubusercontent.com/kubernetes-sigs/sig-windows-tools/master/hostprocess/flannel/flanneld/flannel-overlay.yml | sed 's/FLANNEL_VERSION/v0.21.5/g' | sed "s/KUBERNETES_SERVICE_HOST_VALUE/$kubernetesServiceHost/g" | sed "s/KUBERNETES_SERVICE_PORT_VALUE/$kubernetesServicePort/g" | kubectl apply -f -
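The host/port parsing in the commands above can be illustrated in isolation with a fixed sample endpoint (the value below is hypothetical; on a real cluster it is extracted from the kube-proxy ConfigMap as shown):

```shell
# Hypothetical sample; on a real cluster this comes out of the
# kube-proxy ConfigMap via the kubectl/grep/sed pipeline above.
controlPlaneEndpoint="10.10.13.201:6443"

# Split host and port exactly as the guide's commands do.
kubernetesServiceHost=$(echo "$controlPlaneEndpoint" | cut -d ":" -f 1)
kubernetesServicePort=$(echo "$controlPlaneEndpoint" | cut -d ":" -f 2)

echo "$kubernetesServiceHost"  # 10.10.13.201
echo "$kubernetesServicePort"  # 6443
```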
It refers to version v0.21.5, but the newest version I could find is v0.14.0-hostprocess. I changed it to mik4sa/flannel:v0.21.5, though it is not clear whether this is correct. With this change the proxy and host process come up, but then the next problem appears: no connection to the service network.
- When I install the node and start a pod (see config), I can ping all networks except the service network, so DNS is not working.
To Reproduce On a running cluster, run:
$curl -L https://raw.githubusercontent.com/kubernetes-sigs/sig-windows-tools/master/hostprocess/flannel/flanneld/flannel-overlay.yml | sed 's/sigwindowstools\/flannel:FLANNEL_VERSION/mik4sa\/flannel:v0.21.5/g' | kubectl apply -f -
$curl -L https://raw.githubusercontent.com/kubernetes-sigs/sig-windows-tools/master/hostprocess/flannel/kube-proxy/kube-proxy.yml | sed 's/KUBE_PROXY_VERSION/v1.27.3/g' | kubectl apply -f -
$kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/sig-windows-tools/master/hostprocess/flannel/flanneld/kube-flannel-rbac.yml
$ kubectl get pods -n kube-flannel
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-8vvv2 1/1 Running 1 (14d ago) 14d
kube-flannel-ds-94v42 1/1 Running 1 (4d16h ago) 14d
kube-flannel-ds-hhzhk 1/1 Running 0 14d
kube-flannel-ds-windows-amd64-4wkmb 1/1 Running 0 23h
$ kubectl describe pod kube-flannel-ds-windows-amd64-4wkmb -n kube-flannel
Name: kube-flannel-ds-windows-amd64-4wkmb
Namespace: kube-flannel
Priority: 0
Service Account: flannel
Node: k8t-win-node-1/10.10.13.204
Start Time: Fri, 04 Aug 2023 18:37:22 +0200
Labels: app=flannel
controller-revision-hash=64d67796cc
pod-template-generation=8
tier=node
Annotations: <none>
Status: Running
IP: 10.10.13.204
IPs:
IP: 10.10.13.204
Controlled By: DaemonSet/kube-flannel-ds-windows-amd64
Containers:
kube-flannel:
Container ID: containerd://7b86da67e60a8c0d41b0ecdb6523aa84b542dd85f3c7345ec89ab288e44ca331
Image: mik4sa/flannel:v0.21.5-hostprocess
Image ID: docker.io/mik4sa/flannel@sha256:71b187a72810d9da27d304bbe8557487c69e95c60942f43940074e0d8caecf96
Port: <none>
Host Port: <none>
State: Running
Started: Fri, 04 Aug 2023 18:37:24 +0200
Ready: True
Restart Count: 0
Environment:
CNI_BIN_PATH: C:\\opt\\cni\\bin
CNI_CONFIG_PATH: C:\\etc\\cni\\net.d
SERVICE_SUBNET: 10.96.0.0/12
KUBERNETES_SERVICE_HOST: 10.10.13.201
KUBERNETES_SERVICE_PORT: 6443
POD_NAME: kube-flannel-ds-windows-amd64-4wkmb (v1:metadata.name)
POD_NAMESPACE: kube-flannel (v1:metadata.namespace)
Mounts:
/mounts/kube-flannel-windows/ from flannel-windows-cfg (rw)
/mounts/kube-flannel/ from flannel-cfg (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sdkzs (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
flannel-cfg:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-flannel-cfg
Optional: false
flannel-windows-cfg:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-flannel-windows-cfg
Optional: false
kube-api-access-sdkzs:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
CriticalAddonsOnly op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events: <none>
Using this test pod config:
$ cat winTest.yaml
# windows-pod-with-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
spec:
  storageClassName: synology-iscsi-win
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1G
---
apiVersion: v1
kind: Pod
metadata:
  name: my-windows-pod
spec:
  containers:
    - name: windows-server-container
      image: mcr.microsoft.com/windows/servercore:ltsc2019
      command:
        - powershell.exe
      args:
        - "-NoLogo"
        - "-Command"
        - "while ($true) { Write-Host 'Hello from Windows Server 2019'; Start-Sleep -Seconds 5 }"
      volumeMounts:
        - name: my-pvc-volume
          mountPath: "D:"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                  - windows
  volumes:
    - name: my-pvc-volume
      persistentVolumeClaim:
        claimName: test-claim
and run these tests:
$ kubectl get pods -n kube-system -o=wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-f47c568f5-l4twx 1/1 Running 0 7d 10.244.0.20 k8t-master-1 <none> <none>
coredns-f47c568f5-wzpcv 1/1 Running 0 7d 10.244.1.31 k8t-node-1 <none> <none>
etcd-k8t-master-1 1/1 Running 2 (14d ago) 33d 10.10.13.201 k8t-master-1 <none> <none>
kube-apiserver-k8t-master-1 1/1 Running 368 14d 10.10.13.201 k8t-master-1 <none> <none>
kube-controller-manager-k8t-master-1 1/1 Running 3 (14d ago) 33d 10.10.13.201 k8t-master-1 <none> <none>
kube-proxy-4zzsq 1/1 Running 1 (14d ago) 33d 10.10.13.202 k8t-node-1 <none> <none>
kube-proxy-7lkjg 1/1 Running 2 (14d ago) 33d 10.10.13.201 k8t-master-1 <none> <none>
kube-proxy-8djmb 1/1 Running 2 (4d16h ago) 33d 10.10.13.203 k8t-node-2 <none> <none>
kube-proxy-windows-9rqgv 1/1 Running 5 (13d ago) 14d 10.10.13.204 k8t-win-node-1 <none> <none>
kube-scheduler-k8t-master-1 1/1 Running 3 (14d ago) 33d 10.10.13.201 k8t-master-1 <none> <none>
snapshot-controller-9695c8478-4xbdj 1/1 Running 438 (4d16h ago) 31d 10.244.2.33 k8t-node-2 <none> <none>
snapshot-controller-9695c8478-cn6lt 1/1 Running 1 (14d ago) 31d 10.244.1.18 k8t-node-1 <none> <none>
$ kubectl get pods -n kube-flannel -o=wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel-ds-8vvv2 1/1 Running 1 (14d ago) 14d 10.10.13.201 k8t-master-1 <none> <none>
kube-flannel-ds-94v42 1/1 Running 1 (4d16h ago) 14d 10.10.13.203 k8t-node-2 <none> <none>
kube-flannel-ds-hhzhk 1/1 Running 0 14d 10.10.13.202 k8t-node-1 <none> <none>
kube-flannel-ds-windows-amd64-4wkmb 1/1 Running 0 23h 10.10.13.204 k8t-win-node-1 <none> <none>
$kubectl exec -it my-windows-pod -- powershell
PS C:\> ipconfig /all
Windows IP Configuration
Host Name . . . . . . . . . . . . : my-windows-pod
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : default.svc.cluster.local
svc.cluster.local
cluster.local
Ethernet adapter vEthernet (5195b5da3e3bb0b8f92bcbdfce384d2c2a7eac5e55220691f91bdb64dd671f1a_flannel.4096):
Connection-specific DNS Suffix . : default.svc.cluster.local
Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #4
Physical Address. . . . . . . . . : 00-15-5D-34-6F-58
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
Link-local IPv6 Address . . . . . : fe80::3c2d:f9da:9aec:d253%29(Preferred)
IPv4 Address. . . . . . . . . . . : 10.244.4.28(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 10.244.4.1
DNS Servers . . . . . . . . . . . : 10.96.0.10
NetBIOS over Tcpip. . . . . . . . : Disabled
Connection-specific DNS Suffix Search List :
default.svc.cluster.local
svc.cluster.local
cluster.local
PS C:\> nslookup www.google.de
DNS request timed out.
timeout was 2 seconds.
Server: UnKnown
Address: 10.96.0.10
DNS request timed out.
timeout was 2 seconds.
PS C:\> nslookup www.google.de 10.10.13.1 # 10.10.13.1 is my external dns
Server: UnKnown
Address: 10.10.13.1
Non-authoritative answer:
Name: www.google.de
Addresses: 2a00:1450:4016:80c::2003
172.217.16.163
PS C:\> nslookup www.google.de 10.244.0.20
Server: 10-244-0-20.kube-dns.kube-system.svc.cluster.local
Address: 10.244.0.20
Non-authoritative answer:
Name: www.google.de
Addresses: 2a00:1450:4016:80c::2003
172.217.16.163
PS C:\> Test-NetConnection -ComputerName 10.96.0.10 -Port 53 -InformationLevel Detailed
WARNING: TCP connect to (10.96.0.10 : 53) failed
WARNING: Ping to 10.96.0.10 failed with status: TimedOut
ComputerName : 10.96.0.10
RemoteAddress : 10.96.0.10
RemotePort : 53
NameResolutionResults : 10.96.0.10
MatchingIPsecRules :
NetworkIsolationContext :
InterfaceAlias : vEthernet (5195b5da3e3bb0b8f92bcbdfce384d2c2a7eac5e55220691f91bdb64dd671f1a_flannel.4096)
SourceAddress : 10.244.4.28
NetRoute (NextHop) : 10.244.4.1
PingSucceeded : False
PingReplyDetails (RTT) : 0 ms
TcpTestSucceeded : False
PS C:\> Test-NetConnection -ComputerName 10.10.13.1 -Port 53 -InformationLevel Detailed
ComputerName : 10.10.13.1
RemoteAddress : 10.10.13.1
RemotePort : 53
NameResolutionResults : 10.10.13.1
MatchingIPsecRules :
NetworkIsolationContext :
InterfaceAlias : vEthernet (5195b5da3e3bb0b8f92bcbdfce384d2c2a7eac5e55220691f91bdb64dd671f1a_flannel.4096)
SourceAddress : 10.244.4.28
NetRoute (NextHop) : 10.244.4.1
TcpTestSucceeded : True
PS C:\> Test-NetConnection -ComputerName 10.244.0.20 -Port 53 -InformationLevel Detailed
ComputerName : 10.244.0.20
RemoteAddress : 10.244.0.20
RemotePort : 53
NameResolutionResults : 10.244.0.20
MatchingIPsecRules :
NetworkIsolationContext :
InterfaceAlias : vEthernet (5195b5da3e3bb0b8f92bcbdfce384d2c2a7eac5e55220691f91bdb64dd671f1a_flannel.4096)
SourceAddress : 10.244.4.28
NetRoute (NextHop) : 10.244.4.1
TcpTestSucceeded : True
Expected behavior PS C:\> nslookup www.google.de should work.
Kubernetes (please complete the following information):
- Windows Server version: inside the pod:
PS C:\> Get-ComputerInfo | Select-Object WindowsVersion
WindowsVersion
--------------
1809
Outside the pod:
PS C:\> Get-ComputerInfo | Select-Object WindowsVersion
WindowsVersion
--------------
1809
- Kubernetes Version:
$ kubectl version --output=yaml
clientVersion:
  buildDate: "2023-06-14T09:53:42Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/arm64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-14T09:47:40Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/arm64
- CNI:
$ kubectl get daemonsets -n kube-flannel -o=wide
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
kube-flannel-ds 3 3 3 3 3 <none> 14d kube-flannel docker.io/flannel/flannel:v0.22.0 app=flannel
kube-flannel-ds-windows-amd64 1 1 1 1 1 <none> 14d kube-flannel mik4sa/flannel:v0.21.5-hostprocess app=flannel
Additional context I tried testing it with uweerikmartin/flannel but had no success; I get an error:
$ kubectl logs kube-flannel-ds-windows-amd64-qtgvk -n kube-flannel
Copying SDN CNI binaries to host
Directory: C:\opt\cni
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 7/2/2023 3:25 PM bin
copy flannel config
Directory: C:\etc
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 7/2/2023 4:38 PM kube-flannel
Directory: C:\etc\kube-flannel
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 8/5/2023 9:16 AM 109 net-conf.json
Directory: C:\hpc\mounts\kube-flannel
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 8/5/2023 9:16 AM ..2023_08_05_16_16_31.325047069
d----l 8/5/2023 9:16 AM ..data
-a---l 8/5/2023 9:16 AM 0 cni-conf.json
-a---l 8/5/2023 9:16 AM 0 net-conf.json
update cni config
get-content : Cannot find path 'C:\hpc\mounts\kubeadm-config\ClusterConfiguration' because it does not exist.
At C:\hpc\flannel\start.ps1:18 char:18
+ ... iceSubnet = get-content $env:CONTAINER_SANDBOX_MOUNT_POINT/mounts/kub ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (C:\hpc\mounts\k...erConfiguration:String) [Get-Content], ItemNotFoundEx
ception
+ FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetContentCommand
I'll check whether I can solve the error.
Tested with oguertlertt/flannel:v0.22.0. Same problem. 👎 No clue what's wrong.
$ kubectl get daemonsets -n kube-flannel -o=wide
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
kube-flannel-ds 3 3 3 3 3 <none> 14d kube-flannel docker.io/flannel/flannel:v0.22.0 app=flannel
kube-flannel-ds-windows-amd64 1 1 1 1 1 <none> 14d kube-flannel oguertlertt/flannel:v0.22.0-hostprocess app=flannel
I get these errors in the log file of the host process:
$ kubectl logs kube-flannel-ds-windows-amd64-bf4bc -n kube-flannel -f
Copying SDN CNI binaries to host
Directory: C:\opt\cni
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 7/2/2023 3:25 PM bin
copy flannel config
Directory: C:\etc
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 7/2/2023 4:38 PM kube-flannel
Directory: C:\etc\kube-flannel
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 8/5/2023 9:39 AM 109 net-conf.json
Directory: C:\hpc\mounts\kube-flannel
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 8/5/2023 9:45 AM ..2023_08_05_16_45_21.1802066358
d----l 8/5/2023 9:45 AM ..data
-a---l 8/5/2023 9:45 AM 0 cni-conf.json
-a---l 8/5/2023 9:45 AM 0 net-conf.json
update cni config
Directory: C:\etc\cni
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 7/2/2023 4:38 PM net.d
add route
The route addition failed: The object already exists.
envs
kube-flannel-ds-windows-amd64-bf4bc
kube-flannel
Starting flannel
I0805 09:45:24.184051 246316 main.go:212] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[10.10.13.204] ifaceRegex:[] ipMasq:false ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true useMultiClusterCidr:false}
W0805 09:45:24.186802 246316 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0805 09:45:24.254216 246316 kube.go:486] Starting kube subnet manager
I0805 09:45:24.254216 246316 kube.go:145] Waiting 10m0s for node controller to sync
I0805 09:45:24.267665 246316 kube.go:507] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.0.0/24]
I0805 09:45:24.267665 246316 kube.go:507] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.1.0/24]
I0805 09:45:24.267665 246316 kube.go:507] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.2.0/24]
I0805 09:45:24.267665 246316 kube.go:507] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.4.0/24]
I0805 09:45:25.255069 246316 kube.go:152] Node controller sync successful
I0805 09:45:25.255069 246316 main.go:232] Created subnet manager: Kubernetes Subnet Manager - k8t-win-node-1
I0805 09:45:25.255069 246316 main.go:235] Installing signal handlers
I0805 09:45:25.255713 246316 main.go:543] Found network config - Backend type: vxlan
I0805 09:45:25.256988 246316 match.go:73] Searching for interface using 10.10.13.204
I0805 09:45:25.272671 246316 match.go:259] Using interface with name vEthernet (Ethernet) and address 10.10.13.204
I0805 09:45:25.275260 246316 match.go:281] Defaulting external address to interface address (10.10.13.204)
I0805 09:45:25.275327 246316 vxlan_windows.go:126] VXLAN config: Name=flannel.4096 MacPrefix=0E-2A VNI=4096 Port=4789 GBP=false DirectRouting=false
time="2023-08-05T09:45:25-07:00" level=info msg="HCN feature check" supportedFeatures="{Acl:{AclAddressLists:true AclNoHostRulePriority:true AclPortRanges:true AclRuleId:true} Api:{V1:true V2:true} RemoteSubnet:true HostRoute:true DSR:true Slash32EndpointPrefixes:true AclSupportForProtocol252:false SessionAffinity:false IPv6DualStack:false SetPolicy:false VxlanPort:false L4Proxy:true L4WfpProxy:false TierAcl:false NetworkACL:false NestedIpSet:false}" version="{Major:9 Minor:5}"
I0805 09:45:25.354942 246316 kube.go:507] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.4.0/24]
I0805 09:45:25.354994 246316 device_windows.go:103] Found existing HostComputeNetwork flannel.4096
I0805 09:45:25.381913 246316 main.go:408] Changing default FORWARD chain policy to ACCEPT
I0805 09:45:25.383161 246316 kube.go:507] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.4.0/24]
I0805 09:45:25.383739 246316 main.go:436] Wrote subnet file to /run/flannel/subnet.env
I0805 09:45:25.383739 246316 main.go:440] Running backend.
I0805 09:45:25.383739 246316 vxlan_network_windows.go:63] Watching for new subnet leases
I0805 09:45:25.383739 246316 subnet.go:159] Batch elem [0] is { lease.Event{Type:0, Lease:lease.Lease{EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net{IP:0xaf40000, PrefixLen:0x18}, IPv6Subnet:ip.IP6Net{IP:(*ip.IP6)(nil), PrefixLen:0x0}, Attrs:lease.LeaseAttrs{PublicIP:0xa0a0dc9, PublicIPv6:(*ip.IP6)(nil), BackendType:"vxlan", BackendData:json.RawMessage{0x7b, 0x22, 0x56, 0x4e, 0x49, 0x22, 0x3a, 0x34, 0x30, 0x39, 0x36, 0x2c, 0x22, 0x56, 0x74, 0x65, 0x70, 0x4d, 0x41, 0x43, 0x22, 0x3a, 0x22, 0x38, 0x32, 0x3a, 0x62, 0x66, 0x3a, 0x66, 0x32, 0x3a, 0x33, 0x65, 0x3a, 0x30, 0x65, 0x3a, 0x62, 0x33, 0x22, 0x7d}, BackendV6Data:json.RawMessage(nil)}, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0}} }
I0805 09:45:25.383739 246316 subnet.go:159] Batch elem [0] is { lease.Event{Type:0, Lease:lease.Lease{EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net{IP:0xaf40100, PrefixLen:0x18}, IPv6Subnet:ip.IP6Net{IP:(*ip.IP6)(nil), PrefixLen:0x0}, Attrs:lease.LeaseAttrs{PublicIP:0xa0a0dca, PublicIPv6:(*ip.IP6)(nil), BackendType:"vxlan", BackendData:json.RawMessage{0x7b, 0x22, 0x56, 0x4e, 0x49, 0x22, 0x3a, 0x34, 0x30, 0x39, 0x36, 0x2c, 0x22, 0x56, 0x74, 0x65, 0x70, 0x4d, 0x41, 0x43, 0x22, 0x3a, 0x22, 0x64, 0x32, 0x3a, 0x30, 0x33, 0x3a, 0x35, 0x62, 0x3a, 0x34, 0x32, 0x3a, 0x32, 0x30, 0x3a, 0x33, 0x39, 0x22, 0x7d}, BackendV6Data:json.RawMessage(nil)}, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0}} }
I0805 09:45:25.388437 246316 subnet.go:159] Batch elem [0] is { lease.Event{Type:0, Lease:lease.Lease{EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net{IP:0xaf40200, PrefixLen:0x18}, IPv6Subnet:ip.IP6Net{IP:(*ip.IP6)(nil), PrefixLen:0x0}, Attrs:lease.LeaseAttrs{PublicIP:0xa0a0dcb, PublicIPv6:(*ip.IP6)(nil), BackendType:"vxlan", BackendData:json.RawMessage{0x7b, 0x22, 0x56, 0x4e, 0x49, 0x22, 0x3a, 0x34, 0x30, 0x39, 0x36, 0x2c, 0x22, 0x56, 0x74, 0x65, 0x70, 0x4d, 0x41, 0x43, 0x22, 0x3a, 0x22, 0x38, 0x32, 0x3a, 0x61, 0x33, 0x3a, 0x35, 0x39, 0x3a, 0x36, 0x63, 0x3a, 0x63, 0x36, 0x3a, 0x62, 0x61, 0x22, 0x7d}, BackendV6Data:json.RawMessage(nil)}, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0}} }
I0805 09:45:25.426776 246316 main.go:461] Waiting for all goroutines to exit
....
I have no time this weekend to assist you. I haven't read your description deeply tbh, but try to:
- build your own images for flannel and kube-proxy (I'm using v0.21.5 for flannel currently)
- Exactly follow the guide step by step but use your own images
- Then everything should work
Note: The RBAC file you are executing is no longer needed and should be deleted from the cluster when upgrading
Okay, I'll check this, but unfortunately I'm very new to Kubernetes/Docker, so building a Windows image is difficult :-) I'll try it and come back to you.
Note: The RBAC file you are executing is no longer needed and should be deleted from the cluster when upgrading. I've seen this and will check whether the RBAC file is the correct one.
Do you know why there is no newer version in sigwindowstools, and whether this is an error in the documentation?
@uli-fischer I built this recently: docker.io/syck0/flannel:v0.21.5-hostprocess. It works for me. Try it out
Use your own images, or use [Mik4sa] v0.21.5 for flannel currently. Please see #336
OK I've tested it with all versions mentioned. No changes here. Upon further investigation, I found this error in the Windows kube proxy. I think that could be the error, but have no idea what's wrong.
I0822 07:12:51.168375 23324 config.go:133] "Calling handler.OnEndpointSliceUpdate"
I0822 07:13:01.157594 23324 config.go:133] "Calling handler.OnEndpointSliceUpdate"
I0822 07:13:11.162851 23324 config.go:133] "Calling handler.OnEndpointSliceUpdate"
I0822 07:13:14.464328 23324 hns.go:135] "Queried endpoints from network" network="flannel.4096"
I0822 07:13:14.464441 23324 hns.go:136] "Queried endpoints details" network="flannel.4096" endpointInfos=map[10.244.7.3:10.244.7.3:0 8f57c1ba-d61d-4a9c-9a92-5dadf07250dc:10.244.7.3:0]
I0822 07:13:14.464441 23324 hns.go:306] "Queried load balancers" count=0
E0822 07:13:14.477518 23324 proxier.go:1236] "Source Vip endpoint creation failed" err="hcnCreateEndpoint failed in Win32: IP address is either invalid or not part of any configured subnet(s). (0x803b001e) {\"Success\":false,\"Error\":\"IP address is either invalid or not part of any configured subnet(s). \",\"ErrorCode\":2151350302}"
I0822 07:13:14.477693 23324 proxier.go:1177] "Syncing proxy rules complete" elapsed="18.6334ms"
I0822 07:13:14.477693 23324 bounded_frequency_runner.go:296] sync-runner: ran, next possible in 1s, periodic in 30s
Hi Pexeus, sorry, I haven't found a solution for this up to now. My next step is to build my own images and try it once more, but I've had no time so far. If you find a solution, please let me know.
How did you initialize your cluster with kubeadm? Do you still have the exact command?
Hi,
As documented, I ran sudo kubeadm init --pod-network-cidr=10.244.0.0/16 on the Debian master node.
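The --pod-network-cidr value has to match the Network field in flannel's net-conf.json (10.244.0.0/16 here). A parsing sketch against a hypothetical copy of the stored ClusterConfiguration (on a live cluster the data would come from the kubeadm-config ConfigMap in kube-system):

```shell
# Hypothetical ClusterConfiguration fragment; a live cluster stores this in
# the kubeadm-config ConfigMap (kubectl get cm kubeadm-config -n kube-system).
clusterConfiguration='networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12'

# Pull out the two subnets kubeadm recorded.
podSubnet=$(printf '%s\n' "$clusterConfiguration" | grep podSubnet | awk '{print $2}')
serviceSubnet=$(printf '%s\n' "$clusterConfiguration" | grep serviceSubnet | awk '{print $2}')

# flannel's net-conf.json must use the same pod CIDR,
# and SERVICE_SUBNET on the Windows node must match serviceSubnet.
echo "pod CIDR: $podSubnet, service CIDR: $serviceSubnet"
```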
Hmm, this is actually the same command I used.
troubleshooting this issue as well... trying out a bunch of new stuff... considering refactoring my setup to host-gw...
- k8s v1.28.2
- windows server 2022
- flannel v0.24.0
As for this error in kube-proxy:
E0822 07:13:14.477518 23324 proxier.go:1236] "Source Vip endpoint creation failed" err="hcnCreateEndpoint failed in Win32: IP address is either invalid or not part of any configured subnet(s). (0x803b001e) {\"Success\":false,\"Error\":\"IP address is either invalid or not part of any configured subnet(s). \",\"ErrorCode\":2151350302}"
Check the kube-proxy start script. Did you unjoin and rejoin your Windows worker node to the cluster? flannel probably decided to pick a new 10.244.X.0/24 subnet for your node. The logic in the script checks for an existing file:
https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/hostprocess/flannel/kube-proxy/start.ps1#L9-L10
Try deleting the contents of C:\sourcevip and restarting the Windows kube-proxy.
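The failure mode can be sketched as a staleness check (all values below are hypothetical; the real start script simply reuses whatever cached source VIP file it finds):

```shell
# Hypothetical values for illustration: the cached source VIP was allocated
# from the node's old pod subnet, but flannel has since leased a new PodCIDR.
cachedSourceVip="10.244.4.2"     # what a stale C:\sourcevip cache would hold
currentPodCidr="10.244.7.0/24"   # the node's current .spec.podCIDR

# Compare the /24 prefixes; a mismatch means the cached VIP is no longer
# part of any configured subnet, which matches the HNS error above.
cachedPrefix=$(echo "$cachedSourceVip" | cut -d "." -f 1-3)
currentPrefix=$(echo "$currentPodCidr" | cut -d "/" -f 1 | cut -d "." -f 1-3)

if [ "$cachedPrefix" != "$currentPrefix" ]; then
  verdict="stale source VIP: clear the cache and restart kube-proxy"
else
  verdict="source VIP still valid"
fi
echo "$verdict"
```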
As for the reported issue, the following stands out to me. Inside the test pod my-windows-pod, the vEthernet adapter looks properly configured...
PS C:\> ipconfig /all
Windows IP Configuration
Host Name . . . . . . . . . . . . : my-windows-pod
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : development.svc.cluster.local
svc.cluster.local
cluster.local
Ethernet adapter vEthernet (d804a0f1ccc4bceb0754f85022a8a16fb9db520b948689f7a4d9ba4b26c44082_flannel.4096):
Connection-specific DNS Suffix . : development.svc.cluster.local
Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Container Adapter #4
Physical Address. . . . . . . . . : 00-15-5D-CD-19-B6
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
Link-local IPv6 Address . . . . . : fe80::be82:2ea:9ff1:e51f%53(Preferred)
IPv4 Address. . . . . . . . . . . : 10.244.11.7(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 10.244.11.1
DNS Servers . . . . . . . . . . . : 10.96.0.10
NetBIOS over Tcpip. . . . . . . . : Disabled
Connection-specific DNS Suffix Search List :
development.svc.cluster.local
svc.cluster.local
cluster.local
... but the routes are screwed up. I expect to see something for 10.244.0.0/16, at least on ifIndex 53.
PS C:\> Get-NetRoute
ifIndex DestinationPrefix NextHop RouteMetric ifMetric PolicyStore
------- ----------------- ------- ----------- -------- -----------
53 255.255.255.255/32 0.0.0.0 256 25 ActiveStore
52 255.255.255.255/32 0.0.0.0 256 75 ActiveStore
53 224.0.0.0/4 0.0.0.0 256 25 ActiveStore
52 224.0.0.0/4 0.0.0.0 256 75 ActiveStore
52 127.255.255.255/32 0.0.0.0 256 75 ActiveStore
52 127.0.0.1/32 0.0.0.0 256 75 ActiveStore
52 127.0.0.0/8 0.0.0.0 256 75 ActiveStore
53 10.244.11.255/32 0.0.0.0 256 25 ActiveStore
53 10.244.11.7/32 0.0.0.0 256 25 ActiveStore
53 10.244.11.0/24 0.0.0.0 256 25 ActiveStore
53 0.0.0.0/0 10.244.11.1 256 25 ActiveStore
Everything on the Linux side is working fine.
My theory is:
- some of the internal Windows flannel machinery isn't creating routes properly, or isn't logging it
- my CNI configuration is wrong, although I started with the example in this repository and haven't deviated much
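The missing-route theory can be checked mechanically. A minimal sketch (hypothetical helper, shell integer math) that tests whether an address is covered by a route prefix, using addresses from the output above:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip2int() {
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP CIDR -> prints "yes" if IP falls inside CIDR, else "no".
in_cidr() {
  local ip net prefix mask
  ip=$(ip2int "$1")
  net=$(ip2int "${2%/*}")
  prefix=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  if [ $(( ip & mask )) -eq $(( net & mask )) ]; then echo yes; else echo no; fi
}

# The CoreDNS pod (10.244.0.20) is NOT covered by the node-local
# 10.244.11.0/24 route, so without a 10.244.0.0/16 route its traffic can
# only fall through to the 0.0.0.0/0 default.
in_cidr 10.244.0.20 10.244.11.0/24   # no
in_cidr 10.244.0.20 10.244.0.0/16    # yes
```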
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.