Failed to connect error after migrating to external ETCD

Open Kras4ooo opened this issue 8 months ago • 2 comments

Summary

After migrating to an external etcd cluster two days ago, I started seeing these warnings in the kubelite log:

Apr  1 13:09:25 minisforum microk8s.daemon-kubelite[779842]: W0401 13:09:25.263424  779842 logging.go:55] [core] [Channel #4168 SubChannel #4170]grpc: addrConn.createTransport failed to connect to {Addr: "192.168.89.252:2379", ServerName: "192.168.89.252:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 192.168.89.252:2379: operation was canceled"
Apr  1 13:09:25 minisforum microk8s.daemon-kubelite[779842]: W0401 13:09:25.263424  779842 logging.go:55] [core] [Channel #4168 SubChannel #4171]grpc: addrConn.createTransport failed to connect to {Addr: "192.168.0.35:2379", ServerName: "192.168.0.35:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 192.168.0.35:2379: operation was canceled"
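To see at a glance which endpoints the warnings refer to, the log can be tallied per address. This is a minimal sketch, assuming the warnings keep the `failed to connect to {Addr: "host:port"` shape shown above; the function name `failing_endpoints` is made up for illustration.

```python
import re
from collections import Counter

# Captures the host:port inside gRPC's
# 'failed to connect to {Addr: "host:port", ...}' warning.
ADDR_RE = re.compile(r'failed to connect to \{Addr: "([^"]+)"')


def failing_endpoints(log_text: str) -> Counter:
    """Count connection-failure warnings per endpoint (hypothetical helper)."""
    return Counter(ADDR_RE.findall(log_text))


# Shortened copies of the two warnings quoted in this issue.
sample = (
    'grpc: addrConn.createTransport failed to connect to '
    '{Addr: "192.168.89.252:2379", ServerName: "192.168.89.252:2379", }.\n'
    'grpc: addrConn.createTransport failed to connect to '
    '{Addr: "192.168.0.35:2379", ServerName: "192.168.0.35:2379", }.\n'
)

for endpoint, count in sorted(failing_endpoints(sample).items()):
    print(endpoint, count)
```

Feeding it the full kubelite log instead of `sample` would show whether 192.168.88.29:2379 ever appears in these warnings.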

I can confirm that the nodes can reach each other. I verified this with curl against each etcd endpoint:

curl   --cacert /etcd/certs/ca/ca.pem   --cert /etcd/certs/ca/ca.pem   --key /etcd/certs/ca/ca-key.pem   -L https://192.168.88.29:2379/metrics

curl   --cacert /etcd/certs/ca/ca.pem   --cert /etcd/certs/ca/ca.pem   --key /etcd/certs/ca/ca-key.pem   -L https://192.168.0.35:2379/metrics

curl   --cacert /etcd/certs/ca/ca.pem   --cert /etcd/certs/ca/ca.pem   --key /etcd/certs/ca/ca-key.pem   -L https://192.168.89.252:2379/metrics

Tested via telnet, too.
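A scripted equivalent of the telnet test could look like the sketch below. It only checks that a TCP connection can be opened within a timeout; it does not perform the TLS handshake that curl does. The helper name `can_connect` and the 3-second default are assumptions for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example use against the three etcd nodes from this issue:
#   for host in ("192.168.88.29", "192.168.89.252", "192.168.0.35"):
#       print(host, can_connect(host, 2379))
```

Running it from the MicroK8s host shows whether the plain TCP path to each node works from that machine.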

This is my kube-apiserver file:

--cert-dir=${SNAP_DATA}/certs
--service-cluster-ip-range=10.152.183.0/24
--authorization-mode=RBAC,Node
# --authorization-mode=AlwaysAllow
--service-account-key-file=/var/snap/microk8s/aws-certs/oidc-issuer.pub
# --service-account-key-file=${SNAP_DATA}/certs/serviceaccount.key
--client-ca-file=${SNAP_DATA}/certs/ca.crt
--tls-cert-file=${SNAP_DATA}/certs/server.crt
--tls-private-key-file=${SNAP_DATA}/certs/server.key
--tls-cipher-suites=TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,TLS_RSA_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384
--kubelet-client-certificate=${SNAP_DATA}/certs/apiserver-kubelet-client.crt
--kubelet-client-key=${SNAP_DATA}/certs/apiserver-kubelet-client.key
--secure-port=16443

--etcd-servers=https://192.168.88.29:2379,https://192.168.89.252:2379,https://192.168.0.35:2379
--etcd-cafile=/etcd/certs/ca/ca.pem
--etcd-certfile=/etcd/certs/ca/ca.pem
--etcd-keyfile=/etcd/certs/ca/ca-key.pem

--allow-privileged=true
--service-account-issuer='https://s3.eu-central-1.amazonaws.com/skydevs-de-oidc/minisforum'
# --service-account-issuer='https://kubernetes.default.svc'
--api-audiences=sts.amazonaws.com
--service-account-signing-key-file=/var/snap/microk8s/aws-certs/oidc-issuer.key
# --service-account-signing-key-file=${SNAP_DATA}/certs/serviceaccount.key
--event-ttl=5m
--profiling=false

# Enable the aggregation layer
--requestheader-client-ca-file=${SNAP_DATA}/certs/front-proxy-ca.crt
--requestheader-allowed-names=front-proxy-client
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--proxy-client-cert-file=${SNAP_DATA}/certs/front-proxy-client.crt
--proxy-client-key-file=${SNAP_DATA}/certs/front-proxy-client.key
#~Enable the aggregation layer
--enable-admission-plugins=EventRateLimit
--admission-control-config-file=${SNAP_DATA}/args/admission-control-config-file.yaml
--kubelet-certificate-authority=${SNAP_DATA}/certs/ca.crt
--kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
--feature-gates=WatchList=false

For some reason, it connects to 192.168.88.29:2379 but fails to connect to the other two endpoints. What could be the issue? PS: MicroK8s runs on the same instance as one of the etcd nodes (192.168.88.29).
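To double-check which endpoints and certificate paths the API server is actually given, the arguments file can be parsed. This is a minimal sketch, assuming the file uses one `--flag=value` per line with `#` comments, as in the listing above; the helper names are hypothetical.

```python
def parse_args_file(text: str) -> dict:
    """Parse '--flag=value' lines, skipping blank lines and '#' comments."""
    flags = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("--"):
            continue  # blank line or comment
        name, _, value = line[2:].partition("=")
        flags[name] = value.strip("'\"")  # drop optional surrounding quotes
    return flags


def etcd_endpoints(flags: dict) -> list:
    """Split the comma-separated --etcd-servers value into a list."""
    return [e for e in flags.get("etcd-servers", "").split(",") if e]


# A few lines copied from the listing in this issue.
sample = """\
--secure-port=16443
# --authorization-mode=AlwaysAllow
--etcd-servers=https://192.168.88.29:2379,https://192.168.89.252:2379,https://192.168.0.35:2379
--etcd-cafile=/etcd/certs/ca/ca.pem
"""

flags = parse_args_file(sample)
print(etcd_endpoints(flags))
print(flags["etcd-cafile"])
```

The parsed list can then be compared against the addresses that show up in the warnings.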

Thanks in advance for your help!

Introspection Report

inspection-report-20250401_131733.tar.gz

Kras4ooo, Apr 01 '25 13:04

Hello @Kras4ooo,

Thanks for filing your issue and attaching the inspection report.

Can you perform a health check on your etcd nodes, as described in the etcd docs? https://etcd.io/docs/v3.5/tutorials/how-to-check-cluster-status/
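For reference, a health check of that kind could be assembled roughly as follows. This is a sketch, not the exact command from the linked page: it assumes `etcdctl` (v3 API) is installed on the node and reuses the endpoints and certificate paths quoted earlier in this issue. The function name `etcdctl_health_cmd` is made up for illustration.

```python
import shlex


def etcdctl_health_cmd(endpoints, cacert, cert, key):
    """Build the argv for `etcdctl endpoint health` against TLS endpoints."""
    return [
        "etcdctl",
        "--endpoints=" + ",".join(endpoints),
        "--cacert=" + cacert,
        "--cert=" + cert,
        "--key=" + key,
        "endpoint",
        "health",
    ]


cmd = etcdctl_health_cmd(
    [
        "https://192.168.88.29:2379",
        "https://192.168.89.252:2379",
        "https://192.168.0.35:2379",
    ],
    cacert="/etcd/certs/ca/ca.pem",
    cert="/etcd/certs/ca/ca.pem",  # same path the issue passes as --etcd-certfile
    key="/etcd/certs/ca/ca-key.pem",
)

# Print a copy-pasteable command line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Replacing the last argument with `status` (optionally followed by `--write-out=table`) would show per-member status instead of health.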

Best regards, Louise

louiseschmidtgen, Apr 03 '25 06:04

@louiseschmidtgen

[Three images attached]

Kras4ooo, Apr 03 '25 16:04