Install Meridio on kind cluster: proxy-load-balancer-a1-xxx stuck failed Readiness probe

Open dezenxi opened this issue 1 year ago • 7 comments

Hi, I'm following the instructions at https://meridio.nordix.org/docs/demo/multus-kind-ovs/#installation to install Meridio on a kind cluster.

However, some pods are stuck in a not-ready state, as shown below:

NAME                                            READY   STATUS                  RESTARTS      AGE
ipam-trench-a-0                                 1/1     Running                 0             6m41s
meridio-operator-548db54687-njlcw               1/1     Running                 0             7m35s
nsp-trench-a-0                                  1/1     Running                 0             6m41s
proxy-load-balancer-a1-9mfq6                    0/1     Running                 0             6m40s
proxy-load-balancer-a1-p5mrg                    0/1     Running                 0             6m39s
stateless-lb-frontend-attr-1-7b6d9d56f7-6gtd4   0/2     Init:CrashLoopBackOff   6 (35s ago)   6m41s
stateless-lb-frontend-attr-1-7b6d9d56f7-9x4sv   0/2     Init:CrashLoopBackOff   6 (42s ago)   6m41s
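To pull only the not-ready pods out of a listing like the one above, a small filter on the READY column can help. This is a minimal sketch: the function name `not_ready_pods` is hypothetical, and it assumes the default `kubectl get pods` column layout (header line first, READY in the second column).

```shell
# List pods whose READY column shows fewer ready containers than expected.
# Reads `kubectl get pods` style output (header line plus rows) on stdin.
# not_ready_pods is a hypothetical helper name, not part of kubectl or Meridio.
not_ready_pods() {
  awk 'NR > 1 && NF >= 2 {
    split($2, r, "/")              # READY looks like "0/2"
    if (r[1] + 0 < r[2] + 0) print $1
  }'
}

# Typical use (the namespace "red" is the one used in this thread):
#   kubectl get pods -n red | not_ready_pods
```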

k describe pod proxy-load-balancer-a1-9mfq6 -n red

    Warning  Unhealthy  17s (x29 over 4m17s)  kubelet  Readiness probe failed: service unhealthy (responded with "NOT_SERVING")

k describe pod stateless-lb-frontend-attr-1-7b6d9d56f7-6gtd4 -n red

    Args:
      -c
      sysctl -w net.ipv4.conf.all.forwarding=1 ; sysctl -w net.ipv4.fib_multipath_hash_policy=1 ; sysctl -w net.ipv4.conf.all.rp_filter=0 ; sysctl -w net.ipv4.conf.default.rp_filter=0 ; sysctl -w net.ipv6.conf.all.forwarding=1 ; sysctl -w net.ipv6.fib_multipath_hash_policy=1
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 13 Jun 2023 23:35:43 +1000
      Finished:     Tue, 13 Jun 2023 23:35:43 +1000
    Ready:          False

    Warning  BackOff  11m (x4 over 12m)  kubelet  Back-off restarting failed container
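One way to narrow down an init container that exits with code 1 on a chain of `sysctl -w` calls is to check whether each key exists at all in the kernel and network namespace where it runs. The sketch below covers only that one failure mode (a missing key), not permission problems or a read-only /proc/sys. The helper names `sysctl_key_to_path` and `missing_sysctls` are hypothetical, and the sketch assumes that no component of a key contains a dot.

```shell
# Convert a sysctl key such as net.ipv4.conf.all.rp_filter to its /proc/sys path.
# Assumption: no component of the key (e.g. an interface name) contains a dot.
sysctl_key_to_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print every key given as an argument that does not exist under /proc/sys.
missing_sysctls() {
  for key in "$@"; do
    [ -e "$(sysctl_key_to_path "$key")" ] || printf '%s\n' "$key"
  done
}

# Example with the keys from the Args above; run it where the pod would run,
# for instance inside the kind node container:
#   missing_sysctls net.ipv4.conf.all.forwarding net.ipv4.fib_multipath_hash_policy \
#     net.ipv4.conf.all.rp_filter net.ipv4.conf.default.rp_filter \
#     net.ipv6.conf.all.forwarding net.ipv6.fib_multipath_hash_policy
```

An empty result means all keys exist, so the failure would have to come from something else.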

I strictly followed the steps in https://meridio.nordix.org/docs/demo/multus-kind-ovs/#installation but I still don't know why the proxy and frontend pods are not up and running. I need your help installing Meridio on a kind cluster. Are there any prerequisites for installing Meridio on a kind cluster?

Regards, Duong

dezenxi avatar Jun 13 '23 13:06 dezenxi

I haven't tried these instructions recently. You can try this to set up your KinD cluster (Spire, NSM, GW/TG):

make -s -C ./docs/demo/scripts/kind

And then you can install Meridio like this:

helm install meridio-crds https://artifactory.nordix.org/artifactory/cloud-native/meridio/Meridio-CRDs-v1.0.6.tgz --create-namespace
helm install meridio https://artifactory.nordix.org/artifactory/cloud-native/meridio/Meridio-v1.0.6.tgz --create-namespace

Multus can be installed in the same way:

kubectl apply -f https://raw.githubusercontent.com/Nordix/xcluster/master/ovl/multus/multus-install.yaml
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/daemonset-install.yaml
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/whereabouts.cni.cncf.io_ippools.yaml
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/whereabouts.cni.cncf.io_overlappingrangeipreservations.yaml

LionelJouin avatar Jun 13 '23 14:06 LionelJouin

Hi @LionelJouin ,

Thanks for your response. I followed your instructions, but it still failed at the step kubectl apply -f docs/demo/multus-meridio.yaml -n red

The proxy-load-balancer pods are still not ready:

Readiness probe failed: service unhealthy (responded with "NOT_SERVING")

The stateless-lb-frontend-attr-1 pods are stuck in the Init:CrashLoopBackOff state as before.

Regards, Duong

dezenxi avatar Jun 14 '23 04:06 dezenxi

In that case, the first things to check are whether the interface in the attractor instance exists and what the bird status is.

It seems similar to this tutorial case: https://meridio.nordix.org/training/troubleshooting-ctf/scenario-2

The logs of the frontend container could also be useful.
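These checks could be wrapped in a few shell helpers, roughly as sketched here. Several things are assumptions rather than facts from this thread: the container name `frontend`, the BIRD control socket path `/var/run/bird/bird.ctl`, and the presence of `ip` and `birdc` inside the container. The function names are hypothetical. The `exec` based helpers only work once the pod's containers have actually started.

```shell
# Namespace "red" comes from this thread; the container name and the
# BIRD socket path are assumptions and can be overridden via the environment.
NS="${NS:-red}"
FE_CONTAINER="${FE_CONTAINER:-frontend}"
BIRD_SOCKET="${BIRD_SOCKET:-/var/run/bird/bird.ctl}"

# Show the interfaces visible inside the frontend container of pod $1.
fe_interfaces() {
  kubectl exec -n "$NS" "$1" -c "$FE_CONTAINER" -- ip -brief addr
}

# Ask BIRD for the state of its protocols in pod $1.
fe_bird_status() {
  kubectl exec -n "$NS" "$1" -c "$FE_CONTAINER" -- birdc -s "$BIRD_SOCKET" show protocols all
}

# Print the frontend container logs of pod $1.
fe_logs() {
  kubectl logs -n "$NS" "$1" -c "$FE_CONTAINER"
}
```

For example, `fe_interfaces stateless-lb-frontend-attr-1-7b6d9d56f7-6gtd4` would list the interfaces; if the external interface configured in the attractor is absent from that list, BIRD has nothing to run its sessions on.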

LionelJouin avatar Jun 14 '23 05:06 LionelJouin

Hi @LionelJouin ,

Do I need to create the NetworkAttachmentDefinition meridio-nad as described in https://meridio.nordix.org/docs/demo/multus-kind-ovs/#installation ?

Regards, Duong

dezenxi avatar Jun 14 '23 13:06 dezenxi

If you use a network-attachment type of interface in your attractor, then yes, you will need to create a NAD (NetworkAttachmentDefinition)

LionelJouin avatar Jun 14 '23 14:06 LionelJouin

Hi @LionelJouin , I updated the Attractor in multus-meridio.yaml to use type nsm-vlan:

    name: ext-vlan0
    ipv4-prefix: 169.254.100.0/24
    ipv6-prefix: 100:100::/64
    type: nsm-vlan
    nsm-vlan:
      vlan-id: 100
      base-interface: eth0

Now, all pods are up and running

NAME                                            READY   STATUS    RESTARTS   AGE
ipam-trench-a-0                                 1/1     Running   0          36m
meridio-operator-598994c599-6mdjp               1/1     Running   0          46m
nse-vlan-attr-1-b4659bfb4-k672g                 1/1     Running   0          36m
nsp-trench-a-0                                  1/1     Running   0          36m
proxy-load-balancer-a1-lb5tx                    1/1     Running   0          36m
proxy-load-balancer-a1-n57vk                    1/1     Running   0          36m
stateless-lb-frontend-attr-1-5d4d576cc6-pmt78   3/3     Running   0          36m
stateless-lb-frontend-attr-1-5d4d576cc6-qqb97   3/3     Running   0          36m

I'll try to use network-attachment type later. Thank you very much.

Regards, Duong

dezenxi avatar Jun 15 '23 00:06 dezenxi

Hi @LionelJouin , Do you think the frontend should check the external interface before starting bird (BGP/BFD)? I can see that with a static config the frontend reaches the ready state even though the interface is missing. Explicitly checking for the interface and printing a message when it is missing, such as "Missing interface XYZ, check attractor config or CNI....", would save a lot of troubleshooting time.
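The kind of guard meant here could look roughly like the shell sketch below. It only illustrates the idea and is not how the Meridio frontend is implemented. The function name `require_interface`, the `SYSFS_NET` override and the `start_bird` placeholder are all hypothetical.

```shell
# Return 0 if the named interface exists in the current network namespace;
# otherwise print a hint to stderr and return 1.
# SYSFS_NET is a hypothetical override, mainly useful for trying the check
# outside a real pod; by default the Linux sysfs directory is used.
require_interface() {
  if [ -e "${SYSFS_NET:-/sys/class/net}/$1" ]; then
    return 0
  fi
  echo "Missing interface $1, check attractor config or CNI" >&2
  return 1
}

# Hypothetical startup guard: start BIRD only once the external interface exists.
#   require_interface ext-vlan0 && start_bird   # start_bird is a placeholder
```

Failing early with a named interface in the message points straight at the attractor or CNI configuration instead of leaving only a NOT_SERVING readiness probe to interpret.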

Regards, Duong

dezenxi avatar Jul 01 '23 10:07 dezenxi