Issue deploying `deploy/kne/external-multinode.yaml` on existing Kubernetes cluster

Open nleiva opened this issue 1 year ago • 3 comments

Hi,

I'm attempting to deploy deploy/kne/external-multinode.yaml on an existing Kubernetes cluster. I modified the Docker network name from multinode to docker0 to avoid manually creating a new network (network: docker0).

The MetalLB pods appear healthy.

$ kubectl get pods -n metallb-system
NAME                         READY   STATUS    RESTARTS   AGE
controller-fdfbfbc77-jnf4q   1/1     Running   0          4m40s
speaker-l8g66                1/1     Running   0          4m40s
speaker-lp7bv                1/1     Running   0          4m40s
speaker-vrj7z                1/1     Running   0          4m40s

However, I get the following error message, which I don't fully understand:

$ kne deploy deploy/kne/external-multinode.yaml
I0306 18:34:02.667047  465400 deploy.go:195] Deploying cluster...
I0306 18:34:02.667437  465400 deploy.go:381] Deploy is a no-op for the external cluster type
I0306 18:34:02.667458  465400 deploy.go:199] Cluster deployed
...
I0306 18:34:07.049975  465400 deploy.go:1348] Waiting on deployment "metallb-system" to be healthy
Error: failed to deploy ingress: metallb not healthy: invalid object type: *v1.Deployment

Is this the deployment it is complaining about?

$ kubectl get deploy -n metallb-system
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
controller   1/1     1            1           28m

Thanks in advance for your help!

nleiva avatar Mar 06 '25 18:03 nleiva

I can't help with that specific issue, but FWIW, on my physical cluster I set up MetalLB independently and tell KNE not to deploy it (I set the manifest to "").

chrisy avatar Mar 06 '25 19:03 chrisy

Long time no see Chris! I'll give that a go, thank you!

nleiva avatar Mar 06 '25 19:03 nleiva

A small update in case someone runs into this. TL;DR I'm probably running a K8s version that's too new.

The error is triggered in deploy/deploy.go:

func deploymentHealthy(ctx context.Context, c kubernetes.Interface, name string) error {
	log.Infof("Waiting on deployment %q to be healthy", name)
	w, err := c.AppsV1().Deployments(name).Watch(ctx, metav1.ListOptions{})
	// ...
	ch := w.ResultChan()
	for {
		select {
		//...
		case e, ok := <-ch:
			if !ok {
				return fmt.Errorf("watch channel closed before %q healthy", name)
			}

			d, ok := e.Object.(*appsv1.Deployment)
			
			if !ok {
				return fmt.Errorf("invalid object type: %T", d)
			}

I added this before the return statement:

status, ok := e.Object.(*metav1.Status)
if ok {
	fmt.Printf("EVENT -> %v, %v\n", status.Status, status.Message)
}

Got: EVENT -> Failure, Timeout: Too large resource version: 2906652, current: 2903959

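From what I can gather (with some help from ChatGPT), "Too large resource version" means a client (like a kubelet or operator) is asking the Kubernetes API for a resource version that is newer than the one the API server's watch cache currently holds. That matches the message: requested 2906652, current 2903959. The request times out waiting for the cache to catch up, and a client that keeps retrying with the same resource version can get stuck in an error loop.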

nleiva avatar Mar 12 '25 15:03 nleiva