Error: failed pre-install:timed out waiting for condition

Open skhota opened this issue 3 years ago • 15 comments

What happened?

A new install of vcluster fails with a timeout. Please see the attached logs. We have a successful install on the same VM (RHEL 7 host), but something changed last month and we can no longer create a new vcluster.

What did you expect to happen?

A new vcluster should have been created.

How can we reproduce it (as minimally and precisely as possible)?

Creating a new vcluster times out.

Anything else we need to know?

vcluster-log.pdf ssg-log.pdf

Host cluster Kubernetes version

$ kubectl version
1.21.1

Host cluster Kubernetes distribution

1.21.x

vcluster version

$ vcluster --version
vcluster 0.7.0 and 0.10.2

vcluster Kubernetes distribution (k3s (default), k8s, k0s)

k8s

OS and Arch

OS/Arch: RHEL 7

skhota avatar Jul 20 '22 14:07 skhota

Hello @skhota, I am not sure why Helm reported a pre-install error. It seems the pre-install pod succeeded. Can you check whether the vc10-1-job Kubernetes Job reported a successful status?

Your logs from the etcd pod show an NFS error. Can you look into the status of your cluster/host storage?
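
For readers following along, a minimal sketch of how these two checks might be run with kubectl. The namespace `vc-test` is an assumption borrowed from later in this thread, and `check_job` and `check_storage` are hypothetical helper names, not part of vcluster.

```shell
#!/bin/sh
# Assumption: the vcluster lives in namespace "vc-test" (a name that appears
# later in this thread); override NAMESPACE if yours differs.
NAMESPACE="${NAMESPACE:-vc-test}"

# Hypothetical helper: show whether the pre-install Job completed.
# Prints the Job's status.succeeded count (number of pods that finished successfully).
check_job() {
  kubectl get job "$1" --namespace "$NAMESPACE" -o jsonpath='{.status.succeeded}'
}

# Hypothetical helper: list PersistentVolumeClaims and recent events, which is
# where storage mount or provisioning problems (NFS included) tend to surface.
check_storage() {
  kubectl get pvc --namespace "$NAMESPACE"
  kubectl get events --namespace "$NAMESPACE" --sort-by=.lastTimestamp
}
```

Calling `check_job vc10-1-job` and then `check_storage` would run the checks against the current kubectl context.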

matskiv avatar Jul 20 '22 14:07 matskiv

Can you check if the vc10-1-job Kubernetes Job reported a successful status?

Answer: The status of the job is in vcluster-log.pdf, attached to the ticket.

Thanks, Susanta Hota.



skhota avatar Jul 20 '22 15:07 skhota

If you can reproduce the pre-install error, please try installing vcluster with Helm (see the "helm" tab in this doc) and run helm install with the --debug flag. If the issue reproduces, please attach the logs as a text file or a link to a gist.

matskiv avatar Jul 20 '22 17:07 matskiv

Please see the attached Helm debug log: Helm-debug-log.pdf

skhota avatar Jul 21 '22 14:07 skhota

Unfortunately, you cannot run helm by copying the command that was printed by vcluster. Please follow the instructions on the helm tab of this docs page to run it correctly: https://www.vcluster.com/docs/getting-started/deployment

matskiv avatar Jul 21 '22 18:07 matskiv

I am running into the error below when trying to run the command. Can you help? I'm not sure what I'm doing wrong:

helm upgrade --install skh-1 \
--values vcluster1/vcluster/charts/k8s/vcluster.yaml
--repo vcluster1/vcluster/charts/k8s/ \
--namespace vc-test
--repository-config=''

Release "skh-1" does not exist. Installing it now.
Error: failed to download "" (hint: running helm repo update may help)
zsh: command not found: --values
zsh: command not found: --repo
zsh: command not found: --namespace
zsh: command not found: --repository-config=

skhota avatar Jul 22 '22 13:07 skhota

I also tried this command:

helm upgrade --install skh-1 --values vcluster1/vcluster/charts/k8s/vcluster.yaml --repo vcluster1/vcluster/charts/k8s/ --namespace vc-test --repository-config=''

Error: "helm upgrade" requires 2 arguments

I'm not sure what's wrong. Can you please help and send me the right command?

skhota avatar Jul 22 '22 13:07 skhota

@skhota try running:

helm upgrade skh-1 vcluster --debug --install --create-namespace --values vcluster1/vcluster/charts/k8s/vcluster.yaml --repo https://charts.loft.sh --namespace vc-test --repository-config=''

Also make sure there is no folder or file called helm in the directory where you run this command.
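
The same command may be easier to read split across lines. The sketch below reuses the release name, values path and namespace quoted in this thread and wraps them in a hypothetical function, `install_vcluster`. Two points it tries to illustrate: `helm upgrade` takes two positional arguments, the release name and the chart, which is consistent with the "requires 2 arguments" error above when only `skh-1` was passed; and every continued line must end with a backslash as its very last character, otherwise the shell runs the next line as a separate command, which would likely explain the `zsh: command not found: --values` errors.

```shell
#!/bin/sh
# Sketch only: release name, values path and namespace are the ones quoted in
# this thread; adjust them for your environment.
install_vcluster() {
  # Positional arguments: release name (skh-1), then chart name (vcluster).
  # Each flag line ends with a backslash and no trailing whitespace.
  helm upgrade skh-1 vcluster \
    --install \
    --create-namespace \
    --debug \
    --repo https://charts.loft.sh \
    --namespace vc-test \
    --values vcluster1/vcluster/charts/k8s/vcluster.yaml \
    --repository-config=''
}
```

Running `install_vcluster` from a directory that does not contain a file or folder named helm should then behave like the one-line command above.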

FabianKramm avatar Jul 22 '22 16:07 FabianKramm

Please see the attached Helm debug logs: Helm-debug-logs.pdf

skhota avatar Jul 26 '22 15:07 skhota

Hello, can someone please help me with this issue?

skhota avatar Jul 28 '22 13:07 skhota

@skhota Hmm, the logs look good. It is odd that the helm command times out even though the job completed. What Helm version do you have installed? You may need to upgrade Helm itself. Also, what happens if you increase the timeout with --timeout 10m? Besides that, have you tried another distribution such as k3s (just run vcluster create my-vcluster), or is there a specific reason you want to use k8s?
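
As an illustration of the first two suggestions, here is a sketch that prints the Helm client version and reruns the install with a longer timeout. The function names `show_helm_version` and `install_with_timeout` are hypothetical; the release name, chart, repo, namespace and values path are the ones used earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the installed Helm client version.
show_helm_version() {
  helm version --short
}

# Hypothetical helper: rerun the install with a longer wait.
# The timeout defaults to 10m as suggested above; pass another duration as $1.
install_with_timeout() {
  helm upgrade skh-1 vcluster \
    --install \
    --create-namespace \
    --debug \
    --timeout "${1:-10m}" \
    --repo https://charts.loft.sh \
    --namespace vc-test \
    --values vcluster1/vcluster/charts/k8s/vcluster.yaml \
    --repository-config=''
}
```

For example, `show_helm_version` followed by `install_with_timeout 15m` would try a 15 minute timeout instead of the default.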

FabianKramm avatar Jul 28 '22 17:07 FabianKramm

I have successfully installed vcluster in the same environment in the past, so I am not sure why it is giving me this issue now. k8s is the chosen distribution for our production environment. I can try upgrading Helm and test the install as you asked. I will also try increasing the timeout to 10m and see what happens. Is there any other testing we can do to debug this? What should happen in the next step after the job part runs successfully?

skhota avatar Jul 28 '22 19:07 skhota

Hi, I tried a different Helm version, but the issue is unchanged. Can someone help?

skhota avatar Jul 29 '22 14:07 skhota

Can anyone help me with this, please?

skhota avatar Aug 01 '22 14:08 skhota

@skhota If the job runs through correctly, Helm should deploy vcluster itself. Can you deploy vcluster correctly into a different cluster, or does it fail in all of them?
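
One way to see how far the install got after the job, sketched under the assumption that the release is `skh-1` in namespace `vc-test` as earlier in this thread. `inspect_release` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumptions: release "skh-1" and namespace "vc-test" as used earlier in this thread.
RELEASE="${RELEASE:-skh-1}"
NAMESPACE="${NAMESPACE:-vc-test}"

# Hypothetical helper: show what Helm recorded for the release and which
# pods exist in the namespace after the pre-install Job has finished.
inspect_release() {
  # Release state as Helm sees it (for example deployed or failed).
  helm status "$RELEASE" --namespace "$NAMESPACE"
  # List releases in every state, not only deployed ones.
  helm list --namespace "$NAMESPACE" --all
  # Pods created for the vcluster, if the install got that far.
  kubectl get pods --namespace "$NAMESPACE"
}
```

Switching the kubectl context first (`kubectl config use-context <name>`) would point the same checks at a different host cluster for comparison.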

FabianKramm avatar Aug 01 '22 15:08 FabianKramm

This issue has had no reply for over 2 months. Closing.

matskiv avatar Oct 14 '22 12:10 matskiv