vcluster
Error: failed pre-install:timed out waiting for condition
What happened?
A new install of vcluster fails with a timeout. Please see the attached logs. We have a successful install on the same VM (RHEL 7 host), but something changed last month and we can no longer create a new vcluster.
What did you expect to happen?
A new vcluster should have been created.
How can we reproduce it (as minimally and precisely as possible)?
Tried to create a new Vcluster but it times out.
Anything else we need to know?
### Host cluster Kubernetes version
<details>
$ kubectl version
1.21.1
</details>
### Host cluster Kubernetes distribution
<details>
1.21.x
</details>
### vcluster version
<details>
```console
$ vcluster --version
Vcluster 0.7.0 and 0.10.2
```
</details>
### vcluster Kubernetes distribution (k3s (default), k8s, k0s)
<details>
K8s
</details>
### OS and Arch
<details>
OS and Arch: RHEL 7
</details>
Hello @skhota
I am not sure why Helm reported a pre-install error. It seems the pre-install pod succeeded. Can you check whether the vc10-1-job Kubernetes Job reported a successful status?
Your logs from the etcd pod show an NFS error. Can you look into the status of your cluster/host storage?
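One way to run the two checks suggested above is sketched below. The Job name `vc10-1-job` comes from this thread; the function name `check_preinstall_job` and the namespace argument are placeholders for illustration, and nothing here is confirmed by the attached logs.

```shell
# Sketch: report whether a pre-install Job succeeded, then list storage-related
# objects in the same namespace.
# Usage: check_preinstall_job <namespace> <job-name>
check_preinstall_job() {
  ns="$1"
  job="$2"
  # .status.succeeded holds the number of pods of the Job that completed successfully.
  succeeded="$(kubectl get job "$job" --namespace "$ns" -o jsonpath='{.status.succeeded}')"
  if [ "${succeeded:-0}" -ge 1 ]; then
    echo "job $job: succeeded"
  else
    echo "job $job: not succeeded"
  fi
  # PVC status and recent events are where NFS or other volume problems tend to show up.
  kubectl get pvc --namespace "$ns"
  kubectl get events --namespace "$ns" --sort-by=.lastTimestamp
}
```

It could be called as `check_preinstall_job <namespace> vc10-1-job`, with the namespace the vcluster was installed into.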
Can you check if the vc10-1-job Kubernetes Job reported a successful status?
Ans: The status of the job is in vcluster-log.pdf, attached to the ticket.
Thanks, Susanta Hota.
From: Oleg Matskiv. Sent: Wednesday, July 20, 2022 9:53 AM. To: loft-sh/vcluster. Subject: Re: [loft-sh/vcluster] Error: failed pre-install:timed out waiting for condition (Issue #613)
If you can reproduce the pre-install error, please try installing vcluster with helm (see the "helm" tab in this doc) and run helm install with the --debug flag. If the issue reproduces, please attach the logs as a text file or a link to a gist.
Please see the attached Helm debug log: Helm-debug-log.pdf
Unfortunately, you cannot run helm by copying the command that was printed by vcluster. Please follow the instructions on the helm tab of this docs page to run it correctly: https://www.vcluster.com/docs/getting-started/deployment
I am running into the error below when trying to run the command. Can you help? I'm not sure what I'm doing wrong:
helm upgrade --install skh-1 \
--values vcluster1/vcluster/charts/k8s/vcluster.yaml
--repo vcluster1/vcluster/charts/k8s/ \
--namespace vc-test
--repository-config=''
Release "skh-1" does not exist. Installing it now.
Error: failed to download "" (hint: running helm repo update may help)
zsh: command not found: --values
zsh: command not found: --repo
zsh: command not found: --namespace
zsh: command not found: --repository-config=
Also tried this command:
helm upgrade --install skh-1 --values vcluster1/vcluster/charts/k8s/vcluster.yaml --repo vcluster1/vcluster/charts/k8s/ --namespace vc-test --repository-config=''
Error: "helm upgrade" requires 2 arguments
I'm not sure what's wrong. Can you please help and send me the right command?
@skhota try running:
helm upgrade skh-1 vcluster --debug --install --create-namespace --values vcluster1/vcluster/charts/k8s/vcluster.yaml --repo https://charts.loft.sh --namespace vc-test --repository-config=''
Also make sure there is no folder or file called helm inside the directory where you run this command.
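For context on the two errors reported earlier: `helm upgrade` takes two positional arguments, a release name and a chart, and the single-line attempt passed only `skh-1`. The `command not found` messages are consistent with the multi-line attempt being split into separate commands, which is what happens when a line does not end in a backslash (or the backslash is followed by whitespace). A minimal illustration, using a hypothetical `count_args` helper in place of helm:

```shell
# Hypothetical helper (not part of helm): prints how many arguments it received.
count_args() {
  echo "$# arguments"
}

# Every line except the last ends with a backslash as its very last character,
# so the shell reads all four lines as a single command.
count_args skh-1 vcluster \
  --namespace vc-test \
  --create-namespace \
  --debug
# prints "6 arguments"
```

If a backslash were missing, the shell would end the command at that line and try to execute the next line, for example `--namespace vc-test`, as a command of its own.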
Please see the attached Helm debug logs: Helm-debug-logs.pdf
Hello, can someone please help me with this issue?
@skhota hmm, the logs look good. It is weird that the helm command times out even though the job is completed. What helm version do you have installed? You may need to upgrade helm itself. Also, what happens if you increase the timeout with --timeout 10m? Besides that, have you tried another distribution like k3s (just run vcluster create my-vcluster), or is there a specific reason you want to use k8s?
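A sketch of those two suggestions, combining the install command posted earlier in this thread with the `--timeout` flag. The wrapper name `retry_install_with_timeout` is made up for illustration; the release name, values path, and namespace are the ones used above and may need adjusting.

```shell
# Sketch: print the Helm client version, then retry the install with a longer timeout.
# Usage: retry_install_with_timeout <timeout>   (for example: 10m)
retry_install_with_timeout() {
  timeout="$1"
  # Shows which Helm client version is in use.
  helm version --short
  # Same command as suggested earlier in the thread, plus an explicit timeout.
  helm upgrade skh-1 vcluster --debug --install --create-namespace \
    --values vcluster1/vcluster/charts/k8s/vcluster.yaml \
    --repo https://charts.loft.sh \
    --namespace vc-test \
    --repository-config='' \
    --timeout "$timeout"
}
```

Running `retry_install_with_timeout 10m` would give the pre-install hook up to ten minutes before helm reports a timeout.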
I have successfully installed vcluster in the same environment in the past, so I'm not sure why it is giving me this issue now. K8s is the chosen distribution for our production. I can try upgrading Helm and test the install as you asked. I will also try increasing the timeout to 10m and see what happens. Is there any other testing we can do to debug this? What should happen in the next step after the job part runs successfully?
Hi, I tried a different Helm version, but there was no change: same issue. Can someone help?
Can anyone help me with this, please?
@skhota if the job runs through correctly, helm should deploy vcluster itself. Can you deploy vcluster correctly into a different cluster, or does it fail in all of them?
This issue has had no reply for over 2 months. Closing.