vHive icon indicating copy to clipboard operation
vHive copied to clipboard

Fix one node install taint issue

Open Joffref opened this issue 3 years ago • 2 comments
trafficstars

It seems that the following taint is applied to the single node node-role.kubernetes.io/control-plane. It causes deployment suspension due to pod affinity.

Image for reference

kubectl get nodes image kubectl describe node vhive-sched-bench | grep Taints image

Joffref avatar Oct 04 '22 09:10 Joffref

hi @Joffref, thank you for reporting the problem. could you please elaborate on what you tried to deploy and why it is assigned the control-plane role?

ustiugov avatar Oct 04 '22 13:10 ustiugov

Hi @ustiugov ! I've follow the getting started documentation on how to deploy vHive on a single node and I clearly don't know why this label is assigned to the node.

Joffref avatar Oct 05 '22 17:10 Joffref

@Joffref is this problem still relevant? if so, please provide the exact steps how we can reproduce it.

ustiugov avatar Nov 03 '22 08:11 ustiugov

This issue clearly depends on #606. Indeed, since few versions master has been renamed to control-plane in kubernetes thus as said in #606 kubeadm download is not handled by the install scripted forcing the user to manually install kubeadm. Causing this issue as the version is not the correct one. I'm closing this pull request since this just a result of a root cause.

Joffref avatar Nov 05 '22 16:11 Joffref

#606 is a sporadic bug, hence it cannot be the root cause of your problem. otherwise please elaborate.

@cvetkovic can you help make sense out of this PR/issue? I am still missing something here

ustiugov avatar Nov 05 '22 17:11 ustiugov

During the installation, kubeam failed to install as described in the issue I mentioned. So, I've downloaded kubeadm by my own, causing a version gap thus a taint problem.

I've infer that kubeam has to be installed before running the getting started, which is not the case 😅

Joffref avatar Nov 05 '22 20:11 Joffref

#606 is a sporadic bug, hence it cannot be the root cause of your problem. otherwise please elaborate.

@cvetkovic can you help make sense out of this PR/issue? I am still missing something here

I suspect the mirror from which the kubeadm installation is pulled does not contain the version we used. This is something I have experienced in practice on Cloudlab. One geolocation has it as it uses a different mirror, the other does not.

@Joffref Can you manually check if the package manager can fetch kubeadm and kubectl, as in the https://github.com/vhive-serverless/vHive/blob/main/scripts/install_stock.sh#L52.

Also, this PR should not be merged, as we still do not support Kubernetes v1.25. We currently use v1.23.

@ustiugov Maybe installing Kubernetes without package managers would prevent occurrences of this issue in the future.

cvetkovic avatar Nov 08 '22 08:11 cvetkovic

@cvetkovic Unfortunately, I'm no more using this OS thus I can't test.

Joffref avatar Nov 14 '22 13:11 Joffref

I am closing this issue. Please re-open if it becomes relevant.

ustiugov avatar Nov 15 '22 10:11 ustiugov