Kismatic confirms completed smoke testing before cluster is clean.

Open swade1987 opened this issue 6 years ago • 15 comments

BUG REPORT:

What happened:

Running Smoke Test==================================================================
Smoke Test Master Node                                                          [OK]
Smoke Test Master Node                                                          [OK]

The cluster was installed successfully!

- To use the generated kubeconfig file with kubectl:
    * use "./kubectl --kubeconfig generated/kubeconfig"
    * or copy the config file "cp generated/kubeconfig ~/.kube/config"
- To view the Kubernetes dashboard: "./kismatic dashboard"
- To SSH into a cluster node: "./kismatic ssh etcd|master|worker|storage|$node.host"

root@bootstrap:~# kubectl get pods
NAME                                READY     STATUS        RESTARTS   AGE
kuberang-busybox-3955436121-c694z   1/1       Terminating   0          37s

What you expected to happen:

The post installation checks shouldn't pass until everything has been cleaned up.

swade1987 avatar Sep 25 '17 08:09 swade1987

Is there a reason for waiting until the smoke test resources are cleaned up? Adding that will only increase the installation time, and I am not sure I see the benefit.

alexbrand avatar Sep 25 '17 10:09 alexbrand

Yes. If that pod doesn't terminate, I have something running in my cluster which shouldn't be there. Therefore, in my opinion, the install should fail.

swade1987 avatar Sep 25 '17 11:09 swade1987

This happened again today ...

root@bootstrap:~# kubectl get pods
NAME                                READY     STATUS        RESTARTS   AGE
kuberang-busybox-3955436121-3rjtk   1/1       Terminating   0          22s
kuberang-busybox-3955436121-xs8nv   0/1       Terminating   0          45s
kuberang-nginx-701780030-f0wt8      0/1       Terminating   0          22s
kuberang-nginx-701780030-nzhbr      0/1       Terminating   0          22s

swade1987 avatar Sep 30 '17 13:09 swade1987

This is expected to happen as we are currently not waiting until the deployments are completely gone.

I would argue we shouldn't wait, given that we are not smoke testing the ability to delete pods, and that adding the wait will only increase the installation time without much benefit.

alexbrand avatar Oct 02 '17 11:10 alexbrand

We should wait. The testing cycle hasn't completed until it has tidied itself up. We want the cluster to be in a clean state when the execution of Kismatic has completed.
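
A minimal sketch of what such a wait could look like, assuming the installer has some way to list the remaining pods. The `list_pods` callable, the `kuberang-` prefix filter (taken from the pod names in the output above), and the timeout values are illustrative assumptions, not kismatic or kuberang code.

```python
import time
from typing import Callable, List


def wait_until_gone(
    list_pods: Callable[[], List[str]],
    prefix: str = "kuberang-",
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until no pod name starts with `prefix`.

    Returns True once the smoke-test pods are gone, False on timeout.
    `list_pods` is a hypothetical hook; it could wrap a `kubectl get pods` call.
    """
    deadline = clock() + timeout_s
    while True:
        leftover = [name for name in list_pods() if name.startswith(prefix)]
        if not leftover:
            return True  # cluster is clean
        if clock() >= deadline:
            return False  # pods still terminating after the timeout
        sleep(interval_s)
```

With something like this, the installer would print the success message only when the function returns `True`, and the timeout bounds how much extra install time the wait can add.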

swade1987 avatar Oct 09 '17 16:10 swade1987

Surely KET can do this?

swade1987 avatar Nov 30 '17 17:11 swade1987

I'm with alex on this one. What's the real benefit?

based64god avatar Nov 30 '17 17:11 based64god

Because KET should provide the user a clean, ready to use cluster.

If kuberang is still running, it's not clean; it's still being provisioned or in a cleanup mode.

swade1987 avatar Nov 30 '17 18:11 swade1987

I don't think this necessarily affects that though? This change would add another ~1 minute to the install time. Having kuberang still running doesn't affect your ability to deploy, or does it?

based64god avatar Nov 30 '17 18:11 based64god

I think a better solution would be to run the tests in a new ns and clean up the whole ns after a test.

dkoshkin avatar Nov 30 '17 18:11 dkoshkin

@emmetthitz it's not affecting it, but I would like something to respond back with "ready" when it's actually ready and not still in a "testing" or "cleanup" state.

@dkoshkin that'd work.

swade1987 avatar Nov 30 '17 18:11 swade1987

I see what you mean. Alright, yeah. Makes sense.

based64god avatar Nov 30 '17 18:11 based64god

The namespace idea is probably the best, but I would avoid waiting until all resources are cleaned up. All that is going to do is slow down the overall installation time with not much benefit.

alexbrand avatar Nov 30 '17 18:11 alexbrand

@alexbrand yeah, I'm still with you on this one, but I think @swade1987 makes a valid point of this potentially being misleading. It would be a change in the choice of words only, maybe to something like "Deployable"?

based64god avatar Nov 30 '17 18:11 based64god

If we go the namespace route, we need to make sure the resources are removed before removing the ns... there seems to be an open issue for this: https://github.com/kubernetes/kubernetes/issues/36891. So we would be waiting additional time anyway, so that we do not end up with orphaned resources.

From a user experience perspective, I'm happier waiting another minute to install than running get pods and seeing resources deployed.
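
The ordering described above (remove the resources, wait, then remove the namespace) could be sketched as a list of kubectl invocations. This is only an illustration under assumptions: the namespace name is hypothetical, the smoke test is assumed to create only deployments, and the `kubectl wait` step relies on a subcommand from newer kubectl releases that may not match the version in use here.

```python
from typing import List


def cleanup_plan(namespace: str, timeout_s: int = 60) -> List[List[str]]:
    """Return kubectl invocations in the order they would be run.

    Nothing is executed here; the caller decides how to run each command.
    """
    if not namespace or namespace in ("default", "kube-system"):
        raise ValueError("refusing to clean up a shared namespace")
    return [
        # 1. Remove the smoke-test workloads first.
        ["kubectl", "delete", "deployments", "--all", "--namespace", namespace],
        # 2. Block until their pods are gone (assumes a kubectl with `wait`;
        #    it may report an error if no pods match, so treat that as done).
        ["kubectl", "wait", "--for=delete", "pod", "--all",
         "--namespace", namespace, f"--timeout={timeout_s}s"],
        # 3. Only then remove the namespace itself.
        ["kubectl", "delete", "namespace", namespace],
    ]
```

Keeping the plan as data makes the ordering easy to check, and the guard against shared namespaces avoids deleting anything that was not created by the smoke test.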

jaycoon avatar Nov 30 '17 19:11 jaycoon