
Wait for cluster resources to be ready before starting tests

Open obeleh opened this issue 3 years ago • 4 comments

I'm going to try to write this request in an open fashion. I don't want to force one improvement when it may be better to look at the bigger picture. I want to start with an example of the "why" and then offer a few suggestions on the "how".

the why

In my case I was using a kuttl test suite that started a Kind cluster. It allows us to provide manifests to apply to the cluster, but if the cluster is not yet in a fully ready state, the tests become flaky. In my case I needed DNS to be available. Tests were failing because the CoreDNS container wasn't ready yet. The operator I'm writing and testing with kuttl fails to connect to postgres if pod DNS is not ready yet.

the how

My original answer to this was going to be that kuttl should wait for DNS to be available. But perhaps it would be better to be able to declare which resources kuttl should wait for before it starts running the tests, especially if the user-provided manifests contain resources that take a moment to become ready, like Deployments. So perhaps it would also be better if you could provide a readiness test that checks whether the supplied manifests are ready to be used.

Here's how I've fixed it for now:

apiVersion: kuttl.dev/v1beta1
kind: TestSuite
name: Postgres
testDirs:
- ./tests/postgres/
manifestDirs:
- ./tests/postgres-manifests/
kindContainers:
- library/postgres:latest
commands:
# wait for DNS to be available to avoid flaky tests
- command: kubectl wait --timeout=2m --for=condition=available deployment coredns -n kube-system
- command: kubectl get deployment -A
# deploy our operator
- command: make deploy-kind
# wait a bunch
- command: kubectl get deployment -A
- command: kubectl wait --timeout=3m --for=condition=available deployment postgres-db-server -n postgres
- command: kubectl get deployment -A
- command: kubectl wait --timeout=1m --for=condition=available deployment db-operator-controller-manager -n db-operator-system
- command: kubectl get deployment -A
# wait again, postgres might be restarting due to its initialisation cycle
- command: kubectl wait --timeout=30s --for=condition=available deployment postgres-db-server -n postgres
- command: kubectl get deployment -A
# we're sharing the database, in order to have predictable state we don't do parallel tests
parallel: 1
artifactsDir: ./tests/outputs
kindNodeCache: true

A possible solution:

apiVersion: kuttl.dev/v1beta1
kind: TestSuite
name: Postgres
testDirs:
- ./tests/postgres/
manifestDirs:
# option1:
- path: ./tests/postgres-manifests/
  wait_command: kubectl wait --timeout=3m --for=condition=available deployment postgres-db-server -n postgres
# option2:
- path: 
  test_assert:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: postgres-db-server
      namespace: postgres
    status:
      readyReplicas: 1
      
clusterAssertions/clusterWaits?:
# examples like above, assert or wait for DNS
-   apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: coredns
      namespace: kube-system
    status:
      readyReplicas: 1

I hope this helps. I would be interested in helping out, but preferably not alone.

obeleh, May 16 '21 15:05

Thanks so much for taking the time to write this issue up. I think a lot of users have similar issues. I think having Asserts added as part of a BeforeSuite (a hook that runs before all the tests start) would be a very powerful addition to building out tests. To me it signifies, "before this test even starts, assert this state". Is this the use case you're looking for, or were you looking for something else?
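For comparison, a per-test assert in kuttl today might look like the sketch below. It is a rough illustration, not a suite-level hook: the file name `00-assert.yaml` and its placement as the first step of a test case under `./tests/postgres/` are assumptions, and the Deployment is the one named in the proposal above. The `timeout` value (in seconds) is likewise only illustrative.

```yaml
# Hypothetical file: ./tests/postgres/<test-case>/00-assert.yaml
# Step-level settings for this assert; the timeout value is an assumption.
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
timeout: 180
---
# Partial object: the step passes once the live Deployment matches these fields.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-db-server
  namespace: postgres
status:
  readyReplicas: 1
```

An assert like this only gates the test case it belongs to, and it runs after the suite-level manifests and commands, which is why a suite-level equivalent is being discussed here.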

faiq, May 17 '21 19:05

I think a BeforeSuite would help, yes. Would BeforeSuite happen before or after applying manifests? I think you might want something along the lines of:

  • start cluster
  • assert cluster state (e.g. DNS)
  • apply CRDs
  • apply manifests
  • assert pre-test status
  • run suite

I think you don't want to start applying manifests before the cluster is ready.
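Until such a hook exists, the ordering above can be approximated with the existing `commands` field. The sketch below assumes that kuttl runs `commands` in the order listed and applies `manifestDirs` before any of them, so the manifests are moved out of `manifestDirs` and applied with `kubectl apply` after the DNS wait. It also assumes `make deploy-kind` installs the operator's CRDs; all paths and resource names are taken from the suite shown earlier in this issue.

```yaml
apiVersion: kuttl.dev/v1beta1
kind: TestSuite
testDirs:
- ./tests/postgres/
kindContainers:
- library/postgres:latest
commands:
# 1. assert cluster state: wait for DNS before anything else is applied
- command: kubectl wait --timeout=2m --for=condition=available deployment coredns -n kube-system
# 2. apply the manifests only now (replaces the manifestDirs entry)
- command: kubectl apply -f ./tests/postgres-manifests/
# 3. deploy the operator (assumed to include its CRDs)
- command: make deploy-kind
# 4. assert pre-test status
- command: kubectl wait --timeout=3m --for=condition=available deployment postgres-db-server -n postgres
- command: kubectl wait --timeout=1m --for=condition=available deployment db-operator-controller-manager -n db-operator-system
# shared database, so tests are not run in parallel
parallel: 1
```

The trade-off is that the manifests are no longer managed by kuttl itself, which is part of what a BeforeSuite-style hook or a per-manifest wait would avoid.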

obeleh, May 18 '21 04:05

I've been wanting a Before and After Suite for a long time... love this issue and report! We will target a release next month.

kensipe, Apr 14 '22 13:04

only took a year :) what a challenging year but we are back!

kensipe, Apr 14 '22 13:04