cilium-cli icon indicating copy to clipboard operation
cilium-cli copied to clipboard

Connectivity test concurrent run

Open viktor-kurchenko opened this issue 1 year ago • 9 comments

This PR introduces connectivity test concurrent run.

The new test-concurrency input param can be used to specify a count of K8S test namespaces. So, most connectivity tests will run concurrently across test namespaces.

The new parameter is marked hidden because of the following reasons:

  1. test and stabilisation internally (e.g.: in CI)
  2. report output must be improved to avoid mess (especially for the debug and verbose modes). I want to address this in a separate PR.

Command run example: cilium connectivity test --test-concurrency=5

Local test in a 4 node kind cluster result:

viktor-kurchenko avatar Apr 22 '24 18:04 viktor-kurchenko

@viktor-kurchenko could you take a look at kind test failures? from-cidr-host-netns/from-cidr-to-pod/host-netns-to-pod seems to be failing consistently.

michi-covalent avatar Apr 22 '24 21:04 michi-covalent

let's add a matrix for the kind job and run with both --test-concurrency=1 and --test-concurrency=5: https://github.com/cilium/cilium-cli/blob/main/.github/workflows/kind.yaml#L24

I've done some testing in CI and if --test-concurrency=3 kind cluster needs 5 Cilium workers (more resources needed). Maybe let's start with 3?

viktor-kurchenko avatar May 02 '24 13:05 viktor-kurchenko

Maybe let's start with 3?

yeah any number greater than 1 is fine 🚀🙏

michi-covalent avatar May 02 '24 14:05 michi-covalent

@viktor-kurchenko could you take a look at kind test failures? from-cidr-host-netns/from-cidr-to-pod/host-netns-to-pod seems to be failing consistently.

Fixed!

viktor-kurchenko avatar May 07 '24 05:05 viktor-kurchenko

yeah any number greater than 1 is fine 🚀🙏

Done. Also, I've added this ugly thing due to the annoying error Have you seen it before?

viktor-kurchenko avatar May 07 '24 05:05 viktor-kurchenko

Also, I've added this ugly thing due to the annoying error Have you seen it before?

no i haven't seen it before, we probably shouldn't do || true to ignore the error. it might be better to increase the number of workers in a separate pull request so that this pull request doesn't keep growing 💭

michi-covalent avatar May 07 '24 06:05 michi-covalent

no i haven't seen it before, we probably shouldn't do || true to ignore the error. it might be better to increase the number of workers in a separate pull request so that this pull request doesn't keep growing 💭

So, should I revert the CI changes for now?

viktor-kurchenko avatar May 07 '24 06:05 viktor-kurchenko

So, should I revert the CI changes for now?

you can keep this pr as is for now. just open a separate pr to increase the number of workers, and we can figure out what's going on there.

michi-covalent avatar May 07 '24 14:05 michi-covalent

you can keep this pr as is for now. just open a separate pr to increase the number of workers, and we can figure out what's going on there.

But without node increase it won't be possible to add CI matrix for concurrent run. Even for the --test-concurrency=2 it requires at least 3 cilium nodes.

viktor-kurchenko avatar May 07 '24 19:05 viktor-kurchenko