Improve e2e tests to collect context information
What is the problem you're trying to solve
Currently, if a test fails, we don't have any information about why or how the failure happened. We only see the error messages in the Ginkgo output.
This makes debugging flaky tests like the JobSeq really hard: https://github.com/volcano-sh/volcano/issues/4732
The only output we have is a kind export if the test failed (the kind export in the generate-log function), which contains the container logs collected after the suite has finished.
Describe the solution you'd like
It would be good if we were able to dump the test context into the artifacts path. The directory name could be the namespace's name.
By test context I mean:
- pods
- podgroups
- queues
- priorityClasses
- VCJobs
- VCronJobs
- Standard Kubernetes jobs/cronjobs/deployments/statefulsets
It can be a separate TestContext function called from a JustAfterEach block that checks CurrentSpecReport().Failed().
Example from Ginkgo's documentation:
https://onsi.github.io/ginkgo/#separating-diagnostics-collection-and-teardown-justaftereach
/good-first-issue /area test
@hajnalmt: This request has been marked as suitable for new contributors.
Please ensure the request meets the requirements listed here.
If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.
/assign
Hi @hajnalmt,
I'm planning to solve this issue #4764.
My approach would be:
- Add a DumpTestContext() function in test/e2e/util/util.go to collect and dump Kubernetes resources (Pods, PodGroups, Queues, VCJobs, VCronJobs, K8s Jobs/CronJobs/Deployments/StatefulSets) for a namespace.
- Use Ginkgo's JustAfterEach with CurrentSpecReport().Failed() to trigger dumping only on test failures.
- Save YAML files to ARTIFACTS_PATH/{namespace}/ for easy inspection.
This will help us capture the cluster state when tests fail, making debugging easier. Let me know if I can implement this.
I'm testing this locally with the existing codebase, and it runs in my local environment.
Superb! The approach looks good. Thank you for picking this up.
Sure, I'll implement this. Any suggestions for a first-time contributor to Volcano?
Hi @hajnalmt, I've raised a PR. It would be great if you could review #4767.