
Continue e2e test after failure when checkpoint is enabled

Open taneyland opened this issue 3 years ago • 2 comments

Issue #, if available:

Description of changes: For checkpoint tests, the first run of upgrade cluster will fail. If the framework returns that error, the e2e test ends there. So when the checkpoint feature is enabled, the framework does not return the error. The error still appears in the logs as usual, but the e2e test continues, which lets us accurately test the checkpoint feature.
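The behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `runUpgrade`, `failingUpgrade`, `succeedingUpgrade`, and the `checkpointEnabled` parameter are hypothetical names, and the real framework shells out to `eksctl anywhere upgrade cluster` instead of calling a Go function.

```go
package main

import (
	"errors"
	"fmt"
)

// failingUpgrade simulates the first upgrade run, which fails in checkpoint
// tests. The error message is illustrative only.
func failingUpgrade() error {
	return errors.New("failed to upgrade cluster: timed out waiting for the condition")
}

// succeedingUpgrade simulates the rerun that resumes from the saved checkpoint.
func succeedingUpgrade() error {
	return nil
}

// runUpgrade is a hypothetical framework helper (not the project's real API).
// It runs the upgrade step and always logs a failure, but it only returns the
// error to the caller when the checkpoint feature is disabled.
func runUpgrade(upgrade func() error, checkpointEnabled bool, logf func(format string, args ...interface{})) error {
	err := upgrade()
	if err == nil {
		return nil
	}
	// The error is still written to the log in both modes.
	logf("Error: %v\n", err)
	if checkpointEnabled {
		// Swallow the error so the test can go on to rerun the upgrade.
		return nil
	}
	return err
}

func main() {
	logf := func(format string, args ...interface{}) { fmt.Printf(format, args...) }

	// With checkpointing on, the failed first run does not stop the test.
	if err := runUpgrade(failingUpgrade, true, logf); err != nil {
		fmt.Println("test aborted:", err)
		return
	}
	// Second run, assumed to pick up from the saved checkpoint.
	if err := runUpgrade(succeedingUpgrade, true, logf); err != nil {
		fmt.Println("test aborted:", err)
		return
	}
	fmt.Println("test continued past the first failure")
}
```

Passing the flag in as a parameter keeps the sketch easy to exercise in both modes. With `checkpointEnabled` set to `false`, the same failing run returns the error and the caller stops, which matches the behaviour before this change.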

Example output after the first upgrade failure with these changes:

2022-08-05T17:26:04.543Z	V4	Task finished	{"task_name": "collect-cluster-diagnostics", "duration": "2m41.519058791s"}
2022-08-05T17:26:04.543Z	V4	----------------------------------
2022-08-05T17:26:04.543Z	V4	Saving checkpoint	{"file": "eksa-test-5d81cdf-checkpoint.yaml"}
2022-08-05T17:26:04.544Z	V4	Tasks completed	{"duration": "4m48.184183815s"}
2022-08-05T17:26:04.544Z	V3	Cleaning up long running container	{"name": "eksa_1659720075911723793"}
Error: failed to upgrade cluster: waiting for external etcd for workload cluster to be ready: executing wait: error: timed out waiting for the condition on clusters/eksa-test-5d81cdf

    cluster.go:646: Running shell command [ eksctl anywhere upgrade cluster -f eksa-test-5d81cdf/cluster.yaml -v 4 ]
2022-08-05T17:26:04.816Z	V4	Logger init completed	{"vlevel": 4}
2022-08-05T17:26:04.989Z	V4	Reading bundles manifest
2022-08-05T17:26:05.064Z	V2	Pulling docker image
2022-08-05T17:26:05.294Z	V3	Initializing long running container
2022-08-05T17:26:05.483Z	V4	Checkpoint feature enabled
2022-08-05T17:26:05.483Z	V4	Reading checkpoint	{"file": "eksa-test-5d81cdf/generated/eksa-test-5d81cdf-checkpoint.yaml"}
2022-08-05T17:26:05.483Z	V4	Restoring task	{"task_name": "setup-and-validate"}
2022-08-05T17:26:05.483Z	V0	docker Provider setup is valid
2022-08-05T17:26:06.948Z	V4	Restoring task	{"task_name": "update-secrets"}
2022-08-05T17:26:06.948Z	V4	Restoring task	{"task_name": "ensure-etcd-capi-components-exist"}
...

Testing (if applicable):

Tested with docker checkpoint e2e test

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

taneyland avatar Aug 05 '22 18:08 taneyland

Codecov Report

Merging #2893 (ef2cd12) into main (dba9ff4) will increase coverage by 0.00%. The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #2893   +/-   ##
=======================================
  Coverage   62.24%   62.25%           
=======================================
  Files         334      334           
  Lines       26849    26865   +16     
=======================================
+ Hits        16713    16724   +11     
- Misses       8854     8857    +3     
- Partials     1282     1284    +2     

Impacted Files                                    Coverage Δ
pkg/networking/cilium/reconciler/reconciler.go    78.94% <0.00%> (-5.27%) :arrow_down:
pkg/workflows/delete.go                           59.73% <0.00%> (+0.27%) :arrow_up:
pkg/providers/snow/reconciler/reconciler.go       84.21% <0.00%> (+84.21%) :arrow_up:


codecov[bot] avatar Aug 05 '22 19:08 codecov[bot]

Are your unit tests succeeding locally?

vivek-koppuru avatar Aug 19 '22 00:08 vivek-koppuru

I'm unable to run the vSphere e2e tests locally, but I ran the same test locally with the Docker provider and it passes.

taneyland avatar Aug 19 '22 15:08 taneyland

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

eks-distro-bot avatar Aug 19 '22 20:08 eks-distro-bot

/cherrypick release-0.11

taneyland avatar Aug 22 '22 14:08 taneyland

@taneyland: new pull request created: #3095

In response to this:

/cherrypick release-0.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

eks-distro-pr-bot avatar Aug 22 '22 14:08 eks-distro-pr-bot