❄️ Flaky Test Collection ❄️

Open joe-kimmel-vmw opened this issue 3 years ago • 12 comments

Flaky Test Collection: Name and shame flaky or flakey tests in this issue, provide any ongoing hints/remediation, remove,

  1. Test_PackageInstalled_FromPackageInstall_DeletionFailureBlocks (recently got new diagnostic prints added to help debug why it flakes)
  2. wait: no child processes [1] [2] [3] [4]

joe-kimmel-vmw avatar Jun 17 '22 17:06 joe-kimmel-vmw

Test_PackageInstallAndRepo_CanAuthenticateToPrivateRepository_UsingPlaceholderSecret was updated in https://github.com/vmware-tanzu/carvel-kapp-controller/pull/738; it may be worth checking whether that change caused the flakiness.

benmoss avatar Jun 17 '22 17:06 benmoss

~~I'm investigating Test_PackageInstallAndRepo_CanAuthenticateToPrivateRepository_UsingPlaceholderSecret~~

@cppforlife fixed in https://github.com/vmware-tanzu/carvel-kapp-controller/pull/758

benmoss avatar Jun 17 '22 19:06 benmoss

"Fetching resources: wait: no child processes" error from:

  • https://github.com/vmware-tanzu/carvel-kapp-controller/runs/6923327841?check_suite_focus=true#step:5:4575
  • https://github.com/vmware-tanzu/carvel-kapp-controller/runs/7014548810?check_suite_focus=true#step:5:4402

--- FAIL: Test_PackageInstallStatus_DisplaysUsefulErrorMessage_ForDeploymentFailure (11.16s)
    packageinstall_test.go:268: 
        Expected useful error message to contain deploy error
        Got:
        Fetching resources: wait: no child processes

cppforlife avatar Jun 23 '22 16:06 cppforlife

Fetching resources: wait: no child processes

  • https://github.com/vmware-tanzu/carvel-kapp-controller/runs/7072116629?check_suite_focus=true#step:5:4489

praveenrewar avatar Jun 27 '22 12:06 praveenrewar

(https://github.com/vmware-tanzu/carvel-kapp-controller/runs/7234570328?check_suite_focus=true)

--- FAIL: Test_PackageInstalled_FromPackageInstall_Successfully (6.51s)
    packageinstall_test.go:172: 
        	Error Trace:	packageinstall_test.go:172
        	Error:      	Not equal: 
        	            	expected: v1alpha1.AppStatus{ManagedAppName:"", Fetch:(*v1alpha1.AppStatusFetch)(0xc000331ea0), Template:(*v1alpha1.AppStatusTemplate)(0xc000344f80), Deploy:(*v1alpha1.AppStatusDeploy)(0xc000331e30), Inspect:(*v1alpha1.AppStatusInspect)(0xc0001384b0), ConsecutiveReconcileSuccesses:1, ConsecutiveReconcileFailures:0, GenericStatus:v1alpha1.GenericStatus{ObservedGeneration:1, Conditions:[]v1alpha1.Condition{v1alpha1.Condition{Type:"ReconcileSucceeded", Status:"True", Reason:"", Message:""}}, FriendlyDescription:"Reconcile succeeded", UsefulErrorMessage:""}}
        	            	actual  : v1alpha1.AppStatus{ManagedAppName:"", Fetch:(*v1alpha1.AppStatusFetch)(0xc000331dc0), Template:(*v1alpha1.AppStatusTemplate)(0xc000344900), Deploy:(*v1alpha1.AppStatusDeploy)(0xc000331d50), Inspect:(*v1alpha1.AppStatusInspect)(0xc000138410), ConsecutiveReconcileSuccesses:2, ConsecutiveReconcileFailures:0, GenericStatus:v1alpha1.GenericStatus{ObservedGeneration:1, Conditions:[]v1alpha1.Condition{v1alpha1.Condition{Type:"ReconcileSucceeded", Status:"True", Reason:"", Message:""}}, FriendlyDescription:"Reconcile succeeded", UsefulErrorMessage:""}}
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -68,3 +68,3 @@
        	            	  }),
        	            	- ConsecutiveReconcileSuccesses: (int) 1,
        	            	+ ConsecutiveReconcileSuccesses: (int) 2,
        	            	  ConsecutiveReconcileFailures: (int) 0,
        	Test:       	Test_PackageInstalled_FromPackageInstall_Successfully

cppforlife avatar Jul 07 '22 15:07 cppforlife

Test_PackageInstall_UsesExistingAppWithSameName was fixed by #783

cppforlife avatar Jul 11 '22 16:07 cppforlife

TestPackageRepository https://github.com/vmware-tanzu/carvel-kapp-controller/runs/7339526938?check_suite_focus=true#step:5:4533

The kctrl logic in https://github.com/benmoss/carvel-kapp-controller/blob/1815114580ddbe149abf2f0cf5309c3163bae9d1/cli/pkg/kctrl/cmd/app/app_tailer.go#L202 seems sane; I'm not sure how this failure is happening.

Running 'kctrl package repository update -r test-package-repository --url index.docker.io/k8slt/kc-e2e-test-repo:latest -n kctrl-test --yes'...
    package_repository_test.go:118: 
        	Error Trace:	package_repository_test.go:118
        	            				e2e.go:14
        	            				package_repository_test.go:110
        	Error:      	"Target cluster 'https://192.168.49.2:8443' (nodes: minikube)

Waiting for package repository to be updated

12:19:15PM: Waiting for package repository reconciliation for 'test-package-repository'
12:19:23PM: Waiting for generation 2 to be observed 
12:19:23PM: Fetch started 
12:19:23PM: Template succeeded 
12:19:23PM: Deploy started (1s ago)
12:19:24PM: Deploying 
	    | Target cluster 'https://10.96.0.1:443'
	    | 12:19:24PM: info: Resources: Scoping listings to single namespace: kctrl-test
	    | Changes
	    | Namespace   Name                            Kind             Age  Op      Op st.  Wait to  Rs  Ri
	    | kctrl-test  pkg.test.carvel.dev             PackageMetadata  -    create  ???     -        -   -
	    | ^           pkg.test.carvel.dev.1.0.0       Package          -    create  ???     -        -   -
	    | ^           pkg.test.carvel.dev.2.0.0       Package          -    create  ???     -        -   -
	    | ^           pkg.test.carvel.dev.3.0.0-rc.1  Package          -    create  ???     -        -   -
	    | Op:      4 create, 0 delete, 0 update, 0 noop, 0 exists
	    | Wait to: 0 reconcile, 0 delete, 4 noop
	    | 12:19:24PM: ---- applying 4 changes [0/4 done] ----
	    | 12:19:24PM: create package/pkg.test.carvel.dev.1.0.0 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: create package/pkg.test.carvel.dev.3.0.0-rc.1 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: create package/pkg.test.carvel.dev.2.0.0 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: create packagemetadata/pkg.test.carvel.dev (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ---- waiting on 4 changes [0/4 done] ----
	    | 12:19:24PM: ok: noop packagemetadata/pkg.test.carvel.dev (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ok: noop package/pkg.test.carvel.dev.1.0.0 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ok: noop package/pkg.test.carvel.dev.3.0.0-rc.1 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ok: noop package/pkg.test.carvel.dev.2.0.0 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ---- applying complete [4/4 done] ----
	    | 12:19:24PM: ---- waiting complete [4/4 done] ----
	    | Succeeded
12:19:24PM: Deploy succeeded 

Succeeded
" does not contain "Fetch succeeded"
        	Test:       	TestPackageRepository

benmoss avatar Jul 14 '22 19:07 benmoss

--- FAIL: TestDependencyDownload (6.40s)
    --- FAIL: TestDependencyDownload/with_regular_files (1.01s)
        dependencies_test.go:73: bad status code retrieving url: https://github.com/benmoss/test-resources/releases/download/v1.0.0/test-v1.0.0-darwin-arm64: 503 Service Unavailable
    --- FAIL: TestDependencyDownload/with_tgz_files (5.39s)
        dependencies_test.go:73: bad status code retrieving url: https://github.com/benmoss/test-resources/releases/download/v1.0.0/test-v1.0.0-darwin-arm64.tgz: 503 Service Unavailable
2022/07/22 12:22:40 Updating test to 1.0.1
--- FAIL: TestDependencyUpdate (5.80s)
    dependencies_test.go:119: bad status code retrieving url: https://github.com/benmoss/test-resources/releases/download/v1.0.1/test-v1.0.1-darwin-arm64: 503 Service Unavailable

joe-kimmel-vmw avatar Jul 25 '22 18:07 joe-kimmel-vmw

--- FAIL: Test_AppReconcileOccurs_WhenSecretUpdated (2.54s)
    kapp.go:92: Failed to successfully execute 'kapp deploy -f - -a configmap-with-secret -n kappctrl-test --yes': Execution error: stdout: 'Target cluster 'https://192.168.49.2:8443/' (nodes: minikube)

        Changes

        Namespace      Name                          Kind            Age  Op      Op st.  Wait to    Rs  Ri
        kappctrl-test  configmap-with-secret         App             -    create  -       reconcile  -   -
        ^              kappctrl-e2e-ns-role          Role            -    create  -       reconcile  -   -
        ^              kappctrl-e2e-ns-role-binding  RoleBinding     -    create  -       reconcile  -   -
        ^              kappctrl-e2e-ns-sa            ServiceAccount  -    create  -       reconcile  -   -
        ^              simple-app-values             Secret          -    create  -       reconcile  -   -

        Op:      5 create, 0 delete, 0 update, 0 noop, 0 exists
        Wait to: 5 reconcile, 0 delete, 0 noop

        8:34:15PM: ---- applying 3 changes [0/5 done] ----
        8:34:15PM: create role/kappctrl-e2e-ns-role (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        8:34:15PM: create secret/simple-app-values (v1) namespace: kappctrl-test
        8:34:15PM: create serviceaccount/kappctrl-e2e-ns-sa (v1) namespace: kappctrl-test
        8:34:15PM: ---- waiting on 3 changes [0/5 done] ----
        8:34:15PM: ok: reconcile secret/simple-app-values (v1) namespace: kappctrl-test
        8:34:15PM: ok: reconcile serviceaccount/kappctrl-e2e-ns-sa (v1) namespace: kappctrl-test
        8:34:15PM: ok: reconcile role/kappctrl-e2e-ns-role (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        8:34:15PM: ---- applying 1 changes [3/5 done] ----
        8:34:15PM: create rolebinding/kappctrl-e2e-ns-role-binding (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        8:34:15PM: ---- waiting on 1 changes [3/5 done] ----
        8:34:15PM: ok: reconcile rolebinding/kappctrl-e2e-ns-role-binding (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        8:34:15PM: ---- applying 1 changes [4/5 done] ----
        8:34:15PM: create app/configmap-with-secret (kappctrl.k14s.io/v1alpha1) namespace: kappctrl-test
        8:34:15PM: ---- waiting on 1 changes [4/5 done] ----
        8:34:15PM: ongoing: reconcile app/configmap-with-secret (kappctrl.k14s.io/v1alpha1) namespace: kappctrl-test
        8:34:15PM:  ^ Waiting for generation 1 to be observed
        8:34:16PM: fail: reconcile app/configmap-with-secret (kappctrl.k14s.io/v1alpha1) namespace: kappctrl-test
        8:34:16PM:  ^ Reconcile failed:  (message: Templating dir: waitid: no child processes)

        ' stderr: 'kapp: Error: waiting on reconcile app/configmap-with-secret (kappctrl.k14s.io/v1alpha1) namespace: kappctrl-test:
          Finished unsuccessfully (Reconcile failed:  (message: Templating dir: waitid: no child processes))
        ' error: 'exit status 1'
Running 'kapp delete -a configmap-with-configmap -n kappctrl-test --yes'...
==> deploy

===========

--- FAIL: Test_PackageInstalled_FromPackageInstall_DeletionFailureBlocks (2.84s)
    kapp.go:92: Failed to successfully execute 'kapp deploy -a instl-pkg-failure-block-test -f - -n kappctrl-test --yes': Execution error: stdout: 'Target cluster 'https://192.168.49.2:8443/' (nodes: minikube)

        Changes

        Namespace      Name                          Kind               Age  Op      Op st.  Wait to    Rs  Ri
        kappctrl-test  basic.test.carvel.dev         PackageRepository  -    create  -       reconcile  -   -
        ^              instl-pkg-failure-block-test  PackageInstall     -    create  -       reconcile  -   -
        ^              kappctrl-e2e-ns-role          Role               -    create  -       reconcile  -   -
        ^              kappctrl-e2e-ns-role-binding  RoleBinding        -    create  -       reconcile  -   -
        ^              kappctrl-e2e-ns-sa            ServiceAccount     -    create  -       reconcile  -   -

        Op:      5 create, 0 delete, 0 update, 0 noop, 0 exists
        Wait to: 5 reconcile, 0 delete, 0 noop

        9:10:17PM: ---- applying 2 changes [0/5 done] ----
        9:10:17PM: create role/kappctrl-e2e-ns-role (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        9:10:17PM: create serviceaccount/kappctrl-e2e-ns-sa (v1) namespace: kappctrl-test
        9:10:17PM: ---- waiting on 2 changes [0/5 done] ----
        9:10:17PM: ok: reconcile serviceaccount/kappctrl-e2e-ns-sa (v1) namespace: kappctrl-test
        9:10:17PM: ok: reconcile role/kappctrl-e2e-ns-role (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        9:10:17PM: ---- applying 1 changes [2/5 done] ----
        9:10:17PM: create rolebinding/kappctrl-e2e-ns-role-binding (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        9:10:17PM: ---- waiting on 1 changes [2/5 done] ----
        9:10:17PM: ok: reconcile rolebinding/kappctrl-e2e-ns-role-binding (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        9:10:17PM: ---- applying 1 changes [3/5 done] ----
        9:10:17PM: create packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        9:10:17PM: ---- waiting on 1 changes [3/5 done] ----
        9:10:17PM: ongoing: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        9:10:17PM:  ^ Waiting for generation 1 to be observed
        9:10:18PM: ongoing: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        9:10:18PM:  ^ Reconciling
        9:10:19PM: fail: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        9:10:19PM:  ^ Reconcile failed:  (message: Templating dir: waitid: no child processes)

        ' stderr: 'kapp: Error: waiting on reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test:
          Finished unsuccessfully (Reconcile failed:  (message: Templating dir: waitid: no child processes))
        ' error: 'exit status 1'

joe-kimmel-vmw avatar Aug 08 '22 21:08 joe-kimmel-vmw

==> deploy
Running 'kapp deploy -f - -a test-repo-status-success -n kappctrl-test --yes --wait-timeout 3m'...
==> check against expected successful status
Running 'kapp inspect -a test-repo-status-success --raw --tty=false --filter-kind=PackageRepository -n kappctrl-test --yes'...
==> force a second reconcile and see if it all still works
Running 'kapp deploy -f - -a test-repo-status-success -n kappctrl-test --yes --wait-timeout 3m'...
Running 'kapp delete -a test-repo-status-success -n kappctrl-test --yes'...
==> deploy pkg repository
Running 'kapp deploy -a repo-packages-available -f - -n kappctrl-test --yes --wait-timeout 3m'...
Running 'kapp delete -a repo-packages-available -n kappctrl-test --yes'...
--- FAIL: Test_PackageRepoBundle_PackagesAvailable (1.56s)
    kapp.go:95: Failed to successfully execute 'kapp deploy -a repo-packages-available -f - -n kappctrl-test --yes --wait-timeout 3m': Execution error: stdout: 'Target cluster 'https://192.168.49.2:8443/' (nodes: minikube)
        
        Changes
        
        Namespace      Name                   Kind               Age  Op      Op st.  Wait to    Rs  Ri  
        kappctrl-test  basic.test.carvel.dev  PackageRepository  -    create  -       reconcile  -   -  
        
        Op:      1 create, 0 delete, 0 update, 0 noop, 0 exists
        Wait to: 1 reconcile, 0 delete, 0 noop
        
        12:03:01AM: ---- applying 1 changes [0/1 done] ----
        12:03:01AM: create packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        12:03:01AM: ---- waiting on 1 changes [0/1 done] ----
        12:03:01AM: ongoing: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        12:03:01AM:  ^ Waiting for generation 1 to be observed
        12:03:02AM: fail: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        12:03:02AM:  ^ Reconcile failed:  (message: Fetching: secrets "basic.test.carvel.dev-fetch-0" not found)
        
        ' stderr: 'kapp: Error: waiting on reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test:
          Finished unsuccessfully (Reconcile failed:  (message: Fetching: secrets "basic.test.carvel.dev-fetch-0" not found))
        ' error: 'exit status 1'

cppforlife avatar Sep 16 '22 12:09 cppforlife

Ran into a failure for Test_PackageInstalled_FromPackageInstall_DeletionFailureBlocks here where it seems like the deletion never fails.

100mik avatar Sep 20 '22 08:09 100mik

Investigation of wait: no child processes flake

When does it happen?

This seems to be completely random; no pattern is apparent in which tests or runs hit it.

Why does it happen?

I attempted a root cause analysis, without much success. I tried:

  • Re-running the e2e tests over 100 times against minikube and kind locally on macOS.
  • Forking the repo and running the e2e tests via the GitHub Actions runner, which uses kind on ubuntu-latest.

All attempts to reproduce the error were unsuccessful.

What next?

  • We suspect, though cannot confirm, that this may be a race condition between the way we start commands and our zombie reaping process, which runs constantly. When we kick off a cmd.Run() during our templating phase, the reapzombies call can land right between the start() and the wait(). Presumably the reaper then collects the child's exit status first, and the wait() fails with this error.

Quote from a similar issue as referenced below:

I've run into this intermittently. The code section in question is in utils/run.go ExecuteAndWait. If you check out the golang source code for cmd.Run you'll see a race condition. The process is started and then we wait for it. But if the process completes and exits before the wait happens (because, say, the go runtime decides to do a GC pause right then or the goroutine yields for the syscall), then we'll get an error there.
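A minimal Go sketch of the suspected failure mode, for illustration only; this is not kapp-controller's code. It starts a child, reaps it directly with wait4 (standing in for a background zombie reaper), and then calls cmd.Wait, which should fail with a "no child processes" (ECHILD) error because the exit status is already gone. The function names are hypothetical, the sketch assumes a POSIX system with a `true` binary on PATH, and the reap is forced deterministically here, whereas in the flake it would only happen when the timing lines up.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
)

// waitAfterStolenReap starts a short-lived child, collects its exit status
// with wait4 before os/exec gets a chance to, and returns whatever error
// cmd.Wait then reports. (Hypothetical helper, not from kapp-controller.)
func waitAfterStolenReap() error {
	cmd := exec.Command("true") // assumption: POSIX `true` is on PATH
	if err := cmd.Start(); err != nil {
		return fmt.Errorf("starting child: %v", err)
	}

	// Simulated reaper: block until the child exits and reap it.
	// Retry on EINTR so a stray signal does not abort the demonstration.
	var status syscall.WaitStatus
	for {
		_, err := syscall.Wait4(cmd.Process.Pid, &status, 0, nil)
		if err == syscall.EINTR {
			continue
		}
		if err != nil {
			return fmt.Errorf("reaping child: %v", err)
		}
		break
	}

	// The kernel no longer has a child for this pid, so Wait cannot
	// retrieve an exit status.
	return cmd.Wait()
}

// stolenReapCausesECHILD reports whether cmd.Wait failed with ECHILD.
func stolenReapCausesECHILD() bool {
	return errors.Is(waitAfterStolenReap(), syscall.ECHILD)
}

func main() {
	fmt.Println("cmd.Wait error:", waitAfterStolenReap())
	fmt.Println("is ECHILD:", stolenReapCausesECHILD())
}
```

If this is indeed the mechanism, the usual remedies are to have the reaper skip pids that the process started itself, or to serialize command execution and reaping behind a shared lock; which of these fits kapp-controller is an open question.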

References:

neil-hickey avatar Oct 11 '22 21:10 neil-hickey

Another wait: no child processes on a failing Test_SecretsAndConfigMapsWithCustomPathsCanReconcile here

cc: @neil-hickey in case it helps

100mik avatar Oct 18 '22 07:10 100mik

Seeing a lot of flakes in cases where we expect ConsecutiveReconcileSuccesses to be 1 but it is 2 (example: Test_PackageRepoStatus_Success).

Maybe increasing the sync period will help ensure that another reconciliation does not happen before we check the status? Edit: Again over here
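An alternative to tuning the sync period would be to make the assertion tolerant of an extra reconcile. A rough Go sketch of that idea follows; the `appStatus` type and `reconciledAtLeastOnce` helper are hypothetical stand-ins that only borrow field names from the failure output above, not the real v1alpha1 types or test helpers.

```go
package main

import "fmt"

// appStatus is a hypothetical, trimmed-down stand-in for the status struct
// seen in the failure output; only fields relevant to the flake are kept.
type appStatus struct {
	ConsecutiveReconcileSuccesses int
	ConsecutiveReconcileFailures  int
	FriendlyDescription           string
}

// reconciledAtLeastOnce accepts any success count >= 1, so a second
// reconcile that lands before the check does not fail the test.
func reconciledAtLeastOnce(s appStatus) bool {
	return s.ConsecutiveReconcileSuccesses >= 1 &&
		s.ConsecutiveReconcileFailures == 0 &&
		s.FriendlyDescription == "Reconcile succeeded"
}

func main() {
	// Status as observed in the flaky runs: two successes instead of one.
	raced := appStatus{
		ConsecutiveReconcileSuccesses: 2,
		FriendlyDescription:           "Reconcile succeeded",
	}
	fmt.Println("accepted:", reconciledAtLeastOnce(raced))
}
```

The trade-off is a weaker assertion: the test would no longer catch an unexpected extra reconcile, so this only makes sense where the exact count is not what the test is meant to verify.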

100mik avatar Nov 24 '22 07:11 100mik