
Flaky: submit-upgrade-test-cloud-build

Open markmandel opened this issue 8 months ago • 4 comments

This test fails occasionally, with the following output.

Log Already have image (with digest): gcr.io/google.com/cloudsdktool/cloud-sdk Fetching cluster endpoint and auth data. kubeconfig entry generated for standard-upgrade-test-cluster-1-31. Checking if resources from a previous build of upgrade-test-runner exist and need to be cleaned up on cluster standard-upgrade-test-cluster-1-31. No resources found in default namespace. No resources found in default namespace. No resources found in default namespace. fleetautoscalers.autoscaling.agones.dev 2025-04-22T22:45:05Z fleets.agones.dev 2025-04-22T22:45:05Z gameserverallocationpolicies.multicluster.agones.dev 2025-04-22T22:45:06Z gameservers.agones.dev 2025-04-22T22:45:06Z gameserversets.agones.dev 2025-04-22T22:45:06Z Deleting crds from previous run of upgrade-test-runner on cluster standard-upgrade-test-cluster-1-31. customresourcedefinition.apiextensions.k8s.io "fleetautoscalers.autoscaling.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "fleets.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserverallocationpolicies.multicluster.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameservers.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserversets.agones.dev" deleted kubectl apply -f permissions.yaml on cluster standard-upgrade-test-cluster-1-31 serviceaccount/agones-sa unchanged role.rbac.authorization.k8s.io/pod-manager unchanged rolebinding.rbac.authorization.k8s.io/manage-pods unchanged clusterrole.rbac.authorization.k8s.io/node-reader unchanged clusterrolebinding.rbac.authorization.k8s.io/read-nodes unchanged priorityclass.scheduling.k8s.io/high-priority configured clusterrole.rbac.authorization.k8s.io/namespace-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-namespaces unchanged clusterrole.rbac.authorization.k8s.io/secret-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-secrets unchanged 
clusterrole.rbac.authorization.k8s.io/priorityclass-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-priorityclasses unchanged clusterrole.rbac.authorization.k8s.io/poddisruptionbudget-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-poddisruptionbudgets unchanged clusterrole.rbac.authorization.k8s.io/serviceaccount-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-serviceaccounts unchanged clusterrole.rbac.authorization.k8s.io/apiservices-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-apiservices unchanged clusterrole.rbac.authorization.k8s.io/crd-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-crds unchanged clusterrole.rbac.authorization.k8s.io/clusterrole-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manager-clusterroles unchanged clusterrole.rbac.authorization.k8s.io/deployment-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-deployments unchanged clusterrole.rbac.authorization.k8s.io/webhookconfiguration-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-webhookconfigurations unchanged clusterrole.rbac.authorization.k8s.io/allocator unchanged clusterrole.rbac.authorization.k8s.io/controller unchanged clusterrole.rbac.authorization.k8s.io/sdk unchanged clusterrolebinding.rbac.authorization.k8s.io/allocator unchanged clusterrolebinding.rbac.authorization.k8s.io/controller-access unchanged clusterrolebinding.rbac.authorization.k8s.io/controller:system:auth-delegator unchanged rolebinding.rbac.authorization.k8s.io/controller-auth-reader unchanged rolebinding.rbac.authorization.k8s.io/sdk-access unchanged clusterrole.rbac.authorization.k8s.io/helm-cleanup unchanged clusterrolebinding.rbac.authorization.k8s.io/helm-cleanup-access unchanged kubectl apply -f versionMap.yaml on cluster standard-upgrade-test-cluster-1-31 configmap/version-map configured kubectl apply -f gameserverTemplate.yaml on cluster 
standard-upgrade-test-cluster-1-31 configmap/gameserver-template unchanged kubectl apply -f upgradeTest.yaml on cluster standard-upgrade-test-cluster-1-31 job.batch/upgrade-test-runner created pod/upgrade-test-runner-z6rb2 condition met Wait for job upgrade-test-runner to complete or fail on cluster standard-upgrade-test-cluster-1-31 Fetching cluster endpoint and auth data. kubeconfig entry generated for standard-upgrade-test-cluster-1-30. Checking if resources from a previous build of upgrade-test-runner exist and need to be cleaned up on cluster standard-upgrade-test-cluster-1-30. No resources found in default namespace. No resources found in default namespace. No resources found in default namespace. fleetautoscalers.autoscaling.agones.dev 2025-04-22T22:45:25Z fleets.agones.dev 2025-04-22T22:45:25Z gameserverallocationpolicies.multicluster.agones.dev 2025-04-22T22:45:25Z gameservers.agones.dev 2025-04-22T22:45:25Z gameserversets.agones.dev 2025-04-22T22:45:26Z Deleting crds from previous run of upgrade-test-runner on cluster standard-upgrade-test-cluster-1-30. 
customresourcedefinition.apiextensions.k8s.io "fleetautoscalers.autoscaling.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "fleets.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserverallocationpolicies.multicluster.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameservers.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserversets.agones.dev" deleted kubectl apply -f permissions.yaml on cluster standard-upgrade-test-cluster-1-30 serviceaccount/agones-sa unchanged role.rbac.authorization.k8s.io/pod-manager unchanged rolebinding.rbac.authorization.k8s.io/manage-pods unchanged clusterrole.rbac.authorization.k8s.io/node-reader unchanged clusterrolebinding.rbac.authorization.k8s.io/read-nodes unchanged priorityclass.scheduling.k8s.io/high-priority configured clusterrole.rbac.authorization.k8s.io/namespace-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-namespaces unchanged clusterrole.rbac.authorization.k8s.io/secret-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-secrets unchanged clusterrole.rbac.authorization.k8s.io/priorityclass-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-priorityclasses unchanged clusterrole.rbac.authorization.k8s.io/poddisruptionbudget-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-poddisruptionbudgets unchanged clusterrole.rbac.authorization.k8s.io/serviceaccount-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-serviceaccounts unchanged clusterrole.rbac.authorization.k8s.io/apiservices-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-apiservices unchanged clusterrole.rbac.authorization.k8s.io/crd-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-crds unchanged clusterrole.rbac.authorization.k8s.io/clusterrole-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manager-clusterroles unchanged 
clusterrole.rbac.authorization.k8s.io/deployment-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-deployments unchanged clusterrole.rbac.authorization.k8s.io/webhookconfiguration-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-webhookconfigurations unchanged clusterrole.rbac.authorization.k8s.io/allocator unchanged clusterrole.rbac.authorization.k8s.io/controller unchanged clusterrole.rbac.authorization.k8s.io/sdk unchanged clusterrolebinding.rbac.authorization.k8s.io/allocator unchanged clusterrolebinding.rbac.authorization.k8s.io/controller-access unchanged clusterrolebinding.rbac.authorization.k8s.io/controller:system:auth-delegator unchanged rolebinding.rbac.authorization.k8s.io/controller-auth-reader unchanged rolebinding.rbac.authorization.k8s.io/sdk-access unchanged clusterrole.rbac.authorization.k8s.io/helm-cleanup unchanged clusterrolebinding.rbac.authorization.k8s.io/helm-cleanup-access unchanged kubectl apply -f versionMap.yaml on cluster standard-upgrade-test-cluster-1-30 configmap/version-map configured kubectl apply -f gameserverTemplate.yaml on cluster standard-upgrade-test-cluster-1-30 configmap/gameserver-template unchanged kubectl apply -f upgradeTest.yaml on cluster standard-upgrade-test-cluster-1-30 job.batch/upgrade-test-runner created pod/upgrade-test-runner-rk494 condition met Wait for job upgrade-test-runner to complete or fail on cluster standard-upgrade-test-cluster-1-30 Fetching cluster endpoint and auth data. kubeconfig entry generated for standard-upgrade-test-cluster-1-32. Checking if resources from a previous build of upgrade-test-runner exist and need to be cleaned up on cluster standard-upgrade-test-cluster-1-32. No resources found in default namespace. No resources found in default namespace. No resources found in default namespace. 
fleetautoscalers.autoscaling.agones.dev 2025-04-22T22:45:44Z fleets.agones.dev 2025-04-22T22:45:44Z gameserverallocationpolicies.multicluster.agones.dev 2025-04-22T22:45:45Z gameservers.agones.dev 2025-04-22T22:45:44Z gameserversets.agones.dev 2025-04-22T22:45:45Z Deleting crds from previous run of upgrade-test-runner on cluster standard-upgrade-test-cluster-1-32. customresourcedefinition.apiextensions.k8s.io "fleetautoscalers.autoscaling.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "fleets.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserverallocationpolicies.multicluster.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameservers.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserversets.agones.dev" deleted kubectl apply -f permissions.yaml on cluster standard-upgrade-test-cluster-1-32 serviceaccount/agones-sa unchanged role.rbac.authorization.k8s.io/pod-manager unchanged rolebinding.rbac.authorization.k8s.io/manage-pods unchanged clusterrole.rbac.authorization.k8s.io/node-reader unchanged clusterrolebinding.rbac.authorization.k8s.io/read-nodes unchanged priorityclass.scheduling.k8s.io/high-priority configured clusterrole.rbac.authorization.k8s.io/namespace-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-namespaces unchanged clusterrole.rbac.authorization.k8s.io/secret-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-secrets unchanged clusterrole.rbac.authorization.k8s.io/priorityclass-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-priorityclasses unchanged clusterrole.rbac.authorization.k8s.io/poddisruptionbudget-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-poddisruptionbudgets unchanged clusterrole.rbac.authorization.k8s.io/serviceaccount-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-serviceaccounts unchanged 
clusterrole.rbac.authorization.k8s.io/apiservices-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-apiservices unchanged clusterrole.rbac.authorization.k8s.io/crd-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-crds unchanged clusterrole.rbac.authorization.k8s.io/clusterrole-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manager-clusterroles unchanged clusterrole.rbac.authorization.k8s.io/deployment-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-deployments unchanged clusterrole.rbac.authorization.k8s.io/webhookconfiguration-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-webhookconfigurations unchanged clusterrole.rbac.authorization.k8s.io/allocator unchanged clusterrole.rbac.authorization.k8s.io/controller unchanged clusterrole.rbac.authorization.k8s.io/sdk unchanged clusterrolebinding.rbac.authorization.k8s.io/allocator unchanged clusterrolebinding.rbac.authorization.k8s.io/controller-access unchanged clusterrolebinding.rbac.authorization.k8s.io/controller:system:auth-delegator unchanged rolebinding.rbac.authorization.k8s.io/controller-auth-reader unchanged rolebinding.rbac.authorization.k8s.io/sdk-access unchanged clusterrole.rbac.authorization.k8s.io/helm-cleanup unchanged clusterrolebinding.rbac.authorization.k8s.io/helm-cleanup-access unchanged kubectl apply -f versionMap.yaml on cluster standard-upgrade-test-cluster-1-32 configmap/version-map configured kubectl apply -f gameserverTemplate.yaml on cluster standard-upgrade-test-cluster-1-32 configmap/gameserver-template unchanged kubectl apply -f upgradeTest.yaml on cluster standard-upgrade-test-cluster-1-32 job.batch/upgrade-test-runner created pod/upgrade-test-runner-x746g condition met Wait for job upgrade-test-runner to complete or fail on cluster standard-upgrade-test-cluster-1-32 Fetching cluster endpoint and auth data. kubeconfig entry generated for gke-autopilot-upgrade-test-cluster-1-31. 
priorityclass.scheduling.k8s.io/low-priority configured Warning: autopilot-default-resources-mutator:Autopilot updated Deployment default/evictable-pods-deployment: adjusted 'cpu' resource to meet requirements for containers [ubuntu] (see http://g.co/gke/autopilot-defaults). deployment.apps/evictable-pods-deployment configured Checking if resources from a previous build of upgrade-test-runner exist and need to be cleaned up on cluster gke-autopilot-upgrade-test-cluster-1-31. No resources found in default namespace. No resources found in default namespace. No resources found in default namespace. fleetautoscalers.autoscaling.agones.dev 2025-04-22T22:46:13Z fleets.agones.dev 2025-04-22T22:46:12Z gameserverallocationpolicies.multicluster.agones.dev 2025-04-22T22:46:13Z gameservers.agones.dev 2025-04-22T22:46:13Z gameserversets.agones.dev 2025-04-22T22:46:13Z Deleting crds from previous run of upgrade-test-runner on cluster gke-autopilot-upgrade-test-cluster-1-31. customresourcedefinition.apiextensions.k8s.io "fleetautoscalers.autoscaling.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "fleets.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserverallocationpolicies.multicluster.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameservers.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserversets.agones.dev" deleted kubectl apply -f permissions.yaml on cluster gke-autopilot-upgrade-test-cluster-1-31 serviceaccount/agones-sa unchanged role.rbac.authorization.k8s.io/pod-manager unchanged rolebinding.rbac.authorization.k8s.io/manage-pods unchanged clusterrole.rbac.authorization.k8s.io/node-reader unchanged clusterrolebinding.rbac.authorization.k8s.io/read-nodes unchanged priorityclass.scheduling.k8s.io/high-priority configured clusterrole.rbac.authorization.k8s.io/namespace-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-namespaces unchanged 
clusterrole.rbac.authorization.k8s.io/secret-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-secrets unchanged clusterrole.rbac.authorization.k8s.io/priorityclass-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-priorityclasses unchanged clusterrole.rbac.authorization.k8s.io/poddisruptionbudget-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-poddisruptionbudgets unchanged clusterrole.rbac.authorization.k8s.io/serviceaccount-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-serviceaccounts unchanged clusterrole.rbac.authorization.k8s.io/apiservices-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-apiservices unchanged clusterrole.rbac.authorization.k8s.io/crd-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-crds unchanged clusterrole.rbac.authorization.k8s.io/clusterrole-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manager-clusterroles unchanged clusterrole.rbac.authorization.k8s.io/deployment-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-deployments unchanged clusterrole.rbac.authorization.k8s.io/webhookconfiguration-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-webhookconfigurations unchanged clusterrole.rbac.authorization.k8s.io/allocator unchanged clusterrole.rbac.authorization.k8s.io/controller unchanged clusterrole.rbac.authorization.k8s.io/sdk unchanged clusterrolebinding.rbac.authorization.k8s.io/allocator unchanged clusterrolebinding.rbac.authorization.k8s.io/controller-access unchanged clusterrolebinding.rbac.authorization.k8s.io/controller:system:auth-delegator unchanged rolebinding.rbac.authorization.k8s.io/controller-auth-reader unchanged rolebinding.rbac.authorization.k8s.io/sdk-access unchanged clusterrole.rbac.authorization.k8s.io/helm-cleanup unchanged clusterrolebinding.rbac.authorization.k8s.io/helm-cleanup-access unchanged kubectl apply -f versionMap.yaml 
on cluster gke-autopilot-upgrade-test-cluster-1-31 configmap/version-map configured kubectl apply -f gameserverTemplate.yaml on cluster gke-autopilot-upgrade-test-cluster-1-31 configmap/gameserver-template unchanged kubectl apply -f upgradeTest.yaml on cluster gke-autopilot-upgrade-test-cluster-1-31 Warning: autopilot-default-resources-mutator:Autopilot updated Job default/upgrade-test-runner: defaulted unspecified 'cpu' resource for containers [upgrade-test-controller] (see http://g.co/gke/autopilot-defaults). job.batch/upgrade-test-runner created pod/upgrade-test-runner-j5z6h condition met Wait for job upgrade-test-runner to complete or fail on cluster gke-autopilot-upgrade-test-cluster-1-31 Fetching cluster endpoint and auth data. kubeconfig entry generated for gke-autopilot-upgrade-test-cluster-1-30. priorityclass.scheduling.k8s.io/low-priority configured Warning: autopilot-default-resources-mutator:Autopilot updated Deployment default/evictable-pods-deployment: adjusted 'cpu' resource to meet requirements for containers [ubuntu] (see http://g.co/gke/autopilot-defaults). deployment.apps/evictable-pods-deployment configured Checking if resources from a previous build of upgrade-test-runner exist and need to be cleaned up on cluster gke-autopilot-upgrade-test-cluster-1-30. No resources found in default namespace. No resources found in default namespace. No resources found in default namespace. fleetautoscalers.autoscaling.agones.dev 2025-04-22T22:46:30Z fleets.agones.dev 2025-04-22T22:46:30Z gameserverallocationpolicies.multicluster.agones.dev 2025-04-22T22:46:30Z gameservers.agones.dev 2025-04-22T22:46:30Z gameserversets.agones.dev 2025-04-22T22:46:30Z Deleting crds from previous run of upgrade-test-runner on cluster gke-autopilot-upgrade-test-cluster-1-30. 
customresourcedefinition.apiextensions.k8s.io "fleetautoscalers.autoscaling.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "fleets.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserverallocationpolicies.multicluster.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameservers.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserversets.agones.dev" deleted kubectl apply -f permissions.yaml on cluster gke-autopilot-upgrade-test-cluster-1-30 serviceaccount/agones-sa unchanged role.rbac.authorization.k8s.io/pod-manager unchanged rolebinding.rbac.authorization.k8s.io/manage-pods unchanged clusterrole.rbac.authorization.k8s.io/node-reader unchanged clusterrolebinding.rbac.authorization.k8s.io/read-nodes unchanged priorityclass.scheduling.k8s.io/high-priority configured clusterrole.rbac.authorization.k8s.io/namespace-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-namespaces unchanged clusterrole.rbac.authorization.k8s.io/secret-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-secrets unchanged clusterrole.rbac.authorization.k8s.io/priorityclass-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-priorityclasses unchanged clusterrole.rbac.authorization.k8s.io/poddisruptionbudget-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-poddisruptionbudgets unchanged clusterrole.rbac.authorization.k8s.io/serviceaccount-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-serviceaccounts unchanged clusterrole.rbac.authorization.k8s.io/apiservices-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-apiservices unchanged clusterrole.rbac.authorization.k8s.io/crd-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-crds unchanged clusterrole.rbac.authorization.k8s.io/clusterrole-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manager-clusterroles unchanged 
clusterrole.rbac.authorization.k8s.io/deployment-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-deployments unchanged clusterrole.rbac.authorization.k8s.io/webhookconfiguration-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-webhookconfigurations unchanged clusterrole.rbac.authorization.k8s.io/allocator unchanged clusterrole.rbac.authorization.k8s.io/controller unchanged clusterrole.rbac.authorization.k8s.io/sdk unchanged clusterrolebinding.rbac.authorization.k8s.io/allocator unchanged clusterrolebinding.rbac.authorization.k8s.io/controller-access unchanged clusterrolebinding.rbac.authorization.k8s.io/controller:system:auth-delegator unchanged rolebinding.rbac.authorization.k8s.io/controller-auth-reader unchanged rolebinding.rbac.authorization.k8s.io/sdk-access unchanged clusterrole.rbac.authorization.k8s.io/helm-cleanup unchanged clusterrolebinding.rbac.authorization.k8s.io/helm-cleanup-access unchanged kubectl apply -f versionMap.yaml on cluster gke-autopilot-upgrade-test-cluster-1-30 configmap/version-map configured kubectl apply -f gameserverTemplate.yaml on cluster gke-autopilot-upgrade-test-cluster-1-30 configmap/gameserver-template unchanged kubectl apply -f upgradeTest.yaml on cluster gke-autopilot-upgrade-test-cluster-1-30 Warning: autopilot-default-resources-mutator:Autopilot updated Job default/upgrade-test-runner: defaulted unspecified 'cpu' resource for containers [upgrade-test-controller] (see http://g.co/gke/autopilot-defaults). job.batch/upgrade-test-runner created pod/upgrade-test-runner-gbvd9 condition met Wait for job upgrade-test-runner to complete or fail on cluster gke-autopilot-upgrade-test-cluster-1-30 Fetching cluster endpoint and auth data. kubeconfig entry generated for gke-autopilot-upgrade-test-cluster-1-32. 
priorityclass.scheduling.k8s.io/low-priority configured Warning: autopilot-default-resources-mutator:Autopilot updated Deployment default/evictable-pods-deployment: adjusted 'cpu' resource to meet requirements for containers [ubuntu] (see http://g.co/gke/autopilot-defaults). deployment.apps/evictable-pods-deployment configured Checking if resources from a previous build of upgrade-test-runner exist and need to be cleaned up on cluster gke-autopilot-upgrade-test-cluster-1-32. No resources found in default namespace. No resources found in default namespace. No resources found in default namespace. fleetautoscalers.autoscaling.agones.dev 2025-04-22T22:46:53Z fleets.agones.dev 2025-04-22T22:46:53Z gameserverallocationpolicies.multicluster.agones.dev 2025-04-22T22:46:54Z gameservers.agones.dev 2025-04-22T22:46:53Z gameserversets.agones.dev 2025-04-22T22:46:54Z Deleting crds from previous run of upgrade-test-runner on cluster gke-autopilot-upgrade-test-cluster-1-32. customresourcedefinition.apiextensions.k8s.io "fleetautoscalers.autoscaling.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "fleets.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserverallocationpolicies.multicluster.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameservers.agones.dev" deleted customresourcedefinition.apiextensions.k8s.io "gameserversets.agones.dev" deleted kubectl apply -f permissions.yaml on cluster gke-autopilot-upgrade-test-cluster-1-32 serviceaccount/agones-sa unchanged role.rbac.authorization.k8s.io/pod-manager unchanged rolebinding.rbac.authorization.k8s.io/manage-pods unchanged clusterrole.rbac.authorization.k8s.io/node-reader unchanged clusterrolebinding.rbac.authorization.k8s.io/read-nodes unchanged priorityclass.scheduling.k8s.io/high-priority configured clusterrole.rbac.authorization.k8s.io/namespace-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-namespaces unchanged 
clusterrole.rbac.authorization.k8s.io/secret-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-secrets unchanged clusterrole.rbac.authorization.k8s.io/priorityclass-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-priorityclasses unchanged clusterrole.rbac.authorization.k8s.io/poddisruptionbudget-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-poddisruptionbudgets unchanged clusterrole.rbac.authorization.k8s.io/serviceaccount-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-serviceaccounts unchanged clusterrole.rbac.authorization.k8s.io/apiservices-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manage-apiservices unchanged clusterrole.rbac.authorization.k8s.io/crd-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-crds unchanged clusterrole.rbac.authorization.k8s.io/clusterrole-manager unchanged clusterrolebinding.rbac.authorization.k8s.io/manager-clusterroles unchanged clusterrole.rbac.authorization.k8s.io/deployment-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-deployments unchanged clusterrole.rbac.authorization.k8s.io/webhookconfiguration-creator unchanged clusterrolebinding.rbac.authorization.k8s.io/create-webhookconfigurations unchanged clusterrole.rbac.authorization.k8s.io/allocator unchanged clusterrole.rbac.authorization.k8s.io/controller unchanged clusterrole.rbac.authorization.k8s.io/sdk unchanged clusterrolebinding.rbac.authorization.k8s.io/allocator unchanged clusterrolebinding.rbac.authorization.k8s.io/controller-access unchanged clusterrolebinding.rbac.authorization.k8s.io/controller:system:auth-delegator unchanged rolebinding.rbac.authorization.k8s.io/controller-auth-reader unchanged rolebinding.rbac.authorization.k8s.io/sdk-access unchanged clusterrole.rbac.authorization.k8s.io/helm-cleanup unchanged clusterrolebinding.rbac.authorization.k8s.io/helm-cleanup-access unchanged kubectl apply -f versionMap.yaml 
on cluster gke-autopilot-upgrade-test-cluster-1-32 configmap/version-map configured kubectl apply -f gameserverTemplate.yaml on cluster gke-autopilot-upgrade-test-cluster-1-32 configmap/gameserver-template unchanged kubectl apply -f upgradeTest.yaml on cluster gke-autopilot-upgrade-test-cluster-1-32 Warning: autopilot-default-resources-mutator:Autopilot updated Job default/upgrade-test-runner: defaulted unspecified 'cpu' resource for containers [upgrade-test-controller] (see http://g.co/gke/autopilot-defaults). job.batch/upgrade-test-runner created pod/upgrade-test-runner-jlsfz condition met Wait for job upgrade-test-runner to complete or fail on cluster gke-autopilot-upgrade-test-cluster-1-32 SuccessCriteriaMetSuccessCriteriaMetFailureTargetSuccessCriteriaMet/tmp/tmp.aPfSQ3MT5V/standard-upgrade-test-cluster-1-31.log: SuccessCriteriaMet Complete/tmp/tmp.aPfSQ3MT5V/standard-upgrade-test-cluster-1-30.log: Complete /tmp/tmp.aPfSQ3MT5V/standard-upgrade-test-cluster-1-32.log: SuccessCriteriaMet /tmp/tmp.aPfSQ3MT5V/gke-autopilot-upgrade-test-cluster-1-31.log: FailureTarget Cleaning up any remaining running pids: 2856
  • A starting thought is to output the log file contents on failure, so we can see what the issue is.

markmandel avatar Apr 23 '25 19:04 markmandel

More details here: https://github.com/googleforgames/agones/pull/4161#issuecomment-2825287277

Agreed - seems rather flaky. @igooch does it make sense to cat the log file on failure? Just so we can see what's up?

Yes, the logs are rather verbose, but if we're OK with that we can do a dump of the upgrade-test-controller, sdk-client-test and / or the agones sidecar / controller containers / pods as well.
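To make the "dump the log on failure" idea concrete, here is a minimal sketch in Python. It is not the actual cloudbuild.yaml step: the function name `dump_failed_logs`, the mapping of log path to final condition, and the set of passing conditions are all assumptions, with the passing values taken from the ones visible in the build output above.

```python
import sys
from typing import Dict, TextIO

# Assumption: these final job conditions count as a pass. They are the
# values that appear next to the passing clusters in the build output.
PASSING = {"SuccessCriteriaMet", "Complete"}


def dump_failed_logs(results: Dict[str, str], out: TextIO = sys.stdout) -> int:
    """Write the contents of every log whose final condition is not passing.

    `results` maps a log file path to the final condition recorded for that
    cluster. Returns the number of failed clusters.
    """
    failed = 0
    for path, condition in results.items():
        if condition in PASSING:
            continue  # keep output short: only failing clusters are dumped
        failed += 1
        out.write(f"===== {path} ({condition}) =====\n")
        try:
            with open(path, "r", errors="replace") as fh:
                out.write(fh.read())
        except OSError as err:
            # A missing log is worth reporting, but should not mask the failure.
            out.write(f"(could not read log: {err})\n")
    return failed
```

Only dumping the failing clusters keeps the verbosity concern manageable, since the passing clusters stay silent.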

For this build it looks like it fails during upgrade from 1.44 -> 1.45.

Install of 1.45 begins:

2025/04/23 00:40:06 Running command helm [upgrade --install --atomic --wait --timeout=10m --namespace=agones-system --create-namespace --version 1.45.0 --set agones.image.tag=1.45.0 --set agones.image.registry=us-docker.pkg.dev/agones-images/release --set agones.image.allocator.pullPolicy=Always --set agones.image.controller.pullPolicy=Always --set agones.image.extensions.pullPolicy=Always --set agones.image.ping.pullPolicy=Always --set agones.image.sdk.alwaysPull=true --set agones.controller.logLevel=debug agones agones/agones]

Creation of 1.44 game servers continues:

2025/04/23 00:40:11 Running command kubectl [create -f /tmp/gs1440.yaml]
2025/04/23 00:40:11 CombinedOutput: gameserver.agones.dev/sdk-client-test-9mlgv created

Creation of 1.44 game servers temporarily fails (expected) while the controller service switches to the new endpoint during upgrade:

2025/04/23 00:40:21 Running command kubectl [create -f /tmp/gs1440.yaml]
2025/04/23 00:40:21 CombinedOutput: Error from server (InternalError): error when creating "/tmp/gs1440.yaml": Internal error occurred: failed calling webhook "mutations.agones.dev": failed to call webhook: Post "https://agones-controller-service.agones-system.svc:443/mutate?timeout=10s": no endpoints available for service "agones-controller-service"
2025/04/23 00:40:21 CombinedOutput err: exit status 1
2025/04/23 00:40:21 Could not create Gameserver /tmp/gs1440.yaml: exit status 1. Retries left: 8.

Failure continues for longer than expected, and the test fails:

2025/04/23 00:41:01 Could not create Gameserver /tmp/gs1440.yaml: exit status 1. Retries left: 0.
2025/04/23 00:41:06 Running command kubectl [create -f /tmp/gs1440.yaml]
2025/04/23 00:41:06 CombinedOutput: Error from server (InternalError): error when creating "/tmp/gs1440.yaml": Internal error occurred: failed calling webhook "mutations.agones.dev": failed to call webhook: Post "https://agones-controller-service.agones-system.svc:443/mutate?timeout=10s": no endpoints available for service "agones-controller-service"
2025/04/23 00:41:06 CombinedOutput err: exit status 1
2025/04/23 00:41:06 Could not create Gameserver /tmp/gs1440.yaml: exit status 1. Too many successive errors.
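The retry pattern visible in these log lines (a "Retries left" counter that runs down to 0, then "Too many successive errors") can be sketched roughly as below. This is a Python illustration of the pattern only, not the upgrade test's actual code; `create_until_done`, `TooManySuccessiveErrors` and the parameter names are hypothetical, and the reset-on-success behaviour is an assumption inferred from the word "successive".

```python
import time
from typing import Callable


class TooManySuccessiveErrors(RuntimeError):
    """Raised when the failure budget is used up without any success."""


def create_until_done(
    create: Callable[[], bool],
    attempts: int,
    max_retries: int,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call create() `attempts` times, tolerating short runs of failures.

    A success resets the budget (assumed), so only an unbroken run of more
    than `max_retries` failures aborts. Returns the number of successes.
    """
    retries_left = max_retries
    created = 0
    for _ in range(attempts):
        if create():
            created += 1
            retries_left = max_retries  # any success resets the budget
        else:
            if retries_left == 0:
                raise TooManySuccessiveErrors("too many successive errors")
            retries_left -= 1
        sleep(interval)
    return created
```

With a count-based budget like this, the tolerated outage is roughly `max_retries * interval` seconds, so an upgrade that leaves the webhook without endpoints for longer than that fails the test even if it would have recovered shortly after.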

Looking at the controller logs, it's possible the slow upgrade was due to a node not being ready, although that log entry is timestamped about 5 minutes after the test failure (00:46 vs 00:41), so it may be unrelated.

2025-04-23 00:46 0/5 nodes are available: 1 Insufficient ephemeral-storage, 3 Insufficient cpu, 4 Insufficient memory.

markmandel avatar Apr 23 '25 19:04 markmandel

There seems to be one fewer condition when it's working: 3 SuccessCriteriaMet on success, versus 3 SuccessCriteriaMet + 1 FailureTarget on error.

On error: SuccessCriteriaMetSuccessCriteriaMetFailureTargetSuccessCriteriaMet/tmp/tmp.aPfSQ3MT5V/standard-upgrade-test-cluster-1-31.log: SuccessCriteriaMet (build link)

When working: SuccessCriteriaMetSuccessCriteriaMetSuccessCriteriaMet/tmp/tmp.uoo0XNWFhG/standard-upgrade-test-cluster-1-31.log: SuccessCriteriaMet (build link)
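For comparing the two builds, the fused run of condition names at the start of that output can be counted with a small helper. This is a rough Python sketch; `count_conditions` is a hypothetical name, and the list of condition names in the regex is an assumption covering the values seen in this thread plus `Failed`.

```python
import re
from collections import Counter

# Assumption: the condition names that can appear in the fused prefix.
CONDITION_RE = re.compile(r"SuccessCriteriaMet|FailureTarget|Complete|Failed")


def count_conditions(summary: str) -> Counter:
    """Count condition names in the run of text before the first log path."""
    fused = summary.split("/", 1)[0]  # drop everything from "/tmp/..." onwards
    return Counter(CONDITION_RE.findall(fused))
```

Any `FailureTarget` (or `Failed`) count above zero would be the signal to print more detail for that build.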

Maybe worth adding the reason or the message to the logs on error as well (if any values are available)? https://github.com/googleforgames/agones/blob/main/cloudbuild.yaml#L380
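As a sketch of what surfacing the reason and message could look like: Kubernetes Job conditions carry `type`, `status`, `reason` and `message` fields, so they can be pulled out of the Job's JSON. The Python below assumes the input is the JSON for the upgrade-test-runner Job (for example from `kubectl get job upgrade-test-runner -o json`); the function name `describe_conditions` and the output format are hypothetical.

```python
import json
from typing import List


def describe_conditions(job_json: str) -> List[str]:
    """Return one line per active Job condition, including reason and message."""
    job = json.loads(job_json)
    lines = []
    for cond in job.get("status", {}).get("conditions", []):
        if cond.get("status") != "True":
            continue  # skip conditions that are not currently set
        lines.append(
            f"{cond.get('type', '?')}: "
            f"reason={cond.get('reason', '<none>')} "
            f"message={cond.get('message', '<none>')}"
        )
    return lines
```

The `<none>` fallbacks cover the "if any values are available" case, since `reason` and `message` are optional on a condition.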

lacroixthomas avatar Apr 23 '25 19:04 lacroixthomas

This issue is marked as Stale due to inactivity for more than 30 days. To avoid being marked as 'stale' please add 'awaiting-maintainer' label or add a comment. Thank you for your contributions

github-actions[bot] avatar Jun 01 '25 10:06 github-actions[bot]

Moving to awaiting-maintainer - since this is rather irritating!

markmandel avatar Jun 03 '25 22:06 markmandel