
After upgrade "Kind=PodList err-Index with name field:status.phase does not exist"

Open nscblauensteiner opened this issue 2 years ago • 9 comments

ISSUE TYPE

Bug report

SUMMARY

Attempting to upgrade from AWX 21.3.0 to 21.4.0

ENVIRONMENT
  • AWX version: 21.3.0
  • AWX install method: operator
  • AWX deployment target: kubernetes/minikube/cri-o
  • Operating System: AlmaLinux 8.6
STEPS TO REPRODUCE

Upgrade operator 0.24.0 to 0.26.0 with "make deploy"

EXPECTED RESULTS

Upgrade with "make deploy" to be successful

ACTUAL RESULTS

Error message in controller-manager "kubectl logs -f deployments/awx-operator-controller-manager -c awx-manager":

TASK [installer : Get new postgres pod information] ****************************
task path: /opt/ansible/roles/installer/tasks/upgrade_postgres.yml:45
{"level":"info","ts":1660293743.5016606,"logger":"logging_event_handler","msg":"[playbook task start]","name":"awx","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"4762220260129429432","EventData.Name":"installer : Get new postgres pod information"}
{"level":"info","ts":1660293744.2908022,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}

nscblauensteiner avatar Aug 12 '22 08:08 nscblauensteiner

@rooftopcellist is this possibly related to the Ansible Operator SDK upgrade?

akus062381 avatar Aug 17 '22 15:08 akus062381

@rooftopcellist is this possibly related to the Ansible Operator SDK upgrade?

I am deploying AWX on a fresh Kubernetes cluster and I just got the same error: [image]

aimcod avatar Aug 22 '22 15:08 aimcod

I get a similar error when trying to deploy AWX on a fresh OCI Kubernetes environment:


--------------------------- Ansible Task StdOut -------------------------------

TASK [installer : Get the postgres pod information] ****************************
task path: /opt/ansible/roles/installer/tasks/database_configuration.yml:196

-------------------------------------------------------------------------------
{"level":"info","ts":1661331036.1270256,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1661331036.2232614,"logger":"logging_event_handler","msg":"[playbook task start]","name":"awx","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"418623429143691413","EventData.Name":"installer : Wait for Database to initialize if managed DB"}

--------------------------- Ansible Task StdOut -------------------------------

TASK [installer : Wait for Database to initialize if managed DB] ***************
task path: /opt/ansible/roles/installer/tasks/database_configuration.yml:206

-------------------------------------------------------------------------------
{"level":"info","ts":1661331036.8066926,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1661331042.4789498,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1661331048.105519,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1661331053.7363572,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1661331059.3696618,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1661331065.0136898,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1661331070.642434,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}

ghost avatar Aug 24 '22 08:08 ghost

@Hermanni93 I am not seeing this on fresh installs from the devel branch (just pulled latest).

My db initialize task passed after 2 cache misses, which is normal because it takes some time for the pod to become available.

Logs from awx-operator container, vanilla AWX CR
TASK [installer : Wait for Database to initialize if managed DB] ***************
task path: /opt/ansible/roles/installer/tasks/database_configuration.yml:206

-------------------------------------------------------------------------------
{"level":"info","ts":1661443521.696671,"logger":"logging_event_handler","msg":"[playbook task start]","name":"awx","namespace":"ca-awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"5420764487062725234","EventData.Name":"installer : Wait for Database to initialize if managed DB"}
{"level":"info","ts":1661443522.4634442,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1661443528.3991892,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}

--------------------------- Ansible Task StdOut -------------------------------

TASK [installer : Look up details for this deployment] *************************
task path: /opt/ansible/roles/installer/tasks/database_configuration.yml:223

-------------------------------------------------------------------------------
{"level":"info","ts":1661443528.5961928,"logger":"logging_event_handler","msg":"[playbook task start]","name":"awx","namespace":"ca-awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"5420764487062725234","EventData.Name":"installer : Look up details for this deployment"}
{"level":"info","ts":1661443529.652844,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/awx.ansible.com/v1beta1/namespaces/ca-awx/awxs/awx","Verb":"get","APIPrefix":"apis","APIGroup":"awx.ansible.com","APIVersion":"v1beta1","Namespace":"ca-awx","Resource":"awxs","Subresource":"","Name":"awx","Parts":["awxs","awx"]}}
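To compare runs like the ones above, the cache misses can be tallied per task. The following is a minimal Python sketch, not part of the operator; the function name `count_cache_misses` is hypothetical. It assumes one JSON log entry per line, as emitted by the awx-manager container, and attributes each miss to the most recent "playbook task start" entry. That attribution is approximate, because a proxy entry can be logged slightly before the task-start entry of the task that triggered it.

```python
import json
from collections import Counter


def count_cache_misses(log_text):
    """Count proxy cache-miss entries per Ansible task (hypothetical helper).

    Non-JSON lines (task banners, separators) are skipped.
    """
    counts = Counter()
    current_task = "(before first task)"
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # banner or separator line, not a structured log entry
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or interleaved output
        if not isinstance(entry, dict):
            continue
        if entry.get("event_type") == "playbook_on_task_start":
            # Remember which task subsequent proxy entries belong to.
            current_task = entry.get("EventData.Name", current_task)
        elif (entry.get("logger") == "proxy"
              and str(entry.get("msg", "")).lower().startswith("cache miss")):
            counts[current_task] += 1
    return counts
```

A count that stays at a handful and then stops matches the "normal" case described above; a count that keeps growing for the same task suggests the pod never becomes available.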

@aimcod @akus062381 The Operator SDK work only landed 3 days ago and this issue was created 13 days ago, so the timeline doesn't fit. Also, fresh installs should never enter upgrade_postgres.yml.

I wonder if there is an old PVC hanging around in the namespace you have deployed in with the same name as the one being requested by the new postgres pod. If that were the case, I would expect the PVC to be stuck in the pending state, and the postgres pod wouldn't be available, thus causing the cache-miss you are seeing.

Could you check your pvc's? kubectl get pvc -n <deployment-namespace>
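For namespaces with many claims, the same check can be scripted. This is a small Python sketch under the assumption that the input is the output of `kubectl get pvc -n <deployment-namespace> -o json`; the helper name `unbound_pvcs` is hypothetical.

```python
import json


def unbound_pvcs(pvc_list_json):
    """Return (name, phase) pairs for every PVC that is not Bound.

    Expects the JSON list document printed by `kubectl get pvc -o json`.
    """
    doc = json.loads(pvc_list_json)
    result = []
    for item in doc.get("items", []):
        name = item.get("metadata", {}).get("name", "<unnamed>")
        phase = item.get("status", {}).get("phase", "Unknown")
        if phase != "Bound":
            result.append((name, phase))
    return result
```

An empty result means every claim in the namespace is bound; any entry with phase Pending is a candidate for the stale-PVC situation described above.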

rooftopcellist avatar Aug 25 '22 16:08 rooftopcellist

I just did the following to try to reproduce:

  1. checked out 0.24.0, deployed the awx-operator (make deploy) and created an AWX CR.
  2. populated some dummy resources in the AWX UI
  3. checked out 0.26.0, deployed the awx-operator (make deploy)
  4. watched the logs to see any errors or see if it was hanging on any tasks for too long
  5. observed that the reconciliation loop converged/stopped running.
$ oc logs deployments/awx-operator-controller-manager -c awx-manager -f

I similarly upgraded to devel from there without issue following the same process.

@nscblauensteiner after reading the issue again, I see that you saw this issue going from 0.23.0 --> 0.24.0, I expect that was a transient error that has since been fixed. Could you try upgrading to the new 0.28.0 release and comment here if you still have issues?

rooftopcellist avatar Aug 25 '22 17:08 rooftopcellist

@nscblauensteiner after reading the issue again, I see that you saw this issue going from 0.23.0 --> 0.24.0, I expect that was a transient error that has since been fixed. Could you try upgrading to the new 0.28.0 release and comment here if you still have issues?

@rooftopcellist - Sorry for the late reply. Going from 0.24.0 to 0.28.0 the following error occurs:

`TASK [installer : Create Database if no database is specified] *****************
task path: /opt/ansible/roles/installer/tasks/upgrade_postgres.yml:33

{"level":"info","ts":1661953099.2240002,"logger":"logging_event_handler","msg":"[playbook task start]","name":"awx","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"2700876882654434590","EventData.Name":"installer : Create Database if no database is specified"}
{"level":"info","ts":1661953100.0569692,"logger":"proxy","msg":"Cache miss: apps/v1, Kind=StatefulSet, awx/awx-postgres-13"}
{"level":"info","ts":1661953100.0622494,"logger":"proxy","msg":"Cache miss: apps/v1, Kind=StatefulSet, awx/awx-postgres-13"}
{"level":"info","ts":1661953100.0664117,"logger":"proxy","msg":"Injecting owner reference"}
{"level":"info","ts":1661953100.066802,"logger":"proxy","msg":"Watching child resource","kind":"apps/v1, Kind=StatefulSet","enqueue_kind":"awx.ansible.com/v1beta1, Kind=AWX"}
{"level":"info","ts":1661953100.0668259,"msg":"Starting EventSource","controller":"awx-controller","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":1661953100.0775096,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953105.0847545,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953110.0923655,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953115.1000066,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953120.107237,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953125.1145475,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953130.1218889,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953135.1291647,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953140.1358783,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953145.1520195,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953150.1593595,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953155.1630697,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953160.169816,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953165.1771474,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953170.184331,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953175.1910224,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953180.198264,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953185.2055795,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953190.2112143,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953195.218038,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953200.2255187,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953205.2327266,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953210.2398424,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}
{"level":"info","ts":1661953215.2457268,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/statefulsets/awx-postgres-13","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"statefulsets","Subresource":"","Name":"awx-postgres-13","Parts":["statefulsets","awx-postgres-13"]}}

--------------------------- Ansible Task StdOut -------------------------------

TASK [Create Database if no database is specified] ******************************** fatal: [localhost]: FAILED! => {"changed": true, "duration": 120, "method": "apply", "msg": "StatefulSet awx-postgres-13: Resource apply timed out", "result": {"apiVersion": "apps/v1", "kind": "StatefulSet", "metadata": {"annotations": {"kubectl.kubernetes.io/last-applied-configuration": "{"apiVersion":"apps/v1","kind":"StatefulSet","metadata":{"labels":{"app.kubernetes.io/component":"database","app.kubernetes.io/instance":"postgres-13-awx","app.kubernetes.io/managed-by":"awx-operator","app.kubernetes.io/name":"postgres-13","app.kubernetes.io/operator-version":"0.28.0","app.kubernetes.io/part-of":"awx"},"name":"awx-postgres-13","namespace":"awx"},"spec":{"replicas":1,"selector":{"matchLabels":{"app.kubernetes.io/component":"database","app.kubernetes.io/instance":"postgres-13-awx","app.kubernetes.io/managed-by":"awx-operator","app.kubernetes.io/name":"postgres-13"}},"serviceName":"awx","template":{"metadata":{"labels":{"app.kubernetes.io/component":"database","app.kubernetes.io/instance":"postgres-13-awx","app.kubernetes.io/managed-by":"awx-operator","app.kubernetes.io/name":"postgres-13","app.kubernetes.io/part-of":"awx"}},"spec":{"containers":[{"env":[{"name":"POSTGRESQL_DATABASE","valueFrom":{"secretKeyRef":{"key":"database","name":"awx-postgres-configuration"}}},{"name":"POSTGRESQL_USER","valueFrom":{"secretKeyRef":{"key":"username","name":"awx-postgres-configuration"}}},{"name":"POSTGRESQL_PASSWORD","valueFrom":{"secretKeyRef":{"key":"password","name":"awx-postgres-configuration"}}},{"name":"POSTGRES_DB","valueFrom":{"secretKeyRef":{"key":"database","name":"awx-postgres-configuration"}}},{"name":"POSTGRES_USER","valueFrom":{"secretKeyRef":{"key":"username","name":"awx-postgres-configuration"}}},{"name":"POSTGRES_PASSWORD","valueFrom":{"secretKeyRef":{"key":"password","name":"awx-postgres-configuration"}}},{"name":"PGDATA","value":"/var/lib/postgresql/data/pgdata"},{"name":"POSTGR
ES_INITDB_ARGS","value":"--auth-host=scram-sha-256"},{"name":"POSTGRES_HOST_AUTH_METHOD","value":"scram-sha-256"}],"image":"postgres:13","imagePullPolicy":"IfNotPresent","name":"postgres","ports":[{"containerPort":5432,"name":"postgres-13"}],"resources":{"requests":{"cpu":"10m","memory":"64Mi"}},"volumeMounts":[{"mountPath":"/var/lib/postgresql/data","name":"postgres-13","subPath":"data"}]}],"priorityClassName":""}},"updateStrategy":{"type":"RollingUpdate"},"volumeClaimTemplates":[{"metadata":{"name":"postgres-13"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"8Gi"}}}}]}}"}, "creationTimestamp": "2022-08-31T13:38:20Z", "generation": 1, "labels": {"app.kubernetes.io/component": "database", "app.kubernetes.io/instance": "postgres-13-awx", "app.kubernetes.io/managed-by": "awx-operator", "app.kubernetes.io/name": "postgres-13", "app.kubernetes.io/operator-version": "0.28.0", "app.kubernetes.io/part-of": "awx"}, "managedFields": [{"apiVersion": "apps/v1", "fieldsType": "FieldsV1", "fieldsV1": {"f:metadata": {"f:annotations": {".": {}, "f:kubectl.kubernetes.io/last-applied-configuration": {}}, "f:labels": {".": {}, "f:app.kubernetes.io/component": {}, "f:app.kubernetes.io/instance": {}, "f:app.kubernetes.io/managed-by": {}, "f:app.kubernetes.io/name": {}, "f:app.kubernetes.io/operator-version": {}, "f:app.kubernetes.io/part-of": {}}, "f:ownerReferences": {".": {}, "k:{"uid":"832c34b8-174b-47cd-99cd-70228dec23e0"}": {}}}, "f:spec": {"f:podManagementPolicy": {}, "f:replicas": {}, "f:revisionHistoryLimit": {}, "f:selector": {}, "f:serviceName": {}, "f:template": {"f:metadata": {"f:labels": {".": {}, "f:app.kubernetes.io/component": {}, "f:app.kubernetes.io/instance": {}, "f:app.kubernetes.io/managed-by": {}, "f:app.kubernetes.io/name": {}, "f:app.kubernetes.io/part-of": {}}}, "f:spec": {"f:containers": {"k:{"name":"postgres"}": {".": {}, "f:env": {".": {}, "k:{"name":"PGDATA"}": {".": {}, "f:name": {}, "f:value": {}}, 
"k:{"name":"POSTGRESQL_DATABASE"}": {".": {}, "f:name": {}, "f:valueFrom": {".": {}, "f:secretKeyRef": {}}}, "k:{"name":"POSTGRESQL_PASSWORD"}": {".": {}, "f:name": {}, "f:valueFrom": {".": {}, "f:secretKeyRef": {}}}, "k:{"name":"POSTGRESQL_USER"}": {".": {}, "f:name": {}, "f:valueFrom": {".": {}, "f:secretKeyRef": {}}}, "k:{"name":"POSTGRES_DB"}": {".": {}, "f:name": {}, "f:valueFrom": {".": {}, "f:secretKeyRef": {}}}, "k:{"name":"POSTGRES_HOST_AUTH_METHOD"}": {".": {}, "f:name": {}, "f:value": {}}, "k:{"name":"POSTGRES_INITDB_ARGS"}": {".": {}, "f:name": {}, "f:value": {}}, "k:{"name":"POSTGRES_PASSWORD"}": {".": {}, "f:name": {}, "f:valueFrom": {".": {}, "f:secretKeyRef": {}}}, "k:{"name":"POSTGRES_USER"}": {".": {}, "f:name": {}, "f:valueFrom": {".": {}, "f:secretKeyRef": {}}}}, "f:image": {}, "f:imagePullPolicy": {}, "f:name": {}, "f:ports": {".": {}, "k:{"containerPort":5432,"protocol":"TCP"}": {".": {}, "f:containerPort": {}, "f:name": {}, "f:protocol": {}}}, "f:resources": {".": {}, "f:requests": {".": {}, "f:cpu": {}, "f:memory": {}}}, "f:terminationMessagePath": {}, "f:terminationMessagePolicy": {}, "f:volumeMounts": {".": {}, "k:{"mountPath":"/var/lib/postgresql/data"}": {".": {}, "f:mountPath": {}, "f:name": {}, "f:subPath": {}}}}}, "f:dnsPolicy": {}, "f:restartPolicy": {}, "f:schedulerName": {}, "f:securityContext": {}, "f:terminationGracePeriodSeconds": {}}}, "f:updateStrategy": {"f:type": {}}, "f:volumeClaimTemplates": {}}}, "manager": "OpenAPI-Generator", "operation": "Update", "time": "2022-08-31T13:38:20Z"}, {"apiVersion": "apps/v1", "fieldsType": "FieldsV1", "fieldsV1": {"f:status": {"f:collisionCount": {}, "f:currentReplicas": {}, "f:currentRevision": {}, "f:observedGeneration": {}, "f:replicas": {}, "f:updateRevision": {}, "f:updatedReplicas": {}}}, "manager": "kube-controller-manager", "operation": "Update", "subresource": "status", "time": "2022-08-31T13:38:20Z"}], "name": "awx-postgres-13", "namespace": "awx", "ownerReferences": 
[{"apiVersion": "awx.ansible.com/v1beta1", "kind": "AWX", "name": "awx", "uid": "832c34b8-174b-47cd-99cd-70228dec23e0"}], "resourceVersion": "27418469", "uid": "bb16e868-0314-43cd-a63a-7858bc463796"}, "spec": {"podManagementPolicy": "OrderedReady", "replicas": 1, "revisionHistoryLimit": 10, "selector": {"matchLabels": {"app.kubernetes.io/component": "database", "app.kubernetes.io/instance": "postgres-13-awx", "app.kubernetes.io/managed-by": "awx-operator", "app.kubernetes.io/name": "postgres-13"}}, "serviceName": "awx", "template": {"metadata": {"creationTimestamp": null, "labels": {"app.kubernetes.io/component": "database", "app.kubernetes.io/instance": "postgres-13-awx", "app.kubernetes.io/managed-by": "awx-operator", "app.kubernetes.io/name": "postgres-13", "app.kubernetes.io/part-of": "awx"}}, "spec": {"containers": [{"env": [{"name": "POSTGRESQL_DATABASE", "valueFrom": {"secretKeyRef": {"key": "database", "name": "awx-postgres-configuration"}}}, {"name": "POSTGRESQL_USER", "valueFrom": {"secretKeyRef": {"key": "username", "name": "awx-postgres-configuration"}}}, {"name": "POSTGRESQL_PASSWORD", "valueFrom": {"secretKeyRef": {"key": "password", "name": "awx-postgres-configuration"}}}, {"name": "POSTGRES_DB", "valueFrom": {"secretKeyRef": {"key": "database", "name": "awx-postgres-configuration"}}}, {"name": "POSTGRES_USER", "valueFrom": {"secretKeyRef": {"key": "username", "name": "awx-postgres-configuration"}}}, {"name": "POSTGRES_PASSWORD", "valueFrom": {"secretKeyRef": {"key": "password", "name": "awx-postgres-configuration"}}}, {"name": "PGDATA", "value": "/var/lib/postgresql/data/pgdata"}, {"name": "POSTGRES_INITDB_ARGS", "value": "--auth-host=scram-sha-256"}, {"name": "POSTGRES_HOST_AUTH_METHOD", "value": "scram-sha-256"}], "image": "postgres:13",{"level":"error","ts":1661953220.3480885,"logger":"logging_event_handler","msg":"","name":"awx","namespace":"awx","gvk":"awx.ansible.com/v1beta1, 
Kind=AWX","event_type":"runner_on_failed","job":"2700876882654434590","EventData.Task":"Create Database if no database is specified","EventData.TaskArgs":"","EventData.FailedTaskPath":"/opt/ansible/roles/installer/tasks/upgrade_postgres.yml:33","error":"[playbook task failed]"} "imagePullPolicy": "IfNotPresent", "name": "postgres", "ports": [{"containerPort": 5432, "name": "postgres-13", "protocol": "TCP"}], "resources": {"requests": {"cpu": "10m", "memory": "64Mi"}}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/lib/postgresql/data", "name": "postgres-13", "subPath": "data"}]}], "dnsPolicy": "ClusterFirst", "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "terminationGracePeriodSeconds": 30}}, "updateStrategy": {"type": "RollingUpdate"}, "volumeClaimTemplates": [{"apiVersion": "v1", "kind": "PersistentVolumeClaim", "metadata": {"creationTimestamp": null, "name": "postgres-13"}, "spec": {"accessModes": ["ReadWriteOnce"], "resources": {"requests": {"storage": "8Gi"}}, "volumeMode": "Filesystem"}, "status": {"phase": "Pending"}}]}, "status": {"availableReplicas": 0, "collisionCount": 0, "currentReplicas": 1, "currentRevision": "awx-postgres-13-54b9b564f4", "observedGeneration": 1, "replicas": 1, "updateRevision": "awx-postgres-13-54b9b564f4", "updatedReplicas": 1}}}`

nscblauensteiner avatar Aug 31 '22 13:08 nscblauensteiner

@Hermanni93 I am not seeing this on fresh installs from the devel branch (just pulled latest).

My db initialize task passed after 2 cache misses, which is normal because it takes some time for the pod to become available.

@aimcod @akus062381 The Operator SDK work only landed 3 days ago and this issue was created 13 days ago, so the timeline doesn't fit. Also, fresh installs should never enter upgrade_postgres.yml.

I wonder if there is an old PVC hanging around in the namespace you have deployed in with the same name as the one being requested by the new postgres pod. If that were the case, I would expect the PVC to be stuck in the pending state, and the postgres pod wouldn't be available, thus causing the cache-miss you are seeing.

Could you check your pvc's? kubectl get pvc -n <deployment-namespace>

Hi @rooftopcellist

This is a fresh deployment of 0.28.0 on a fresh K8s cluster (1.25) that is deployed on fresh VMs.

My issue right now is exactly that of #706, excluding the last comment, where @indraneeldey1 mentioned it is sort of running for him.

Here are the details:

POD STATUS:

[root@dev-awx-01 k8awx]# kubectl get pods

NAME                                              READY   STATUS    RESTARTS   AGE
awx-operator-controller-manager-9589d9859-jhp4q   2/2     Running   0          8m43s
*****-postgres-13-0                               0/1     Pending   0          8m16s
[root@dev-awx-01 k8awx]# kubectl describe pod ****-postgres-13-0
Name:             ****-postgres-13-0
Namespace:        awx
Priority:         0
Service Account:  default
Node:             <none>
Labels:           app.kubernetes.io/component=database
                  app.kubernetes.io/instance=postgres-13-****
                  app.kubernetes.io/managed-by=awx-operator
                  app.kubernetes.io/name=postgres-13
                  app.kubernetes.io/part-of=****
                  controller-revision-hash=****-postgres-13-8677ccdd5d
                  statefulset.kubernetes.io/pod-name=****-postgres-13-0
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Controlled By:    StatefulSet/****-postgres-13
Containers:
  postgres:
    Image:      postgres:13
    Port:       5432/TCP
    Host Port:  0/TCP
    Requests:
      cpu:     10m
      memory:  64Mi
    Environment:
      POSTGRESQL_DATABASE:        <set to the key 'database' in secret '****-postgres-configuration'>  Optional: false
      POSTGRESQL_USER:            <set to the key 'username' in secret '****-postgres-configuration'>  Optional: false
      POSTGRESQL_PASSWORD:        <set to the key 'password' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_DB:                <set to the key 'database' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_USER:              <set to the key 'username' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_PASSWORD:          <set to the key 'password' in secret '****-postgres-configuration'>  Optional: false
      PGDATA:                     /var/lib/postgresql/data/pgdata
      POSTGRES_INITDB_ARGS:       --auth-host=scram-sha-256
      POSTGRES_HOST_AUTH_METHOD:  scram-sha-256
    Mounts:
      /var/lib/postgresql/data from postgres-13 (rw,path="data")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zbw28 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  postgres-13:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  postgres-13-****-postgres-13-0
    ReadOnly:   false
  kube-api-access-zbw28:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  9m10s  default-scheduler  0/4 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 4 node(s) didn't find available persistent volumes to bind. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
  Warning  FailedScheduling  4m1s   default-scheduler  0/4 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 4 node(s) didn't find available persistent volumes to bind. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.

PV

[root@dev-awx-01 k8awx]# kubectl get pv
NAME             CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS    REASON   AGE
static-data-pv   11Gi       RWX            Retain           Available           local-storage            11m
[root@dev-awx-01 k8awx]# kubectl describe pv static-data-pv
Name:              static-data-pv
Labels:            <none>
Annotations:       <none>
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      local-storage
Status:            Available
Claim:
Reclaim Policy:    Retain
Access Modes:      RWX
VolumeMode:        Filesystem
Capacity:          11Gi
Node Affinity:
  Required Terms:
    Term 0:        kubernetes.io/hostname in [dev-awx-]
Message:
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /data/awx
    HostPathType:
Events:            <none>

PVC

[root@dev-awx-01 k8awx]# kubectl get pvc
NAME                              STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
postgres-13-*****-postgres-13-0   Pending                                      local-storage   12m
static-data-pvc                   Pending                                      local-storage   12m
[root@dev-awx-01 k8awx]# kubectl describe pvc postgres-13-****-postgres-13-0
Name:          postgres-13-****-postgres-13-0
Namespace:     awx
StorageClass:  local-storage
Status:        Pending
Volume:
Labels:        app.kubernetes.io/component=database
               app.kubernetes.io/instance=postgres-13-****
               app.kubernetes.io/managed-by=awx-operator
               app.kubernetes.io/name=postgres-13
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       ****-postgres-13-0
Events:
  Type    Reason                Age                   From                         Message
  ----    ------                ----                  ----                         -------
  Normal  WaitForFirstConsumer  12m                   persistentvolume-controller  waiting for first consumer to be created before binding
  Normal  WaitForPodScheduled   2m26s (x41 over 12m)  persistentvolume-controller  waiting for pod ****-postgres-13-0 to be scheduled

I would appreciate any feedback on this topic.

aimcod avatar Sep 05 '22 12:09 aimcod

After fixing the permissions on the PV directories, the pods came up on a fresh install:
Postgres persistent volume, e.g.: sudo chmod 755 /psql/postgres-13
AWX persistent projects volume, e.g.: sudo chown 1000:0 /awx/projects

Don't try to reuse an old PV that was initialized by PostgreSQL 12 without upgrading it first; this will fail.

exaluc avatar Sep 13 '22 21:09 exaluc

@aimcod AWX Operator creates a PVC for PSQL, but does not create a PV for that PVC. It seems the PVC has been created by AWX Operator, but there are no usable PVs for it on your K8s cluster (static-data-pv is there, but it has RWX access mode). Try manually creating a new PV with the local-storage class and RWO access mode.
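
A minimal sketch of such a PV, not taken from anyone's actual setup. The PV name, the host path, the node name, and the 8Gi size are assumptions for illustration; the storage class and the hostPath-plus-node-affinity layout mirror the static-data-pv output shown above. The capacity needs to be at least what the PVC requests.

```yaml
# Hypothetical PV for the operator-created Postgres PVC.
# Name, path, node name and size are placeholders, not values from this thread.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: awx-postgres-13-pv            # hypothetical name
spec:
  storageClassName: local-storage     # same class as the pending PVC
  capacity:
    storage: 8Gi                      # assumption: must cover the PVC's request
  accessModes:
    - ReadWriteOnce                   # RWO instead of the RWX on static-data-pv
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/postgres-13           # hypothetical directory on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - "REPLACE_WITH_NODE_NAME"   # placeholder
```

After applying it with kubectl apply -f, kubectl get pv and kubectl get pvc should show whether the claim binds once the pod is scheduled.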

kurokobo avatar Sep 14 '22 22:09 kurokobo

@odgon @rooftopcellist

I have now figured it out. The ansible-playbook creates a wrong claim during the deploy (error: "cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"), which has to be deleted during the deploy and replaced with your own. The replacement PVC must of course match in name (storageClassName) and storage size; also, at least in my case, the app labels must be carried over.
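
For readers without access to those files, a hedged sketch of what a replacement PVC could look like. This is not the poster's actual manifest, which was never published in the thread. The labels are copied from the kubectl describe pvc output above, the name assumes an AWX resource called awx in namespace awx, and the access mode and 8Gi size are assumptions that have to match your own deployment.

```yaml
# Hypothetical replacement PVC; not the manifest referred to in this comment.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-13-awx-postgres-13-0   # assumes the AWX resource is named "awx"
  namespace: awx
  labels:                               # labels as shown by kubectl describe pvc
    app.kubernetes.io/component: database
    app.kubernetes.io/instance: postgres-13-awx
    app.kubernetes.io/managed-by: awx-operator
    app.kubernetes.io/name: postgres-13
spec:
  storageClassName: local-storage       # must match an available PV's class
  accessModes:
    - ReadWriteOnce                     # assumption
  resources:
    requests:
      storage: 8Gi                      # assumption: must not exceed the PV's capacity
```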

If necessary, I can DM you the yaml files and instructions.

nscblauensteiner avatar Oct 17 '22 14:10 nscblauensteiner

I ran into this after bulk deleting a bunch of resources in my namespace (including "kustomize build . | kubectl delete -f -" and "kubectl delete pvc --all") and then, shortly afterwards, deploying the awx-operator using the Kustomization instructions in the Basic Install section of the README.md.

However, after deleting the PVCs, waiting a couple of minutes, and then trying to deploy the awx-operator and create an AWX instance again, it worked.

I suspect that this is some caching issue with k8s' etcd. If you see this again, I suggest trying it in a different namespace, or terminating and recreating your namespace if that is an option.

Thanks, AWX Team


rooftopcellist avatar Oct 18 '22 04:10 rooftopcellist

@nscblauensteiner Hi! Just ran into the same issue. Mind sharing your fix? Thanks in advance!

bhrbgk avatar Oct 21 '22 11:10 bhrbgk

Hy, sure.

My PV and PVC contain corporate data. If you leave me your email, I will get back to you via PM.

nscblauensteiner avatar Oct 21 '22 11:10 nscblauensteiner

@nscblauensteiner thank you a lot. contact me at [email protected]. have a good day!

bhrbgk avatar Oct 24 '22 05:10 bhrbgk

Got error message back: The DNS has reported that the domain of the recipient does not exist

nscblauensteiner avatar Oct 24 '22 06:10 nscblauensteiner

@nscblauensteiner sorry. i should've had a second coffee before typing. it's [email protected] ;)

bhrbgk avatar Oct 24 '22 06:10 bhrbgk

I have the same issue. After a while the installation just stalls at

{"level":"info","ts":1667213815.0013413,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}

Env is k8s with OpenEBS storage (the postgres container seems happy with it). Trying with v0.28.0, as latest has an issue with init container permissions.

Edit: this might not be the same issue; mine might be that I had IPv6 disabled in my k8s.

vg-mc avatar Oct 31 '22 11:10 vg-mc

@nscblauensteiner I had the same issue, can you share what you did with the yaml files.

acas25 avatar Nov 04 '22 19:11 acas25

Hy, same rule for you :) - Leave me your mail address here

nscblauensteiner avatar Nov 07 '22 09:11 nscblauensteiner

Why is this issue closed when there doesn't seem to be any solution available?

MatthieuLeMee avatar Nov 14 '22 15:11 MatthieuLeMee

Try disabling IPv6 on your nodes or go back to v0.24.0 if that works. It did for me.

Edit: to verify this, check whether the awx-web container hangs with nginx trying to get an IPv6 address. You can do this with: kubectl logs deployment/awx -c awx-web -n yournamespace

vg-mc avatar Nov 15 '22 06:11 vg-mc

@nscblauensteiner I'd be interested in how you fixed this as well. My email is [email protected]

bchutro avatar Nov 21 '22 04:11 bchutro

I had this error when trying to deploy awx 21.5.0 with the recent 1.0.0 awx-operator. I think it's related to the awx-ee:latest image having problems with the old awx:21.5.0 image. Upgrading to 21.8.0 fixed it.

You can check the init container logs, which sometimes give useful information: kubectl logs awx-767b7d7c7b-72prx init

MatthieuLeMee avatar Nov 21 '22 09:11 MatthieuLeMee

Why are we emailing someone to get the fix? Could someone just post what they did to fix this?

trippinnik avatar Nov 21 '22 20:11 trippinnik

I'm seeing this issue as well, with 1.1.3. Completely new EKS cluster, everything fresh - but it seems no PV is getting created. Seems like awx-operator itself needs a fix?

iuvooneill avatar Dec 23 '22 17:12 iuvooneill

I'm under the same spell: it doesn't work with a completely new minikube setup with awx-operator 1.1.3:

{"level":"info","ts":1673269573.8766282,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}
{"level":"info","ts":1673269579.516033,"logger":"proxy","msg":"cache miss: /v1, Kind=PodList err-Index with name field:status.phase does not exist"}

mkeology avatar Jan 09 '23 13:01 mkeology

Hi team,

I am trying to deploy AWX on AWS EKS Fargate with an external serverless RDS PostgreSQL database.

I am stuck with the awx-postgres pod in Pending status and the error below.

Pod not supported on Fargate: volumes not supported: postgres-13 not supported because: PVC postgres-13-awx-postgres-13-0 not bound

I have a few questions: do we definitely need a Persistent Volume, and if so, how can I make use of EFS in AWS?

If someone has already done this, could you help me set up the environment?

sveerabathini avatar Feb 15 '23 19:02 sveerabathini

Hi would you be able to send me the pv and pvc config? Thanks! @nscblauensteiner

ghost avatar Mar 02 '23 06:03 ghost

Hy, leave me your email address here.

nscblauensteiner avatar Mar 02 '23 12:03 nscblauensteiner

Hi my email is [email protected]! Thanks a lot.

ghost avatar Mar 02 '23 13:03 ghost