postgresqlpublication caches cannot be synced, but I am not using any of them...?

Open IngwiePhoenix opened this issue 8 months ago • 0 comments

Description

I noticed that one of my pods was failing, and it turned out to be the PostgreSQL operator. I have been using it for quite a while to manage users and databases through CRDs and have had no issues. So I suppose I accidentally updated the chart, and now I get these errors; this one in particular:

{
  "level": "error",
  "ts": "2025-05-08T08:40:29Z",
  "msg": "Could not wait for Cache to sync",
  "controller": "postgresqlpublication",
  "controllerGroup": "postgresql.easymile.com",
  "controllerKind": "PostgresqlPublication",
  "error": "failed to wait for postgresqlpublication caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.PostgresqlPublication",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:202\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:207\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:233\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/runnable_group.go:219"
}

A little before that, I see this:

{
  "level": "error",
  "ts": "2025-05-08T08:40:09Z",
  "logger": "controller-runtime.source.EventHandler",
  "msg": "if kind is a CRD, it should be installed before calling Start",
  "kind": "PostgresqlPublication.postgresql.easymile.com",
  "error": "no matches for kind \"PostgresqlPublication\" in version \"postgresql.easymile.com/v1alpha1\"",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/source/kind.go:63\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/loop.go:62\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/loop.go:63\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/source/kind.go:56"
}
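
In case it helps with triage: both entries name the same kind, once in `controllerKind` and once in `kind`. Below is a minimal sketch for pulling just the error entries out of the operator log without the long stack traces. It assumes the operator writes one JSON object per line (the entries above are pretty-printed for readability), and the function name `summarize_errors` is made up for this example.

```python
import json
import sys


def summarize_errors(lines):
    """Return sorted (ts, source, msg, error) tuples for error-level JSON log lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON log line
        if not isinstance(entry, dict) or entry.get("level") != "error":
            continue
        # Controller errors carry "controllerKind"; event-source errors carry "kind".
        source = entry.get("controllerKind") or entry.get("kind") or "?"
        rows.append(
            (entry.get("ts", ""), source, entry.get("msg", ""), entry.get("error", ""))
        )
    # ISO 8601 timestamps sort chronologically as plain strings.
    return sorted(rows)


if __name__ == "__main__":
    for ts, source, msg, error in summarize_errors(sys.stdin):
        print(f"{ts}  {source}: {msg} ({error})")
```

Saved as, say, `summarize_errors.py`, it can be fed from `kubectl logs -n postgres <operator-pod> | python summarize_errors.py`, where `<operator-pod>` is a placeholder for the actual pod name.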

I specifically made sure to pin the chart version I am using. Here is my setup:

apiVersion: v1
kind: Namespace
metadata:
  name: postgres
---
###
# Install PGO
# ...CloudNativePG is installed via OLM
###
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: postgres-operator
  namespace: kube-system
spec:
  # via: https://github.com/EasyMile/postgresql-operator?tab=readme-ov-file
  repo: https://easymile.github.io/helm-charts
  chart: postgresql-operator
  targetNamespace: postgres
  version: "1.8.0"
---
###
# Cluster config
###
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: default-cluster
  namespace: postgres
spec:
  instances: 1
  enableSuperuserAccess: true
  storage:
    storageClass: "nfs-bunker"
    size: 10Gi
  walStorage:
    storageClass: "local-path"
    size: 5Gi
---
###
# Connect PGO to Cluster
###
apiVersion: postgresql.easymile.com/v1alpha1
kind: PostgresqlEngineConfiguration
metadata:
  name: default-cluster-instance
  namespace: postgres
spec:
  # PostgreSQL Hostname
  host: default-cluster-rw.postgres.svc.kube.birb.it
  # Secret name in the current namespace to find "user" and "password"
  secretName: default-cluster-superuser
  # URI args to add for PostgreSQL URL
  # Default to ""
  #uriArgs: sslmode=disable
  # Check interval
  # Default to 30s
  checkInterval: 30s
  # Wait for linked resource to be deleted
  # Default to false
  waitLinkedResourcesDeletion: true
---
###
# Example Database
###
apiVersion: postgresql.easymile.com/v1alpha1
kind: PostgresqlDatabase
metadata:
  name: default-cluster-testdb
  namespace: postgres
spec:
  # Engine configuration link
  engineConfiguration:
    # Resource name
    name: default-cluster-instance
    namespace: postgres
  # Database name
  database: testdb
  # Master role name
  # Master role name will be used to create top group role.
  # Database owner and users will be in this group role.
  # Default is ""
  masterRole: "testdb-role"
  # Should drop on delete ?
  # Default set to false
  dropOnDelete: true
  # Wait for linked resource deletion to accept deletion of the current resource
  # See documentation for more information
  # Default set to false
  waitLinkedResourcesDeletion: true
---
###
# Example user
###
apiVersion: postgresql.easymile.com/v1alpha1
kind: PostgresqlUserRole
metadata:
  name: default-instance-testuser
  namespace: postgres
spec:
  # Mode
  mode: MANAGED
  # Role prefix to be used for user created in database engine
  rolePrefix: "test"
  # User password rotation duration in order to roll user/password in secret
  userPasswordRotationDuration: 720h
  # Privileges list
  privileges:
    - # Privilege for the selected database
      privilege: OWNER
      # Database link
      database:
        name: default-cluster-testdb
      # Generated secret name with information for the selected database
      generatedSecretName: default-cluster-testcreds

Do I also need to add a PostgresqlPublication object? I honestly wouldn't know what for, or why.

Expected Behavior

I expected the controller to just keep reconciling, refreshing passwords, and quietly doing its thing.

Actual Behavior

See the log snippets above. I believe an accidental update broke it...

Environment

Kubernetes version: v1.32.4+k3s1
Project Version/Tag: Chart version 1.8.0

Steps to reproduce

You should be able to use the YAML above. CloudNativePG is used to run the database, while the operator is used to manage it.

May 08 '25 08:05 IngwiePhoenix