
Error when deploying TimescaleDB single node variant using helm chart

Open govindsinghnegi opened this issue 3 years ago • 2 comments

Dear TimescaleDB Team,

I am trying to deploy the TimescaleDB single-node variant in our internal Azure Kubernetes Service (AKS) cluster via Helm. I am using the default settings, only overriding the PV size and switching off Patroni's SSL. I have defined a helmfile, and TimescaleDB is deployed in the releases section. I deploy with the command: helmfile -e development apply

However, what I observe is that the master pod runs successfully, but my replica pods are not getting deployed at all. The default timescaledb-1 replica pod is always in a non-ready state:

image

The logs of the pod are as follows:

timescaledb-1 timescaledb 2021-05-11 11:20:36,653 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:20:36,653 WARNING: Trying again in 5 seconds
timescaledb-1 timescaledb pg_basebackup: error: server closed the connection unexpectedly
timescaledb-1 timescaledb This probably means the server terminated abnormally
timescaledb-1 timescaledb before or while processing the request.
timescaledb-1 timescaledb 2021-05-11 11:20:41,663 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:20:41,664 ERROR: failed to bootstrap from leader 'timescaledb-0'
timescaledb-1 timescaledb 2021-05-11 11:20:46,642 ERROR: Error creating replica using method pgbackrest: /etc/timescaledb/scripts/pgbackrest_restore.sh exited with code=1
timescaledb-1 timescaledb pg_basebackup: error: server closed the connection unexpectedly
timescaledb-1 timescaledb This probably means the server terminated abnormally
timescaledb-1 timescaledb before or while processing the request.
timescaledb-1 timescaledb 2021-05-11 11:20:46,651 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:20:46,651 WARNING: Trying again in 5 seconds
timescaledb-1 timescaledb pg_basebackup: error: server closed the connection unexpectedly
timescaledb-1 timescaledb This probably means the server terminated abnormally
timescaledb-1 timescaledb before or while processing the request.
timescaledb-1 timescaledb 2021-05-11 11:20:51,665 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:20:51,665 ERROR: failed to bootstrap from leader 'timescaledb-0'
timescaledb-1 timescaledb 2021-05-11 11:20:56,645 ERROR: Error creating replica using method pgbackrest: /etc/timescaledb/scripts/pgbackrest_restore.sh exited with code=1
timescaledb-1 timescaledb pg_basebackup: error: server closed the connection unexpectedly
timescaledb-1 timescaledb This probably means the server terminated abnormally
timescaledb-1 timescaledb before or while processing the request.
timescaledb-1 timescaledb 2021-05-11 11:20:56,654 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:20:56,655 WARNING: Trying again in 5 seconds
timescaledb-1 timescaledb pg_basebackup: error: server closed the connection unexpectedly
timescaledb-1 timescaledb This probably means the server terminated abnormally
timescaledb-1 timescaledb before or while processing the request.
timescaledb-1 timescaledb 2021-05-11 11:21:01,667 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:21:01,667 ERROR: failed to bootstrap from leader 'timescaledb-0'
timescaledb-1 timescaledb 2021-05-11 11:21:06,642 ERROR: Error creating replica using method pgbackrest: /etc/timescaledb/scripts/pgbackrest_restore.sh exited with code=1
timescaledb-1 timescaledb pg_basebackup: error: server closed the connection unexpectedly
timescaledb-1 timescaledb This probably means the server terminated abnormally
timescaledb-1 timescaledb before or while processing the request.
timescaledb-1 timescaledb 2021-05-11 11:21:06,651 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:21:06,652 WARNING: Trying again in 5 seconds
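For anyone triaging a longer capture of the same output, the repeating cycle in the log excerpt above (pgbackrest restore fails, pg_basebackup fails twice, bootstrap from the leader fails) can be made visible by counting distinct messages. The following is only a rough sketch: it assumes each timestamped line has the form "pod container date time LEVEL: message", as in the excerpt, and the names LINE_RE and summarize are made up for illustration. It is not part of the chart or of Patroni.

```python
import re
from collections import Counter

# A few lines copied from the log excerpt above (pod and container prefix kept).
SAMPLE = """\
timescaledb-1 timescaledb 2021-05-11 11:20:41,663 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:20:41,664 ERROR: failed to bootstrap from leader 'timescaledb-0'
timescaledb-1 timescaledb 2021-05-11 11:20:46,642 ERROR: Error creating replica using method pgbackrest: /etc/timescaledb/scripts/pgbackrest_restore.sh exited with code=1
timescaledb-1 timescaledb pg_basebackup: error: server closed the connection unexpectedly
timescaledb-1 timescaledb 2021-05-11 11:20:46,651 ERROR: Error when fetching backup: pg_basebackup exited with code=1
timescaledb-1 timescaledb 2021-05-11 11:20:46,651 WARNING: Trying again in 5 seconds
"""

# Assumed line shape: "<pod> <container> <date> <time>,<ms> <LEVEL>: <message>".
LINE_RE = re.compile(
    r"^(?P<pod>\S+) (?P<container>\S+) "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+): (?P<msg>.*)$"
)


def summarize(log_text):
    """Count occurrences of each (level, message) pair in timestamped lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            # Raw pg_basebackup output carries no timestamp; skip it here.
            continue
        counts[(match.group("level"), match.group("msg"))] += 1
    return counts


if __name__ == "__main__":
    for (level, msg), n in summarize(SAMPLE).most_common():
        print(f"{n:3d}  {level:7s} {msg}")
```

A summary like this only confirms that the replica is stuck in a retry loop. Since pg_basebackup reports that the server closed the connection, the logs of the leader pod (timescaledb-0) are probably the more useful place to look for the underlying cause.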

A brief overview of my current setup:

Kubectl:
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.0", GitCommit:"e19964183377d0ec2052d1f1fa930c4d7575bd50", GitTreeState:"clean", BuildDate:"2020-08-26T14:30:33Z", GoVersion:"go1.15", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"54684493f8139456e5d2f963b23cb5003c4d8055", GitTreeState:"clean", BuildDate:"2021-03-22T23:02:59Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}

Helm: version.BuildInfo{Version:"v3.5.2", GitCommit:"167aac70832d3a384f65f9745335e9fb40169dc2", GitTreeState:"dirty", GoVersion:"go1.15.7"}

Helmfile: helmfile version v0.138.4

And here is the relevant content from my helmfile:

repositories:
- name: timescale
  url: https://raw.githubusercontent.com/timescale/timescaledb-kubernetes/master/charts/repo/

releases:
- name: timescaledb-secrets
  namespace: {{ .Namespace | default "default_namespace" }}
  chart: incubator/raw
  values:
  - resources:
    - apiVersion: v1
      data:
        tls.crt: abcde....
        tls.key: abcde....
      kind: Secret
      metadata:
        labels:
          app: timescaledb-timescaledb
          cluster-name: timescaledb
        name: timescaledb-certificate
      type: kubernetes.io/tls
    - apiVersion: v1
      data:
        PATRONI_REPLICATION_PASSWORD: abcde...
        PATRONI_SUPERUSER_PASSWORD: abcde...
        PATRONI_admin_PASSWORD: abcde...
      kind: Secret
      metadata:
        labels:
          app: timescaledb-timescaledb
          cluster-name: timescaledb
        name: timescaledb-credentials
      type: Opaque
    - apiVersion: v1
      kind: Secret
      metadata:
        labels:
          app: timescaledb-timescaledb
          cluster-name: timescaledb
        name: timescaledb-pgbackrest
      type: Opaque

- name: timescaledb
  namespace: {{ .Namespace | default "default_namespace" }}
  chart: timescale/timescaledb-single
  version: 0.9.0
  force: false
  needs:
    - {{ printf "%s/%s" (.Namespace | default "default_namespace") "timescaledb-secrets" }}
  values:
    - persistentVolumes:
        data:
          size: 30Gi
      patroni:
        bootstrap:
          dcs:
            postgresql:
              parameters:
                ssl: 'off'

Please let me know if any configuration is still missing, or guide me through troubleshooting.

govindsinghnegi • May 11 '21 12:05

I also see this happen (sometimes) - I suspect there is some startup race condition that is not properly handled.

bleggett • Feb 28 '22 17:02

Hi, does anyone know a manual fix for this?

gautamvij94 • Nov 14 '22 20:11