
Resize attempted when migrating volumes to a new cluster

Open ccpatrut opened this issue 3 years ago • 3 comments

Dear community,

Overview

Reproducing

I recently tried to migrate from Postgres Operator 4.7.5 on OpenShift 4.7 to Postgres Operator v5.1.2-0.

During the migration, as described in the guide, I defined the volumes similar to this:

  dataSource:
    volumes:
      pgDataVolume:
        pvcName: test-mig
        directory: test-mig
      pgWALVolume:
        pvcName: test-mig-wal
      pgBackRestVolume:
        pvcName: test-mig-pgbr-repo
        directory: test-mig-backrest-shared-repo

The three volumes use two different storage classes:

  • pgDataVolume: san-storage (hence not resizable)
  • pgWALVolume: san-storage (hence not resizable)
  • pgBackRestVolume: nfs

Error

Because the san-storage class is not resizable, I receive the following error:

time="2022-07-13T14:04:47Z" level=error msg="Reconciler error" error="persistentvolumeclaims \"test-mig-wal\" is forbidden: only dynamically provisioned pvc can be resized and the storageclass that provisions the pvc must support resize" file="internal/controller/postgrescluster/volumes.go:275" func="postgrescluster.(*Reconciler).configureExistingPGWALVolume" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T14:06:30Z" level=debug msg="reconciled cluster" file="internal/controller/postgrescluster/controller.go:313" func="postgrescluster.(*Reconciler).Reconcile" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0

I tracked this to line 275 of https://github.com/CrunchyData/postgres-operator/blob/master/internal/controller/postgrescluster/volumes.go.

Is there any reason why a resize is attempted at this line? Why not just reuse the existing volume as previously defined? I understand that the volume should not be smaller than the existing one, and I would rather check whether that is the case and throw an error accordingly. However, attempting a resize before the data is even migrated seems cumbersome to me.
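To make the suggestion concrete, here is a minimal Python sketch of a compare-then-error check. It is only an illustration of the idea, not how the operator's volumes.go behaves; the names `check_existing_volume`, `VolumeSizeError`, and the `allow_expansion` flag are hypothetical, and sizes are assumed to be already converted to bytes.

```python
class VolumeSizeError(ValueError):
    """Raised when an existing PVC cannot satisfy the requested size."""


def check_existing_volume(existing_bytes: int, requested_bytes: int,
                          allow_expansion: bool) -> str:
    """Decide what to do with an existing PVC before touching it.

    Hypothetical helper: returns "reuse" or "resize", or raises an error
    with an explicit message instead of attempting a resize blindly.
    """
    if requested_bytes == existing_bytes:
        # Sizes match exactly, so the existing volume can be used as-is.
        return "reuse"
    if requested_bytes < existing_bytes:
        # Kubernetes does not support shrinking a PVC.
        raise VolumeSizeError(
            f"requested {requested_bytes} bytes is smaller than the "
            f"existing {existing_bytes} bytes")
    if not allow_expansion:
        # A larger request needs a storage class that supports expansion.
        raise VolumeSizeError(
            f"requested {requested_bytes} bytes exceeds the existing "
            f"{existing_bytes} bytes and the storage class is not resizable")
    return "resize"


if __name__ == "__main__":
    print(check_existing_volume(2**30, 2**30, allow_expansion=False))
    try:
        check_existing_volume(10**9, 2**30, allow_expansion=False)
    except VolumeSizeError as err:
        print(f"error: {err}")
```

With a check like this, a size mismatch on a non-resizable storage class would surface as a message naming both sizes rather than as a generic "forbidden" error from the API server.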


Environment


  • Platform: OpenShift
  • Platform Version: 4.7.0
  • PGO Image Tag: ubi8-5.1.2-0
  • Postgres Version: 13
  • Storage: san-storage, nfs

Steps to Reproduce

REPRO

Already provided

EXPECTED

  1. Data migration works

ACTUAL

  1. Data Migration doesn't work due to resizing

Logs


`time="2022-07-13T13:33:27Z" level=error msg="Reconciler error" error="persistentvolumeclaims \"test-mig-wal\" is forbidden: only dynamically provisioned pvc can be resized and the storageclass that provisions the pvc must support resize" file="internal/controller/postgrescluster/volumes.go:275" func="postgrescluster.(*Reconciler).configureExistingPGWALVolume" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T13:49:49Z" level=debug msg="reconciled cluster" file="internal/controller/postgrescluster/controller.go:313" func="postgrescluster.(*Reconciler).Reconcile" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T13:49:49Z" level=error msg="Reconciler error" error="persistentvolumeclaims \"test-mig-wal\" is forbidden: only dynamically provisioned pvc can be resized and the storageclass that provisions the pvc must support resize" file="internal/controller/postgrescluster/volumes.go:275" func="postgrescluster.(*Reconciler).configureExistingPGWALVolume" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T14:04:39Z" level=debug msg="reconciled cluster" file="internal/controller/postgrescluster/controller.go:313" func="postgrescluster.(*Reconciler).Reconcile" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T14:04:39Z" level=error msg="Reconciler error" error="persistentvolumeclaims \"test-mig-wal\" is forbidden: only dynamically provisioned pvc can be resized and the storageclass that provisions the pvc must support resize" file="internal/controller/postgrescluster/volumes.go:275" func="postgrescluster.(*Reconciler).configureExistingPGWALVolume" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T14:04:47Z" level=debug msg="reconciled cluster" file="internal/controller/postgrescluster/controller.go:313" func="postgrescluster.(*Reconciler).Reconcile" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T14:04:47Z" level=error msg="Reconciler error" error="persistentvolumeclaims \"test-mig-wal\" is forbidden: only dynamically provisioned pvc can be resized and the storageclass that provisions the pvc must support resize" file="internal/controller/postgrescluster/volumes.go:275" func="postgrescluster.(*Reconciler).configureExistingPGWALVolume" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T14:06:30Z" level=debug msg="reconciled cluster" file="internal/controller/postgrescluster/controller.go:313" func="postgrescluster.(*Reconciler).Reconcile" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T14:06:30Z" level=error msg="Reconciler error" error="persistentvolumeclaims \"test-mig-wal\" is forbidden: only dynamically provisioned pvc can be resized and the storageclass that provisions the pvc must support resize" file="internal/controller/postgrescluster/volumes.go:275" func="postgrescluster.(*Reconciler).configureExistingPGWALVolume" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.2-0
time="2022-07-13T14:18:29Z" level=debug msg=deleting file="internal/controller/postgrescluster/controller.go:139" func="postgrescluster.(*Reconciler).Reconcile" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster result="{Requeue:false RequeueAfter:0s}" version=5.1.2-0
time="2022-07-13T14:20:23Z" level=debug msg=deleting file="internal/controller/postgrescluster/controller.go:139" func="postgrescluster.(*Reconciler).Reconcile" name=hippo namespace=cpapi002d reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster result="{Requeue:false RequeueAfter:0s}" version=5.1.2-0`

ccpatrut avatar Jul 13 '22 15:07 ccpatrut

It looks like the volumes (i.e. PVCs) you have defined in your PostgresCluster spec (e.g. dataVolumeClaimSpec, walVolumeClaimSpec, and volumeClaimSpec for the pgBackRest volume) do not match what is in your PGO v4 PVCs.

More specifically, if you look at the storage requested in the existing PGO v4 PVCs, does that match what is defined in the PGO v5 spec?

My thinking here is that the volume requests in the PostgresCluster spec are asking for more storage than was configured for the PGO v4 PVCs.

andrewlecuyer avatar Jul 13 '22 17:07 andrewlecuyer

Hi @andrewlecuyer, thanks for your help. You're right, I got it wrong: I requested 1G instead of 1Gi for my WAL volume 🤦‍♂️

Thanks for pointing it out.

Catalin

ccpatrut avatar Jul 14 '22 15:07 ccpatrut

Hi @andrewlecuyer

I just tried again. It is true that the two requests didn't match, and when they match everything works perfectly. However, I still have problems migrating the WAL volume when it is defined as 1G instead of 1Gi, for instance:

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
spec:

  openshift: true
  imagePullPolicy: Always
  image: registry.connect.redhat.com/crunchydata/crunchy-postgres:ubi8-13.6-1
  postgresVersion: 13
  replicas: 2
  instances:
    - name: database
      resources: 
        limits: 
          cpu: 300m
          memory: 400Mi
        requests:
          cpu: 100m
          memory: 100Mi
      dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        storageClassName: san-storage
        resources:
          requests:
            storage: 1Gi
      walVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        storageClassName: san-storage
        resources:
          requests:
            storage: 1G
  dataSource:
    volumes:
      pgDataVolume:
        pvcName: test-mig-uoxg
        directory: test-mig-uoxg
      pgWALVolume:
        pvcName: test-mig-uoxg-wal
      pgBackRestVolume:
        pvcName: test-mig-pgbr-repo
        directory: test-mig-backrest-shared-repo

While there is a workaround to move to 1Gi and upgrade from one instance to the other, shouldn't 1G also be supported?

ccpatrut avatar Jul 26 '22 07:07 ccpatrut

1G and 1Gi are actually different sizes: the former being 1 Gigabyte (10^9 or 1,000,000,000 bytes) and the latter being 1 Gibibyte (2^30 or 1,073,741,824 bytes). While they are similar, both in number and how the unit is specified, they are different sizes, so if you are using a storage class that doesn't allow for resizing, you won't be able to go from one to the other.
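The arithmetic can be checked with a short Python sketch. The `to_bytes` helper below is a hypothetical illustration that handles only whole-number quantities with a handful of the decimal and binary suffixes; it is not the Kubernetes quantity parser.

```python
import re

# Decimal (power-of-ten) and binary (power-of-two) suffixes.
# This is a deliberately small subset, for illustration only.
_SUFFIXES = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def to_bytes(quantity: str) -> int:
    """Convert a quantity string such as "1G" or "1Gi" to bytes."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", quantity.strip())
    if match is None or match.group(2) not in _SUFFIXES:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(match.group(1)) * _SUFFIXES[match.group(2)]


if __name__ == "__main__":
    print(to_bytes("1G"))                    # 1000000000
    print(to_bytes("1Gi"))                   # 1073741824
    print(to_bytes("1Gi") - to_bytes("1G"))  # 73741824
```

The two requests differ by 73,741,824 bytes (roughly 7%), which is enough for them to count as different sizes when the PVC request is compared.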

dsessler7 avatar Apr 21 '23 22:04 dsessler7