Invalid data directory permissions after pod rescheduling

Open a13x5 opened this issue 5 years ago • 6 comments

Environment:

- Kubernetes 1.17.4
- rook-ceph as the persistent storage provider
- postgres-operator 1.4.0, installed with the Helm chart

The postgresql resource is declared as follows:

kind: postgresql
metadata:
  name: test-postgres
  namespace: test
spec:
  teamId: test
  volume:
    size: 50Gi
    storageClass: rook-ceph-block
  numberOfInstances: 2
  users:
    zalando:
    - superuser
    - createdb
    service: []
  databases:
    db_1: service
    db_2: service
  postgresql:
    version: "12"
    parameters:
      max_connections: "500"

Problem: After a pod with a PostgreSQL instance is rescheduled to another node, the PostgreSQL server fails to start with the following error:

```
2020-05-09 14:07:17 UTC [2507]: [2-1] 5eb6b915.9cb 0     DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
2020-05-09 14:07:17,085 INFO: postmaster pid=2507
/var/run/postgresql:5432 - no response
2020-05-09 14:07:17,116 WARNING: Postgresql is not running
```

`/home/postgres/pgdata/pgroot/data` has the following permissions:

```
# ls -alh pgdata/pgroot
total 16K
drwxrwsr-x  4 postgres postgres 4.0K Apr 29 11:34 .
drwxrwsrwx  4 root         1337 4.0K Apr 29 11:33 ..
drwxrws--- 19 postgres postgres 4.0K May  9 14:05 data
drwxrwsr-x  2 postgres postgres 4.0K May  5 20:49 pg_lo
```

Workaround: change the permissions of the data directory to 750 by hand.
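As a rough sketch of that manual workaround, the helper below checks a directory's mode and resets it to 750 only when it is not already one of the two modes named in the error message. The function name `fix_pgdata_mode` is hypothetical, `stat -c` assumes GNU coreutils inside the container, and the pod name in the commented `kubectl` line is an assumption based on the cluster name `test-postgres`, so adjust it to your setup.

```shell
# Hypothetical helper: set DIR to 0750 unless it is already 0700 or 0750.
# A leading special-bit digit (e.g. setgid, as in 2750) is tolerated.
fix_pgdata_mode() {
  dir="$1"
  mode=$(stat -c '%a' "$dir") || return 1   # octal mode, e.g. 2770
  case "$mode" in
    700|750|?700|?750)
      echo "ok: $dir already has mode $mode"
      ;;
    *)
      chmod 750 "$dir" || return 1
      echo "fixed: $dir had mode $mode"
      ;;
  esac
}

# Inside the affected pod, this would be called on the data directory:
#   fix_pgdata_mode /home/postgres/pgdata/pgroot/data
#
# The same one-off change from outside the pod (pod name is an assumption):
#   kubectl exec -n test test-postgres-0 -- chmod 750 /home/postgres/pgdata/pgroot/data
```

This only repairs the symptom on the current node. If the mode is changed again the next time the volume is mounted, the step has to be repeated.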

a13x5, May 12 '20 11:05

Sounds like the same issue as in #676, especially this comment

FxKu, May 12 '20 16:05

Looks the same, but we don't use GKE. We use a custom kubernetes installer on openstack.

a13x5, May 12 '20 16:05

I think we already have an issue for this in the Spilo repo.

FxKu, May 13 '20 09:05

Any news or plans regarding this issue? We started using the operator and are running into the same problem.

stoetti, Jun 08 '20 06:06

Also experiencing the same problem here

connorearl, Jul 07 '20 13:07

We are also occasionally experiencing this issue

Related #1850

Probably fixed by #2092

heilerich, Oct 07 '24 14:10