postgres-operator
Invalid data directory permissions after pod rescheduling
Environment:
- Kubernetes 1.17.4
- rook-ceph as the persistent storage provider
- postgres-operator 1.4.0, installed with the Helm chart

The postgresql resource is declared as follows:
```yaml
kind: postgresql
metadata:
  name: test-postgres
  namespace: test
spec:
  teamId: test
  volume:
    size: 50Gi
    storageClass: rook-ceph-block
  numberOfInstances: 2
  users:
    zalando:
    - superuser
    - createdb
    service: []
  databases:
    db_1: service
    db_2: service
  postgresql:
    version: "12"
    parameters:
      max_connections: "500"
```
Problem: After the pod running a PostgreSQL instance was rescheduled to another node, the PostgreSQL server fails to start with the following error:
```
2020-05-09 14:07:17 UTC [2507]: [2-1] 5eb6b915.9cb 0 DETAIL: Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
2020-05-09 14:07:17,085 INFO: postmaster pid=2507
/var/run/postgresql:5432 - no response
2020-05-09 14:07:17,116 WARNING: Postgresql is not running
```
`/home/postgres/pgdata/pgroot/data` has the following permissions:
```
# ls -alh pgdata/pgroot
total 16K
drwxrwsr-x  4 postgres postgres 4.0K Apr 29 11:34 .
drwxrwsrwx  4 root     1337     4.0K Apr 29 11:33 ..
drwxrws--- 19 postgres postgres 4.0K May  9 14:05 data
drwxrwsr-x  2 postgres postgres 4.0K May  5 20:49 pg_lo
```
Workaround: manually change the permissions of the data directory to 750.
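For anyone applying the workaround by hand, here is a minimal sketch of a check-and-fix script. It is not part of postgres-operator or Spilo: the function names `pgdata_perms_ok` and `fix_pgdata_perms` are made up for illustration, and it assumes GNU coreutils `stat` is available in the container. The check only mirrors the log message above: it rejects group-write and any "other" permission bits.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with postgres-operator or Spilo.
# Assumes GNU coreutils `stat` (the -c option) is available.

# Succeeds if DIR has no group-write bit and no "other" bits set,
# i.e. its mode is compatible with 0700 or 0750 as the log demands.
pgdata_perms_ok() {
    mode=$(stat -c '%a' "$1") || return 2
    # Octal mask 027 = group write (020) + all "other" bits (007).
    [ $(( 0$mode & 027 )) -eq 0 ]
}

# Tightens DIR to 750 only when the check above fails.
fix_pgdata_perms() {
    if pgdata_perms_ok "$1"; then
        echo "$1: permissions already acceptable"
    else
        echo "$1: changing permissions to 750"
        chmod 750 "$1"
    fi
}

# Run against a directory only when one is given as an argument.
if [ "$#" -gt 0 ]; then
    fix_pgdata_perms "$1"
fi
```

Inside the affected pod this would be invoked with the data directory from the listing above, e.g. `sh fix_perms.sh /home/postgres/pgdata/pgroot/data` (the script file name is an assumption). This only repairs the symptom; the permissions may come back wrong after the next rescheduling.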
Sounds like the same issue as in #676, especially this comment
Looks the same, but we don't use GKE. We use a custom Kubernetes installer on OpenStack.
I think we already have an issue for this in the Spilo repo.
Any news or plans regarding this issue? We started using the operator and are running into the same problem.
Also experiencing the same problem here
We are also occasionally experiencing this issue
Related #1850
Probably fixed by #2092