noobaa-core
Path is not writable (/opt/app-root/src): DAS backup script is not able to create noobaa_db.backup at /opt/app-root/src in the pod
Environment info
- NooBaa Version:
[root@hpo-app11 ~]# noobaa version
INFO[0000] CLI version: 5.11.0
INFO[0000] noobaa-image: noobaa/noobaa-core:5.10.0-20220120
INFO[0000] operator-image: noobaa/noobaa-operator:5.11.0
[root@hpo-app11 ~]#
ODF Version:
[root@hpo-app11 ~]# oc get csv -n openshift-storage
NAME DISPLAY VERSION REPLACES PHASE
mcg-operator.v4.11.4 NooBaa Operator 4.11.4 mcg-operator.v4.11.3 Succeeded
ocs-operator.v4.11.4 OpenShift Container Storage 4.11.4 ocs-operator.v4.11.3 Succeeded
odf-csi-addons-operator.v4.11.4 CSI Addons 4.11.4 odf-csi-addons-operator.v4.11.3 Succeeded
odf-operator.v4.11.4 OpenShift Data Foundation 4.11.4 odf-operator.v4.11.3 Succeeded
[root@hpo-app11 ~]#
- Platform:
[root@hpo-app11 ~]# oc version
Client Version: 4.11.13
Kustomize Version: v4.5.4
Server Version: 4.11.13
Kubernetes Version: v1.24.6+5157800
[root@hpo-app11 ~]#
# backup noobaa db
BACKUP_DB_FILE=noobaa_db.backup
CMD="oc exec -n openshift-storage -it noobaa-db-pg-0 -- pg_dump nbcore -f $BACKUP_DB_FILE -F custom"
From the above command, the BACKUP_DB_FILE path resolves into /opt/app-root/src, which is not writable. If we use another writable path such as /tmp/noobaa_db.backup it works; without write access, the script is not able to create noobaa_db.backup at /opt/app-root/src in the pod.
Command execution with current path:
[root@hpo-app11 das-db-backup]# oc -c db exec -n openshift-storage -it noobaa-db-pg-0 -- pg_dump nbcore -f noobaa_db.backup -F custom
pg_dump: error: could not open output file "noobaa_db.backup": Permission denied
command terminated with exit code 1
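As an alternative sketch (an assumption, not the DAS script's current approach): pg_dump can write to stdout, so the dump can be redirected straight to a local file and no path inside the pod needs to be writable at all. The function name is hypothetical; note that `-t` is deliberately omitted, since a TTY would corrupt the binary custom-format stream.

```shell
# Hedged alternative sketch: stream the custom-format dump to stdout and
# redirect it to a local file, so no in-pod path needs to be writable.
# Do NOT pass -t here: a TTY mangles binary output.
dump_to_local() {
  oc -c db exec -n openshift-storage noobaa-db-pg-0 -- \
    pg_dump nbcore -F custom > "$1"
}
```

Usage would be `dump_to_local ./noobaa_db.backup` from a host with cluster access.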
Execution with new writable path:
[root@hpo-app11 das-db-backup]# oc -c db exec -n openshift-storage -it noobaa-db-pg-0 -- pg_dump nbcore -f /tmp/noobaa_db.backup -F custom
[root@hpo-app11 das-db-backup]# oc -c db exec -n openshift-storage -it noobaa-db-pg-0 -- ls -lZ /tmp
total 224
-rwx------. 1 root root system_u:object_r:container_file_t:s0:c111,c234 291 Nov 1 04:36 ks-script-3bicx5f2
-rwx------. 1 root root system_u:object_r:container_file_t:s0:c111,c234 701 Nov 1 04:36 ks-script-johuwdtx
-rw-r--r--. 1 10001 root system_u:object_r:container_file_t:s0:c111,c234 110524 Dec 13 11:16 noobaa_db.backup
-rw-r--r--. 1 10001 root system_u:object_r:container_file_t:s0:c111,c234 110247 Dec 13 11:00 test.db
[root@hpo-app11 das-db-backup]#
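Building on the working /tmp run above, a hypothetical two-step helper (dump to /tmp inside the pod, then copy the file out with `oc cp`); the pod, namespace, and container names follow the thread, but the helper itself is an assumption:

```shell
# Hypothetical helper: dump to /tmp inside the pod (writable, per the ls -lZ
# output above), then copy the backup file out of the pod to the local host.
backup_via_tmp() {
  local ns=openshift-storage pod=noobaa-db-pg-0 remote=/tmp/noobaa_db.backup
  oc -c db exec -n "$ns" "$pod" -- pg_dump nbcore -f "$remote" -F custom &&
    oc cp -c db "$ns/$pod:$remote" "${1:-./noobaa_db.backup}"
}
```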
Current path:
[root@hpo-app11 das-db-backup]# oc -c db exec -n openshift-storage -it noobaa-db-pg-0 -- pwd
/opt/app-root/src
As discussed on Slack, there was no related NooBaa change that could cause this; we suspect it was due to a DB image change in the downstream build. Our suggestion is to change the path, as @baum suggested, to /var/lib/pgsql, which is writable by the DB by design.
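If DAS adopts that suggestion, the only change to the script fragment quoted earlier would be the path; a sketch, assuming the same command structure:

```shell
# Sketch of the suggested fix: an absolute path under /var/lib/pgsql, which
# the DB user can write to by design, instead of the non-writable cwd.
BACKUP_DB_FILE=/var/lib/pgsql/noobaa_db.backup
CMD="oc exec -n openshift-storage -it noobaa-db-pg-0 -- pg_dump nbcore -f $BACKUP_DB_FILE -F custom"
```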
@liranmauda @dannyzaken Was this solved as part of ODF builds?
@nimrod-becker I think this should be solved in the DAS operator, and the path should be different. Talking to @romayalon, it seems that downstream is using a new Postgres image.
@nimrod-becker, as per our last interlock discussion, you were going to check the Postgres image in ODF 4.12 and the reason for this change.
If you could let us know within a day or two, we can then work on this change accordingly in our DAS code base.
It seems 4.12 no longer has these issues (with no code changes). Can you please verify with a new deployment? If this still occurs, we need to go with Liran's suggestion.
We will check once the system is deployed with the ODF downstream build. We asked someone in the team to check the same but didn't get a response.
The issue exists on ODF 4.12 as well. We verified it and were able to reproduce it on the ODF 4.12.0-rc.6 build, following the steps below:
[[email protected] backup-folder]# mkdir -p das/scripts
[[email protected] backup-folder]# oc cp ibm-spectrum-scale-das/$(oc -n ibm-spectrum-scale-das get pods -l app=das-endpoint -o=jsonpath='{.items[0].metadata.name}'):scripts/ /tmp/das/scripts
[[email protected] backup-folder]# chmod +x /tmp/das/scripts/*
[[email protected] backup-folder]# ls -ltr /tmp/das/scripts
total 12
-rwxr-xr-x 1 root root 8138 Jan 17 02:03 dasS3Restore.sh
-rwxr-xr-x 1 root root 3953 Jan 17 02:03 dasS3Backup.sh
[[email protected] backup-folder]# mkdir /tmp/das/backup
ERROR:
[[email protected] backup-folder]# /tmp/das/scripts/dasS3Backup.sh /tmp/das/backup
2023-01-17T02:04:38 ERROR: Failed to run pg_dump in the noobaa-db-pg-0 pod
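One defensive improvement the DAS script could make (an assumption, not current behavior) is to fail fast with an actionable message when the in-pod target directory is not writable, instead of surfacing only the generic pg_dump failure:

```shell
# Hypothetical pre-check: ask the pod whether the target directory is
# writable before attempting pg_dump, to produce an actionable error.
check_pod_dir_writable() {
  oc -c db exec -n openshift-storage noobaa-db-pg-0 -- \
    sh -c "test -w '$1' && echo writable || echo not-writable"
}
```

The script could then abort early with a clear message when the check prints `not-writable`.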
ODF Version:
[[email protected] backup-folder]# oc get csv -n openshift-storage
NAME DISPLAY VERSION REPLACES PHASE
mcg-operator.v4.12.0-152.stable NooBaa Operator 4.12.0-152.stable Succeeded
metallb-operator.4.12.0-202301042354 MetalLB Operator 4.12.0-202301042354 Succeeded
ocs-operator.v4.12.0-152.stable OpenShift Container Storage 4.12.0-152.stable Succeeded
odf-csi-addons-operator.v4.12.0-152.stable CSI Addons 4.12.0-152.stable Succeeded
odf-operator.v4.12.0-152.stable OpenShift Data Foundation 4.12.0-152.stable Succeeded
[[email protected] backup-folder]# oc get subscription -n openshift-storage
NAME PACKAGE SOURCE CHANNEL
mcg-operator-stable-4.12-ocs-catalogsource-openshift-marketplace mcg-operator ocs-catalogsource stable-4.12
ocs-operator-stable-4.12-ocs-catalogsource-openshift-marketplace ocs-operator ocs-catalogsource stable-4.12
odf-csi-addons-operator-stable-4.12-ocs-catalogsource-openshift-marketplace odf-csi-addons-operator ocs-catalogsource stable-4.12
odf-operator odf-operator ocs-catalogsource stable-4.12
OCP Version:
[[email protected] backup-folder]# oc version
Client Version: 4.11.9
Kustomize Version: v4.5.4
Server Version: 4.12.0-rc.6
Kubernetes Version: v1.25.4+77bec7a
Images:
- name: ROOK_CEPH_IMAGE
value: quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:60f1ae2a2a28802fceca9a75252cec045755ca2fc0679f9693b185188561d86e
- name: CEPH_IMAGE
value: quay.io/rhceph-dev/rhceph@sha256:c6fe7e71ad1b13281d1d2399ceb98d3d6927df40e5d442a15fa0dee2976ccbcf
- name: NOOBAA_CORE_IMAGE
value: quay.io/rhceph-dev/odf4-mcg-core-rhel8@sha256:b495b59219d78ab468d1e1faedacfda59cb4b9fe13b253157897ff6899811de5
- name: NOOBAA_DB_IMAGE
value: quay.io/rhceph-dev/rhel8-postgresql-12@sha256:f4d8f5f165da493568802b4115f5e68af7cc11a3f14769e495de4a3f61a58238
- name: PROVIDER_API_SERVER_IMAGE
value: quay.io/rhceph-dev/odf4-ocs-rhel8-operator@sha256:c4e3463ccb0cf38f7feb71b1cfcd55de006e598d4b8fa3c9eb9175c8083fe0ce
- name: OPERATOR_CONDITION_NAME
value: ocs-operator.v4.12.0-152.stable
image: quay.io/rhceph-dev/odf4-ocs-rhel8-operator@sha256:c4e3463ccb0cf38f7feb71b1cfcd55de006e598d4b8fa3c9eb9175c8083fe0ce
imagePullPolicy: Always
Something differs in the way this deployment is done, since we don't see it at all during 4.12 runs.
@nimrod-becker, the downstream build from Quay.io is being used to test this (the procedure is similar to what was followed earlier). Any other suggestions for this? Are the Postgres images the same across 4.12 in your env and ours?
If it's the same build, it's the same images. In any case, a suggestion was already made two weeks ago and it should solve the issue; it's also not a big change. Liran's Comment
@nimrod-becker, we will change this in the DAS code; however, what did you mean by the deployment being different? I am not sure what/how it was tried in your ODF env.
@nimrod-becker, update: this was tried at the ODF 4.12 GA code level with the Postgres image below:
oc get csv -n openshift-storage -o yaml | grep -i full
    full_version: 4.12.0-173
    full_version: 4.12.0-173
    full_version: 4.12.0-173
Image: registry.redhat.io/rhel8/postgresql-12@sha256:3d805540d777b09b4da6df99e7cddf9598d5ece4af9f6851721a9961df40f5a1
We need to change our scripts to ensure that the backup is created. With Liran's proposed change, we will address it in our upcoming release.
This issue had no activity for too long - it will now be labeled stale. Update it to prevent it from getting closed.
This issue is stale and had no activity for too long - it will now be closed.