metering-operator
hive PV permissions
Hello!
I am having a permission issue in hive-metastore: the mounted directories have mode rwxrws--- (root.root) and the user is hadoop (1002).
The volumes are dynamically provided by rook/ceph.
I tried to change charts/openshift-metering/templates/hive/hive-metastore-statefulset.yaml as follows:
initContainers:
- name: chown-metastore
  image: alpine
  command:
  - chown
  - -R
  - 1002:1002
  - /var/lib/hive
  volumeMounts:
  - name: hive-metastore-db-data
    mountPath: /var/lib/hive
- name: chown-warehouse
  image: alpine
  command:
  - chown
  - -R
  - 1002:1002
  - {{ .Values.hive.spec.config.sharedVolume.mountPath }}
  volumeMounts:
  - name: hive-warehouse-data
    mountPath: {{ .Values.hive.spec.config.sharedVolume.mountPath }}
To be able to run the above init containers as root, I had to set securityContext.runAsNonRoot: false.
The real issue is that presto runs, mounting the same volume, with uid 1003. So how is that supposed to work? Do I need to change the configuration of the storage provider?
~~I noticed that I had fsGroup: 0 left over in the configuration, from the previous version which required it.
When I remove that, the owner of the directory is 1002 and presto still cannot use it.~~
Never mind, the fsGroup had no effect because the configuration format has changed. It helped me to actually take the time to understand what it does.
This seems to be the correct fix for the issue:
hive:
  spec:
    securityContext:
      fsGroup: 0
presto:
  spec:
    securityContext:
      fsGroup: 0
Can it be documented somewhere? Thanks
This works because fsGroup sets the group (GID) of the volumes to 0, which is the processes' GID. The init container approach works because you're running the initContainer as root, which we cannot do in OpenShift. OpenShift chowns the volumes by default (vanilla Kubernetes doesn't seem to do this, so running as non-root with volumes is problematic).
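For anyone who wants to check this behaviour outside the chart, here is a minimal sketch of a standalone Pod in which two containers run as UIDs 1002 and 1003 and share one volume through fsGroup: 0. It is not part of the metering chart. The names fsgroup-demo and fsgroup-demo-pvc, the busybox image, and the /data mount path are all hypothetical, and the claim assumes the cluster has a default StorageClass whose provisioner honours fsGroup.

```yaml
# Hypothetical claim; assumes a default StorageClass exists in the cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fsgroup-demo-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo
spec:
  securityContext:
    # Applied to supported volumes at mount time: they become group-owned
    # by GID 0, and every container gets GID 0 as a supplemental group.
    fsGroup: 0
  containers:
    - name: as-1002            # stands in for the hive user from the issue
      image: busybox
      command: ["sh", "-c", "id; touch /data/from-1002; ls -ln /data; sleep 3600"]
      securityContext:
        runAsUser: 1002
      volumeMounts:
        - name: shared
          mountPath: /data
    - name: as-1003            # stands in for the presto user from the issue
      image: busybox
      command: ["sh", "-c", "id; touch /data/from-1003; ls -ln /data; sleep 3600"]
      securityContext:
        runAsUser: 1003
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: fsgroup-demo-pvc
```

If the storage provider applies fsGroup, the logs of both containers should show that each UID can create files under /data. If they show permission errors instead, the provisioner is probably not applying fsGroup to that volume type.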
We've not documented this because we don't regularly test on non-OpenShift environments, so we have no way to ensure it remains correct and up to date. That said, this is something we plan to tackle as soon as we have our first GA release, which is coming soon. Once we have time to spend on making the non-OpenShift installation story better, this will be a lot easier, and we'll even have an official install method using OLM/OperatorHub for non-OpenShift environments.
I'll leave this open so we can remember to add this to the documentation when we start that.