metering-operator

Add emptydir to hive statefulsets

Open jaxxstorm opened this issue 6 years ago • 9 comments

We are currently using podsecuritypolicies which set the filesystem of containers to readonly. We solve a bunch of these problems by putting an emptydir on /tmp for the reporting-operator deployment.
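A sketch of the workaround described above for the reporting-operator Deployment: an emptyDir mounted at /tmp gives the container a writable scratch directory even when the PSP forces a read-only root filesystem. Field values other than the mount path are illustrative:

```yaml
# Deployment spec fragment (illustrative names): writable /tmp via emptyDir
spec:
  template:
    spec:
      containers:
      - name: reporting-operator
        volumeMounts:
        - name: tmp
          mountPath: /tmp
      volumes:
      - name: tmp
        emptyDir: {}
```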

However, currently this is hard to do with hive because of the usage of the helm chart operator.

We're getting this:


k logs hive-server-0 -n metering
--
Setting HADOOP_HEAPSIZE to 250M
/hadoop-config/core-site.xml doesnt exist, skipping symlink
/hadoop-config/hdfs-site.xml doesnt exist, skipping symlink
ln: failed to create symbolic link '/opt/hive-2.3.3/conf/hive-site.xml': Read-only file system

Is there any way to specify a volume on both of the hive pods to fix this issue, or do we have to make the filesystem writable?

jaxxstorm avatar Apr 03 '19 02:04 jaxxstorm

The config files have to live at $HIVE_HOME/conf (currently /opt/hive-2.3.3), and $HIVE_HOME has to be in the rootfs, not a tmpfs, since it contains the entire Hive installation.

The config files are files from a set of configmaps that are symlinked into the correct location at startup because we need to put a few config files from different configmaps into the same directory.

Mounting the configmaps directly at $HIVE_HOME/conf won't work, since two different configmaps/secrets cannot mount their files into the same directory. The alternative would be one configmap per file, each mounted with subPath, but that doesn't work well either, since some configmaps are shared between different pods.

So by the sounds of it, you're thinking an emptyDir at /opt/hive-2.3.3/conf/ could work, since that would be a writable volume? That should be doable, but I'd like to avoid hard-coding the version in the path, so I may need to update the image to make that possible.
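The idea above might look roughly like this: a writable emptyDir mounted at the Hive conf directory, so the startup script can still symlink the configmap-provided files into it. The versionless mount path is an assumption (per the note about not hard-coding the version), and the volume/container names are illustrative:

```yaml
# Pod spec fragment (illustrative names): writable conf dir for symlinks
containers:
- name: hive-server
  volumeMounts:
  - name: hive-conf
    mountPath: /opt/hive/conf   # assumed versionless path
  - name: hive-config           # configmap source, symlinked at startup
    mountPath: /hive-config
volumes:
- name: hive-conf
  emptyDir: {}
- name: hive-config
  configMap:
    name: hive-config
```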

chancez avatar Apr 03 '19 16:04 chancez

I started this in #666.

chancez avatar Apr 03 '19 16:04 chancez

Can you try an install with your PSP and set both spec.presto.hive.server.image.tag and spec.presto.hive.metastore.image.tag to pr-666 in your Metering CR?
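For reference, a hedged sketch of where those fields sit in the Metering CR; the apiVersion/kind and metadata here are assumptions, only the two spec paths come from the comment above:

```yaml
# Metering CR fragment (apiVersion/kind/name are assumptions)
apiVersion: metering.openshift.io/v1
kind: Metering
metadata:
  name: operator-metering
spec:
  presto:
    hive:
      server:
        image:
          tag: pr-666
      metastore:
        image:
          tag: pr-666
```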

chancez avatar Apr 03 '19 18:04 chancez

I'll give it a try, standby!

jaxxstorm avatar Apr 03 '19 20:04 jaxxstorm

I'll also give it a try and see if we can just default to readOnlyRootFilesystem in our pods' securityContexts.
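That default would be the standard container-level Kubernetes field; a minimal sketch (container name is illustrative):

```yaml
# Container spec fragment: default to a read-only root filesystem
containers:
- name: hive-server
  securityContext:
    readOnlyRootFilesystem: true
```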

chancez avatar Apr 03 '19 20:04 chancez

Looks like I also need emptyDirs at /etc/hadoop and /tmp, across all containers using those.
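A sketch of those additional writable mounts (volume names are illustrative):

```yaml
# Container/pod spec fragments: writable /etc/hadoop and /tmp
volumeMounts:
- name: hadoop-conf
  mountPath: /etc/hadoop
- name: tmp
  mountPath: /tmp
volumes:
- name: hadoop-conf
  emptyDir: {}
- name: tmp
  emptyDir: {}
```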

chancez avatar Apr 03 '19 21:04 chancez

Hmm, it seems that I'm running into the issue of /etc/passwd not being writable, which is truly hard to handle correctly, since on OpenShift we need to be able to write to it to handle dynamic UID assignments. I'm gonna have to think on this.

chancez avatar Apr 03 '19 21:04 chancez

I just want to thank you for your amazing responsiveness on this @chancez and generally your help with all issues on this project. It is truly appreciated

jaxxstorm avatar Apr 03 '19 22:04 jaxxstorm

@jaxxstorm I've thought of one way to potentially handle this, but it's quite involved, and does require changing all of our base images, so it will take some time before I have resources to come back to this.

The TL;DR is we'll probably update the base image to store a copy of /etc/passwd in, say, /passwd/passwd, and change /etc/passwd to be a symlink to /passwd/passwd.

Then, in Kube, we'll mount an emptyDir at /passwd to ensure we can write to files within /passwd, and the entrypoint will write the correct contents to /passwd/passwd. Since /etc/passwd will be a symlink, it should be correctly up to date after that.
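The symlink-follow assumption can be checked in isolation. This sketch simulates the scheme in a temp dir instead of the real /etc/passwd: a writable copy at passwd/passwd, a symlink at etc/passwd, and an appended entry standing in for the runtime-assigned UID (all paths and the UID here are illustrative):

```shell
#!/bin/sh
# Demo of the proposed /etc/passwd symlink scheme, in a temp dir.
set -eu
tmp="$(mktemp -d)"
mkdir "$tmp/passwd" "$tmp/etc"
# Image build time: store the passwd contents in a relocatable spot
# and make "etc/passwd" a symlink pointing at it.
printf 'root:x:0:0:root:/root:/bin/sh\n' > "$tmp/passwd/passwd"
ln -s "$tmp/passwd/passwd" "$tmp/etc/passwd"
# Container start: the entrypoint appends an entry for the dynamically
# assigned UID (simulated here with a fixed illustrative UID).
printf 'hive:x:1000650000:0:hive:/opt/hive:/sbin/nologin\n' >> "$tmp/passwd/passwd"
# Any reader that follows the symlink sees the updated file.
grep -c '^' "$tmp/etc/passwd"   # prints 2
rm -rf "$tmp"
```

If a consumer opens /etc/passwd without following symlinks, this scheme breaks, which is exactly the validation the comment above calls out.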

However, this all assumes whatever is looking at /etc/passwd follows symlinks, so it's something we need to validate.

chancez avatar Apr 04 '19 16:04 chancez