docker-zulip
postgres running as root on kubernetes/helm
Following the guide at https://github.com/zulip/docker-zulip/blob/main/kubernetes/chart/zulip/README.md, you have to run postgresql as root (runAsUser: 0), which fails under default policies (as seen in https://github.com/zulip/docker-zulip/issues/421#issuecomment-2289142657).
Is it really necessary to run it as root? Does pgroonga require this? ~~Is it possible to use vanilla postgres as a drop in replacement at the cost of full text search quality?~~
Edit:
Looks like tsearch-extras https://github.com/zulip/docker-zulip/issues/92 is needed at the very least.
tsearch-extras is no longer required; we support vanilla postgres, with dictionary files (so https://zulip.readthedocs.io/en/latest/production/postgresql.html#cloud-provider-managed-postgresql-e-g-amazon-rds applies).
@timabbott Thanks, but I'm still a bit confused. How can that work with kubernetes? The pod will crash with
django.db.utils.InternalError: could not open dictionary file "/opt/bitnami/postgresql/share/tsearch_data/en_us.dict": No such file or directory
before I could run the install with --postgresql-missing-dictionaries manually, if that's what you meant?
Or is it enough to just mount an en_us.dict under /opt/bitnami/postgresql/share/tsearch_data/ in a https://hub.docker.com/r/bitnami/postgresql image? Can't find anything related to that in the docs.
Why does zulip/zulip-postgresql need to run as user 0? This wouldn't be an issue otherwise.
(Also seeing no indication of that requirement in https://pgroonga.github.io/install/debian.html)
Is there a repo for zulip/zulip-postgresql that would make it possible to reproduce/have a look at the build? Otherwise, you'd have to piece it together from the history.
I tripped over this today too. I fought with it for a while, it's quite a mess.
Eventually I found that setting the helm value postgresql.primary.containerSecurityContext.enabled=false seems to get zulip-postgresql working. (Running as uid 0... but it is what it is.)
Looking at the history of this, #462 updated the postgresql chart version from 11.1.22 to 15.5.32, and a comment says it was tested. But it clearly didn't work here on a new installation. Maybe root access is only needed for the first-time initialization scripts?
Good to hear that it's possible to run, but I'm too cautious to just ignore postgres running as root without good reason.
The problem is less the chart, and more the image, I believe.
Root access for first time init could make sense, but I don't see it: https://hub.docker.com/layers/zulip/zulip-postgresql/latest/images/sha256-3b8c51f282e49f8e947337a6ae920c93d28d9c4b21ea945ee7ffaa9ee7876553?context=explore
Seems to be running just postgres.
If you want to run it normally, you have to comment out the runAsUser in both values.yaml and values-local.yaml (will be overwritten otherwise). Then you'll get an error about the user not existing, if I remember correctly. It's possible that postgres just runs on an atypical uid.
Edit: Must have done something wrong, when I tried it again, I got:
2024-11-09 12:32:33.582 UTC [1] LOG: starting PostgreSQL 14.10 on x86_64-pc-linux-musl, compiled by gcc (Alpine 13.2.1_git20231014) 13.2.1 20231014, 64-bit
2024-11-09 12:32:33.582 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-11-09 12:32:33.582 UTC [1] LOG: listening on IPv6 address "::", port 5432
2024-11-09 12:32:33.587 UTC [1] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5432.lock": Read-only file system
2024-11-09 12:32:33.589 UTC [1] LOG: database system is shut down
Maybe just a permission issue.
> I'm too cautious to just ignore postgres running as root without good reason.
Looks like postgres itself still runs as the postgres user; it's only the container entrypoint (and my exec commands) which run as root.
% kubectl -n zulip exec pod/zulip-postgresql-0 -- ps -a
PID USER TIME COMMAND
1 postgres 0:04 postgres
102 postgres 0:00 postgres: checkpointer
103 postgres 0:00 postgres: background writer
104 postgres 0:01 postgres: walwriter
105 postgres 0:00 postgres: autovacuum launcher
[snip]
68763 root 0:00 ps -a
That looks quite normal to me.
> If you want to run it normally, you have to comment out the runAsUser in both values.yaml and values-local.yaml (will be overwritten otherwise).
Well, I think that's because you commented it out; better to set it to 0 (root's uid). It has a value that way, so it won't get overridden. And you'll also need to change several other problematic flags, such as runAsGroup, runAsNonRoot, and readOnlyRootFilesystem, which also get in the way.
This is why setting postgresql.primary.containerSecurityContext.enabled=false seems to be a better approach. It causes the helm chart to omit the whole security context block, which better matches the image's expectations.
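A minimal sketch of that override as a values file, assuming the bitnami subchart is configured under the `postgresql` key as in the value path above. The file name is hypothetical.

```yaml
# values-override.yaml (hypothetical file name)
postgresql:
  primary:
    containerSecurityContext:
      # Omit the whole container security context block, so the
      # zulip-postgresql image starts with its own defaults.
      # Note: the container entrypoint then runs as uid 0.
      enabled: false
```

A file like this can be passed to helm with `-f`. The trade-off is the one already noted in this thread: the entrypoint runs as root, even though the postgres processes themselves run as the postgres user.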
> Edit: Must have done something wrong, when I tried it again, I got:
> 2024-11-09 12:32:33.587 UTC [1] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5432.lock": Read-only file system
> Maybe just a permission issue.
That happened because readOnlyRootFilesystem was still true.
> That looks quite normal to me.
Agreed. That seems to be fairly normal, but it still shouldn't be necessary.
> readOnlyRootFilesystem
Totally missed that. Seems like the bitnami image offloads even /var/run somewhere so that the root filesystem can stay immutable.
The postgres user on the zulip/zulip-postgresql image has uid 70. Changing that to 1001 during build might be an idea.
Setting it up like this got it working for me:
runAsUser: 70 # image local postgres user
runAsGroup: 70
readOnlyRootFilesystem: false # so that the lockfile can be created. Hope that all of the data is still in the volume mount. (Seems to be)
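For context, a sketch of where these keys would sit in a values override, assuming the same `postgresql.primary.containerSecurityContext` path as the helm value mentioned earlier in the thread. It has not been tested beyond the report above.

```yaml
postgresql:
  primary:
    containerSecurityContext:
      runAsUser: 70    # uid of the postgres user inside zulip/zulip-postgresql
      runAsGroup: 70
      # Allows postgres to create its lock file under /var/run/postgresql.
      readOnlyRootFilesystem: false
```

Unlike disabling the security context entirely, this keeps the container running as a non-root user.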
> The postgres user on the zulip/zulip-postgresql image has uid 70. Changing that to 1001 during build might be an idea.
Seems like there's quite a mismatch between the bitnami chart and this non-bitnami postgresql image. We both had to dive deep into the details to find a way to make it work at all...
@mikkeschiren do you have a recommendation for how to resolve this?
In the setup I have worked on, I did this as an override:
postgresql:
  primary:
    containerSecurityContext:
      readOnlyRootFilesystem: false
      runAsUser: 1001
And that has worked fine for us. But this needs proper handling and investigation. I will get back to it.
Just a note: I am looking into this, to see if I can add it to the helm chart as a default setting.
This needs to be fixed; I simply can't run containers in our clusters as root. It looks like there are some fixes here, but folks should be able to stand something up without having to search through issues.
Not to seem ungrateful, thanks for all the work you do.