v2.18 runs into file system errors in Kubernetes
Describe the Bug
First of all, thanks a lot for the new version - data-before-send looks like it is exactly what we need and saves us from implementing a custom tracker wrapper.
Unfortunately, I haven't been able to deploy the image (docker.umami.is/umami-software/umami:postgresql-v2.18.0) into our Kubernetes cluster yet. I get the file-system-permission-related error below in the migration step, more precisely when running applyMigration(). If I skip the migration step, it fails a bit later when writing the manifest (log output also below).
I can't say for sure that the issue isn't on our side, but everything is working fine with v2.17.0. Unfortunately, I haven't been able to reproduce the issue locally (on Mac, if that's relevant); running the image via docker-compose works. Sorry I can't produce a minimal reproducible example.
Database
PostgreSQL
Relevant log output
> [email protected] start-docker /app
> npm-run-all check-db update-tracker set-routes-manifest start-server
> [email protected] check-db /app
> node scripts/check-db.js
✓ DATABASE_URL is defined.
✓ Database connection successful.
✓ Database version check successful.
Error: Can't write to /app/node_modules/.pnpm/@[email protected]/node_modules/@prisma/engines please make sure you install "prisma" with the right permissions.
✗ Command failed: prisma migrate deploy
Error: Can't write to /app/node_modules/.pnpm/@[email protected]/node_modules/@prisma/engines please make sure you install "prisma" with the right permissions.
ELIFECYCLE Command failed with exit code 1.
ERROR: "check-db" exited with 1.
ELIFECYCLE Command failed with exit code 1.
---
> [email protected] start-docker /app
> npm-run-all check-db update-tracker set-routes-manifest start-server
> [email protected] check-db /app
> node scripts/check-db.js
✓ DATABASE_URL is defined.
✓ Database connection successful.
✓ Database version check successful.
> [email protected] update-tracker /app
> node scripts/update-tracker.js
> [email protected] set-routes-manifest /app
> node scripts/set-routes-manifest.js
Using original Next.js routes manifest
node:fs:2426
return binding.writeFileUtf8(
^
Error: EACCES: permission denied, open '/app/.next/routes-manifest.json'
at Object.writeFileSync (node:fs:2426:20)
at Object.<anonymous> (/app/scripts/set-routes-manifest.js:74:4)
at Module._compile (node:internal/modules/cjs/loader:1730:14)
at Object..js (node:internal/modules/cjs/loader:1895:10)
at Module.load (node:internal/modules/cjs/loader:1465:32)
at Function._load (node:internal/modules/cjs/loader:1282:12)
at TracingChannel.traceSync (node:diagnostics_channel:322:14)
at wrapModuleLoad (node:internal/modules/cjs/loader:235:24)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:170:5)
at node:internal/main/run_main_module:36:49 {
errno: -13,
code: 'EACCES',
syscall: 'open',
path: '/app/.next/routes-manifest.json'
}
Node.js v22.15.0
ELIFECYCLE Command failed with exit code 1.
ERROR: "set-routes-manifest" exited with 1.
ELIFECYCLE Command failed with exit code 1.
Which Umami version are you using? (if relevant)
v2.18.0
Which browser are you using? (if relevant)
irrelevant
How are you deploying your application? (if relevant)
Kubernetes via helm chart
Same issue with docker-compose deployments.
Can't reproduce the error on a new docker-compose deployment. @al-lac I'm assuming yours is also an upgrade from 2.17 to 2.18?
@franciscao633, happens with both fresh installs and updates for me.
This is the docker-compose.yml I am using (via Umbrel): https://github.com/getumbrel/umbrel-apps/blob/35fded41f7d1b165054d0994364b06186d026bc1/umami/docker-compose.yml
@al-lac Does Umbrel need to run as user 1000:1000? If you remove that line from the Umbrel compose file, it should run fine. Alternatively, you need to give that user write permissions on the Prisma directory. In the Dockerfile we have the following:
# Permissions for prisma
RUN chown -R nextjs:nodejs node_modules/.pnpm/
Using Docker (upgrading from 2.17), I am getting a failed start. I had to manually revert to v2.17 and then restore the DB from a snapshot, because even after I reverted the image line the container wouldn't come up until I changed the DB back to its state before I tried to pull the upgrade.
@franciscao633 It does not need to run as user 1000:1000, but it had been running fine this way since version 2.12.1.
And indeed, the app is working after I removed the user line.
However, this seems like a step back and a bug that should be fixed.
Upgrading from v2.17. Not sure if it's a related issue.
[email protected] set-routes-manifest /app
node scripts/set-routes-manifest.js
err /app/scripts/set-routes-manifest.js:21
const routeRegex = new RegExp(apiRoute.regex);
^
err TypeError: Cannot read properties of undefined (reading 'regex')
at Object.<anonymous> (/app/scripts/set-routes-manifest.js:21:42)
at Module._compile (node:internal/modules/cjs/loader:1730:14)
at Object..js (node:internal/modules/cjs/loader:1895:10)
at Module.load (node:internal/modules/cjs/loader:1465:32)
at Function._load (node:internal/modules/cjs/loader:1282:12)
at TracingChannel.traceSync (node:diagnostics_channel:322:14)
at wrapModuleLoad (node:internal/modules/cjs/loader:235:24)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:170:5)
at node:internal/main/run_main_module:36:49
Node.js v22.15.0
ELIFECYCLE Command failed with exit code 1.
I have a custom BASE_PATH, COLLECT_API_ENDPOINT and TRACKER_SCRIPT_NAME setup.
@desw0lf Could you please create another issue, sharing more details about your setup? It seems like you're building your own image? (The issue is probably caused by BASE_PATH, I'll have a look at it.)
Edit: confirmed, users who build their own images with BASE_PATH will have start issues, I'll submit a fix.
People here have permissions issues, and they're most likely not new, by the way: there's a high probability that their setups would already have failed in previous versions if run with COLLECT_API_ENDPOINT (update-tracker would have failed too).
this is my docker compose and it fails when i try to upgrade to 2.18. its def not a permissions issue:
services:
  umami:
    image: ghcr.io/umami-software/umami:postgresql-v2.17
    #image: ghcr.io/umami-software/umami:postgresql-latest
    container_name: umami
    ports:
      - 3002:3000
    environment:
      DATABASE_URL: postgresql://umami:umami@db:5432/umami
      DATABASE_TYPE: postgresql
      APP_SECRET: replace-me-with-a-random-string
    depends_on:
      db:
        condition: service_healthy
    restart: unless-stopped
    healthcheck:
      test:
        - CMD-SHELL
        - curl http://localhost:3000/api/heartbeat
      interval: 5s
      timeout: 5s
      retries: 5
  db:
    image: postgres:15-alpine
    container_name: umami-db
    environment:
      POSTGRES_DB: umami
      POSTGRES_USER: umami
      POSTGRES_PASSWORD: umami
    volumes:
      - /mnt/bigdeal/configs/umami/umami-db-data:/var/lib/postgresql/data
    restart: unless-stopped
    healthcheck:
      test:
        - CMD-SHELL
        - pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}
      interval: 5s
      timeout: 5s
      retries: 5
@serversathome, this issue is about permissions errors, you might have another issue then, do you have any logs (docker compose logs)? They would be useful.
@Maxime-J https://pastebin.com/E3NdtGLk
@serversathome Thanks, it definitely is another issue. It looks like there was a problem with the 09_update_hostname_region migration, but the logs somehow don't show all the details. There's an open issue mentioning that migration (#3399), you can have a look at it, the team will probably help you :)
@thiagoalmeidasa I had a look at your revert commit; your security context might be causing your problems:
securityContext:
  [...]
  readOnlyRootFilesystem: true
The Umami container can't have a read-only filesystem.
It was already the case with previous versions if you had COLLECT_API_ENDPOINT (for the update-tracker script).
But now, there's always a need for write operations with the pnpm switch and the routes manifest customization.
Maybe you have the same setting @AlexBartlAA?
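For anyone checking their own manifests against the point above, here is a minimal sketch of the container-level setting. It is only a fragment of a pod spec; the container name is hypothetical and the image tag is the one used elsewhere in this thread. How a given helm chart exposes this value depends on the chart.

```yaml
# Fragment of a pod spec (not a complete manifest).
containers:
  - name: umami   # hypothetical container name
    image: ghcr.io/umami-software/umami:postgresql-latest
    securityContext:
      # Umami writes to /app at startup (tracker script, routes manifest,
      # Prisma engine), so the root filesystem must stay writable.
      readOnlyRootFilesystem: false
```

Since false is the Kubernetes default for readOnlyRootFilesystem, removing the key entirely should have the same effect.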
I'm also having this problem in the upgrade to 2.18
> [email protected] check-db /app
> node scripts/check-db.js
✓ DATABASE_URL is defined.
✓ Database connection successful.
✓ Database version check successful.
Error: Can't write to /app/node_modules/.pnpm/@[email protected]/node_modules/@prisma/engines please make sure you install "prisma" with the right permissions.
✗ Command failed: prisma migrate deploy
Error: Can't write to /app/node_modules/.pnpm/@[email protected]/node_modules/@prisma/engines please make sure you install "prisma" with the right permissions.
ELIFECYCLE Command failed with exit code 1.
ERROR: "check-db" exited with 1.
ELIFECYCLE Command failed with exit code 1.
Here's my docker compose config, which has been working for over a year:
umami_postgres:
  image: postgres:15-alpine
  restart: always
  container_name: umami_postgres
  volumes:
    - ${PRIMARY_MOUNT}/postgres_15:/var/lib/postgresql/data
  environment:
    POSTGRES_DB: umami
    POSTGRES_USER: umami
    POSTGRES_PASSWORD: umami
  networks:
    - umami-private-net
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}"]
    interval: 5s
    timeout: 5s
    retries: 5
umami:
  image: ghcr.io/umami-software/umami:postgresql-latest
  restart: always
  container_name: umami
  user: 1000
  depends_on:
    umami_postgres:
      condition: service_healthy
  environment:
    DATABASE_URL: postgresql://umami:umami@umami_postgres:5432/umami
    DATABASE_TYPE: postgresql
    APP_SECRET: ${DNS_DOMAIN_ZONE_ID}
  labels:
    - traefik.enable=true
    - traefik.docker.network=umami-net
    - traefik.http.services.umami-svc.loadbalancer.server.port=3000
    - traefik.http.routers.umami-rtr.rule=Host(`umami.${DNS_DOMAIN}`)
    - traefik.http.routers.umami-rtr.entrypoints=websecure
    - traefik.http.routers.umami-rtr.tls=true
  networks:
    - umami-net
    - umami-private-net
  healthcheck:
    test: ["CMD-SHELL", "curl http://localhost:3000/api/heartbeat"]
    interval: 5s
    timeout: 5s
    retries: 5
Solution
https://github.com/umami-software/umami/commit/340cdce1dcc0504916d841d21d9585f9c8939331
Those chowns are the problem. If I remove my user line, the server starts up. Kubernetes is likely changing the execution user as well. It would be nice if you could run as whatever user you wanted, and I don't really understand why the application wants to write into node_modules/.pnpm after this update but I don't know PNPM very well so it's probably more related to PNPM than prisma.
Anyway, make sure you're running as 1001 and this should go away. I think this is a bug in the umami docker image, but that's the workaround.
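As a sketch of that workaround for docker-compose users: only the user line matters here. The service name and image tag are copied from the config earlier in this comment, and the rest of the service definition is assumed to stay unchanged.

```yaml
services:
  umami:
    image: ghcr.io/umami-software/umami:postgresql-latest
    # Run as UID:GID 1001:1001 instead of overriding it with another ID.
    # Removing the "user" key altogether also made the server start here.
    user: "1001:1001"
```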
Thanks for your input! I've tested some more based on the discussion here. From what I can tell, v2.18 has to run as root; any non-root user ID will fail. Running the container as root isn't allowed in our cluster, and for good reason.
So I guess the minimal reproducible example is to use the docker-compose from the umami repo and just add user: "1000:1000" line to the umami service to run as a non-root user.
Should I rename this ticket to v2.18 docker image only runs as root, now that we know that this is a better summary of the issue?
It's not exactly that. To summarize:
- The Umami Dockerfile ensures that the container runs as a non-root user (the nextjs user, 1001:1001). I'm not the author, but it's a standard and good practice. It works as expected if you don't override the user in your compose or Kubernetes config file.
- If a custom config was apparently working before, it wasn't working completely, in the sense that it would already have failed if COLLECT_API_ENDPOINT was set.
- The Umami container can't be read-only. A Kubernetes config with readOnlyRootFilesystem: true in securityContext will fail.
As to why write operations are needed:
- For any COLLECT_API_ENDPOINT and TRACKER_SCRIPT_NAME customizations, the tracker script file might need to be updated, and the corresponding Next.js manifest file is written (either the original or an updated one).
- I didn't have a deep look into it, but it looks like the engine used by the ORM is no longer included since the pnpm switch, so it has to be downloaded.
The engine download and the routes manifest write are the new things in 2.18 that bring up permissions issues in custom configs.
Ah, sorry, then I missummarized. So the dockerfile comes with a hardcoded 1001 user ID and whenever the user ID is set externally, it needs to match that. I can confirm that with 1001:1001 I don't run into the same issue.
Thanks for your support.
Thanks for the insights @Maxime-J!
Also works for us now by setting the user to 1001.
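For Kubernetes, a minimal sketch of what "setting the user to 1001" could look like in a pod template. This is a fragment, not a complete manifest; the container name is hypothetical, and a helm chart may expose these fields under different value names.

```yaml
# Fragment of a Deployment's pod template spec.
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1001    # UID of the image's nextjs user, per this thread
    runAsGroup: 1001   # matching GID, per this thread
  containers:
    - name: umami      # hypothetical container name
      image: ghcr.io/umami-software/umami:postgresql-latest
```

This keeps the container non-root while matching the owner of the files the startup scripts need to write.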
@subdavis , regarding your error:
Error: Can't write to /app/node_modules/.pnpm/@[email protected]/node_modules/@prisma/engines please make sure you install "prisma" with the right permissions.
I got the same and created a separate issue for it (https://github.com/umami-software/umami/issues/3422). For some Kubernetes distros, it is not possible or wanted to run as a specific user (e.g. OpenShift), so for them it is not possible to set the user to 1001, the user id will be "random". My solution to this write permission error is:
- Created an init container (with umami 2.18.1 image) with an empty dir at location /tmp/pnpm-node-modules/
- Changed args in init container to: [cp, "-r", /app/node_modules/.pnpm/., /tmp/pnpm-node-modules/]
- Mounted the same empty dir in the main Umami container at mount path /app/node_modules/.pnpm/
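Roughly, the steps above could look like the following pod spec fragment. This is a sketch under assumptions, not a tested manifest: the volume and container names are made up for illustration, and the exact image tag (postgresql-v2.18.1) is assumed from the tag format used elsewhere in this thread.

```yaml
# Fragment of a pod template spec; all names are illustrative.
spec:
  volumes:
    - name: pnpm-node-modules
      emptyDir: {}            # writable scratch volume shared by both containers
  initContainers:
    - name: copy-pnpm
      image: ghcr.io/umami-software/umami:postgresql-v2.18.1   # assumed tag
      # Copy the image's pnpm store into the empty dir before the app starts.
      args: ["cp", "-r", "/app/node_modules/.pnpm/.", "/tmp/pnpm-node-modules/"]
      volumeMounts:
        - name: pnpm-node-modules
          mountPath: /tmp/pnpm-node-modules/
  containers:
    - name: umami
      image: ghcr.io/umami-software/umami:postgresql-v2.18.1   # assumed tag
      volumeMounts:
        # The copied files are owned by the pod's (random) UID, so Prisma
        # can write its engine here.
        - name: pnpm-node-modules
          mountPath: /app/node_modules/.pnpm/
```

This only covers the Prisma write path; writes to other locations under /app are not affected by it.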
I have investigated whether it is possible to configure the path Prisma uses when downloading its engine at runtime, but this doesn't seem to be possible.
But once that was fixed, I ran into the problem mentioned in the first post here, Error: EACCES: permission denied, open '/app/.next/routes-manifest.json', and I am currently stuck, still on v2.17.0.
The problem occurs because the file /app/.next/routes-manifest.json is created at runtime by the script scripts/set-routes-manifest.js, which means that the user must be 1001 (the user that has write permission). Some environments (like OpenShift, see explanation) don't allow specifying a user ID, and Umami is not compatible with those environments without modifications. It is the same problem as with Prisma (also mentioned in this thread), which downloads engine binaries at runtime during the DB migration, something only the 1001 user is allowed to do.
The runtime creation of /app/.next/routes-manifest.json was implemented in https://github.com/umami-software/umami/commit/b88432fcf4c030ab582f005e4d29a80a1416509d, and was first deployed in v2.18.0.
@mikaello Given your OpenShift setup, you could build your own image, adding this instruction to the Dockerfile before the USER line:
RUN chgrp -R 0 /app && \
chmod -R g=u /app
(as per OpenShift docs)
It could be more fine-grained, but it would work.
OR going your way, you might as well copy the entire app folder, I guess it would work too. Edited quote:
- Created an init container (with umami 2.18.1 image) with an empty dir at location /tmp/umami-app/
- Changed args in init container to: [cp, "-r", /app/., /tmp/umami-app/]
- Mounted the same empty dir in the main Umami container at mount path /app/