Container will not restart due to chmod operation not permitted. Only works after a fresh install.

Open AlvaroMartinezB opened this issue 1 year ago • 22 comments

Checklist

  • Have you pulled and found the error with jc21/nginx-proxy-manager:latest docker image?
    • Yes
  • Are you sure you're not using someone else's docker image?
    • Yes
  • Have you searched for similar issues (both open and closed)?
    • Yes

Describe the bug

The NPM container will not start after restarting the docker engine for any reason. It works fine on a fresh install, but fails to start after any restart. The error happens during "Setting ownership ..."

2.12.1 (latest)

To Reproduce

Steps to reproduce the behavior:

  1. Configure an NPM instance. Log in, setup credentials, configure proxy host with let's encrypt SSL.
  2. Restart docker engine (for any reason, like updates or applying a memory size change)
  3. Try to start NPM
  4. Observe error in terminal logs

Expected behavior

NPM should start with no issues.

Screenshots

Operating System

Ubuntu 24.04.1

Additional context

Here are the terminal logs:

npm-app-1 | ❯ Configuring npm user ...
npm-app-1 | 0
npm-app-1 | usermod: no changes
npm-app-1 | ❯ Configuring npm group ...
npm-app-1 | ❯ Checking paths ...
npm-app-1 | ❯ Setting ownership ...
npm-app-1 | chown: changing ownership of '/etc/letsencrypt/live/npm-1/cert.pem': Operation not permitted
npm-app-1 | chown: changing ownership of '/etc/letsencrypt/live/npm-1/chain.pem': Operation not permitted
npm-app-1 | chown: changing ownership of '/etc/letsencrypt/live/npm-1/fullchain.pem': Operation not permitted
npm-app-1 | chown: changing ownership of '/etc/letsencrypt/live/npm-1/privkey.pem': Operation not permitted
npm-app-1 | chown: changing ownership of '/etc/letsencrypt/live/npm-2/cert.pem': Operation not permitted
npm-app-1 | chown: changing ownership of '/etc/letsencrypt/live/npm-2/chain.pem': Operation not permitted
npm-app-1 | chown: changing ownership of '/etc/letsencrypt/live/npm-2/fullchain.pem': Operation not permitted
npm-app-1 | chown: changing ownership of '/etc/letsencrypt/live/npm-2/privkey.pem': Operation not permitted
npm-app-1 | s6-rc: warning: unable to start service prepare: command exited 1
npm-app-1 | /run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

AlvaroMartinezB avatar Dec 03 '24 04:12 AlvaroMartinezB

Same issue here except I can't even install it. It just won't run at all

TimelessFun avatar Dec 03 '24 04:12 TimelessFun

Same here. Work around for me was to change permissions of the letsencrypt/live directory on the host itself.

sudo chmod -R 775 letsencrypt_dir/live

Survives container restarts after that.

suprexus avatar Dec 05 '24 04:12 suprexus

I have the same issue. sudo chmod -R 775 /live doesn't seem to work for me.

sonpeter88 avatar Dec 16 '24 19:12 sonpeter88

Check your path - /live doesn't seem correct. Check your docker config to see where you're saving the letsencrypt files.

sudo chmod -R 775 letsencrypt_dir/live

suprexus avatar Dec 16 '24 19:12 suprexus

The path seems correct. I just omitted letsencrypt_dir since it's different for everyone. The live folder has the application folder with the pem files inside it.

the files have rwx permissions

sonpeter88 avatar Dec 16 '24 19:12 sonpeter88

Gotcha - What user owns the directory? If you're specifying a PUID in your docker compose file, it should be the same user.

suprexus avatar Dec 16 '24 19:12 suprexus

I actually tried specifying user/group (1000:1000) in my docker compose, but it complained that it has to run as root, so I assume my docker is already running as root.

ls -al in letsencrypt_dir/live/npm-3 folder shows

lrwxrwxrwx 1 peter peter, which is uid=1000 and gid=1000

sonpeter88 avatar Dec 16 '24 19:12 sonpeter88

In the same boat as @sonpeter88 ... sudo chmod -R 775 /path/to/letsencrypt/live in my host machine doesn't seem to work either :disappointed: .

jaysee260 avatar Dec 17 '24 16:12 jaysee260

Okay, I figured out a way to get it to work. The files in the live folder are symlinks into the ../../archive/ folders, and on my Ubuntu machine they aren't happy when you try to chmod them. I deleted all the symlinks in the live folder and copied the pem files directly from the archive folder. Once I had done that and given the files the correct permissions, the container was able to restart fine. Thanks for the tip @suprexus
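For anyone who wants to check the layout before deleting anything, here is a minimal sketch that rebuilds it in a scratch directory, so nothing real is touched. The file names follow the cert.pem / cert1.pem naming seen in this thread; the scratch tree and its contents are placeholders. It lists the symlinks under live/ and shows that chmod through a symlink changes the mode of the target in archive/, not of the link itself. Note also that the failing step in the logs is chown (ownership), which chmod (mode bits) does not change, which may be why chmod alone did not help everyone.

```shell
#!/bin/sh
# Scratch tree imitating the letsencrypt layout (placeholder contents, not real certs).
tmp=$(mktemp -d)
mkdir -p "$tmp/archive/npm-1" "$tmp/live/npm-1"
echo "dummy" > "$tmp/archive/npm-1/cert1.pem"
ln -s ../../archive/npm-1/cert1.pem "$tmp/live/npm-1/cert.pem"

# List every symlink under live/ together with the path it points to.
find "$tmp/live" -type l -exec sh -c 'printf "%s -> %s\n" "$1" "$(readlink "$1")"' _ {} \;

# chmod follows the link: the target's mode changes, the entry in live/ stays a symlink.
chmod 640 "$tmp/live/npm-1/cert.pem"
link_target=$(readlink "$tmp/live/npm-1/cert.pem")
target_mode=$(ls -l "$tmp/archive/npm-1/cert1.pem" | cut -c1-10)
echo "link points to: $link_target"
echo "target mode:    $target_mode"
```

On a real host, the same find and readlink commands can be pointed at the directory that is mounted to /etc/letsencrypt.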

sonpeter88 avatar Dec 18 '24 22:12 sonpeter88

heh, really nice catch @sonpeter88 ... thanks, I tried it and this works!

jaysee260 avatar Dec 19 '24 02:12 jaysee260

Hate to report this, but this doesn't really seem to be a stable/lasting solution...

  1. When you try to renew the LetsEncrypt cert, you get an error saying symlinks were expected on the *.pem files.
  2. I suddenly could not log in to the Nginx Proxy Manager portal as my admin user.

I'm sure I was using the right credentials, I have them written down. Would straight up just fail authentication. I got so frustrated I just restarted everything from scratch. If it happens again, I'll capture logs and share.

Anyway, this seems like a temporary workaround at best.

jaysee260 avatar Dec 20 '24 02:12 jaysee260

I didn't change the symlinks - What I probably did do was change the permissions on the entire npm directory to include the ../../archive/. Try recreating the symlinks and chmod the entire directory.
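If the symlinks were replaced with copies, a sketch along these lines could put them back. It assumes the certN.pem naming in archive/ and the plain cert.pem naming in live/ that appear in this thread, and it links each name to the highest-numbered file; the function name relink_live and the scratch paths are hypothetical. Try it on a copy of the directory first.

```shell
#!/bin/sh
# Hypothetical helper: point live/<name>/<kind>.pem at the newest archive/<name>/<kind>N.pem.
# Assumes the naming seen in this thread (cert1.pem, chain1.pem, fullchain1.pem, privkey1.pem).
relink_live() {
  le_dir=$1   # host directory that is mounted at /etc/letsencrypt
  name=$2     # certificate folder, e.g. npm-1
  for kind in cert chain fullchain privkey; do
    # Extract the numeric suffix of every matching file and keep the largest one.
    n=$(ls "$le_dir/archive/$name" \
        | sed -n "s/^${kind}\([0-9][0-9]*\)\.pem\$/\1/p" \
        | sort -n | tail -n 1)
    [ -n "$n" ] || continue
    # -f replaces an existing copy or stale link with a fresh relative symlink.
    ln -sf "../../archive/$name/${kind}${n}.pem" "$le_dir/live/$name/${kind}.pem"
  done
}

# Demo on a scratch tree with placeholder files.
tmp=$(mktemp -d)
mkdir -p "$tmp/archive/npm-1" "$tmp/live/npm-1"
for f in cert1 cert2 chain1 fullchain1 privkey1; do
  echo "dummy" > "$tmp/archive/npm-1/$f.pem"
done
# Simulate the copied-file workaround: a regular file where a symlink should be.
cp "$tmp/archive/npm-1/cert1.pem" "$tmp/live/npm-1/cert.pem"

relink_live "$tmp" npm-1
ls -l "$tmp/live/npm-1"
```

The links are relative (../../archive/...), so they resolve the same way on the host and inside the container.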

suprexus avatar Dec 20 '24 16:12 suprexus

@suprexus -sigh- that doesn't seem to work, still getting

2024-12-21 10:09:57 ❯ Configuring npm user ...
2024-12-21 10:09:57 ❯ Configuring npm group ...
2024-12-21 10:09:58 ❯ Checking paths ...
2024-12-21 10:09:58 ❯ Setting ownership ...
2024-12-21 10:09:57 useradd warning: npm's uid 0 outside of the UID_MIN 1000 and UID_MAX 60000 range.
2024-12-21 10:09:58 chown: changing ownership of '/etc/letsencrypt/live/npm-1/chain.pem': Operation not permitted
2024-12-21 10:09:58 chown: changing ownership of '/etc/letsencrypt/live/npm-1/cert.pem': Operation not permitted
2024-12-21 10:09:58 chown: changing ownership of '/etc/letsencrypt/live/npm-1/fullchain.pem': Operation not permitted
2024-12-21 10:09:58 chown: changing ownership of '/etc/letsencrypt/live/npm-1/privkey.pem': Operation not permitted
2024-12-21 10:09:58 s6-rc: warning: unable to start service prepare: command exited 1
2024-12-21 10:09:58 /run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

Even went as far as making sure I set the proper permissions to the entire letsencrypt directory with sudo chmod -R 775 /path/to/letsencrypt.

The only thing that's "worked" so far is @sonpeter88's suggestion to remove the symlinks and copy the *.pem files from ../../archive, but as pointed out earlier, that solution is unstable, and causes other issues.

Will keep digging...

jaysee260 avatar Dec 21 '24 15:12 jaysee260

What permissions are set on ../../archive/npm-1/

ls -la archive/npm-1/

are you running the container as a certain user?

suprexus avatar Dec 21 '24 18:12 suprexus

What permissions are set on ../../archive/npm-1

$ ll archive/npm-1/
total 48
drwxrwxr-x 2 jc jc 4096 Dec 19 21:27 .
drwxrwxr-x 3 jc jc 4096 Dec 19 21:27 ..
-rwxrwxr-x 1 jc jc 1314 Dec 19 21:27 cert1.pem
-rwxrwxr-x 1 jc jc 1566 Dec 19 21:27 chain1.pem
-rwxrwxr-x 1 jc jc 2880 Dec 19 21:27 fullchain1.pem
-rwxrwxr-x 1 jc jc  306 Dec 19 21:27 privkey1.pem

are you running the container as a certain user?

not another user than the one I usually use... as far as I'm aware? same user I am running docker compose up -d with.

I ran docker inspect $(docker ps -q), and under the Config section of this container, I see the following for "User"

"Config": {
            "Hostname": "5f7a21d6b616",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": true,
            "AttachStderr": true,
           .....
}

jaysee260 avatar Dec 22 '24 03:12 jaysee260

So, I was able to confirm this ONLY happens if you generate an SSL certificate. Otherwise, you can restart, destroy/recreate the container all day long, and it runs fine.

I can't quite figure out what it is about the container restarting or getting recreated that causes the cert files to become inaccessible...

Line 13 of this script is what changes the permissions of the letsencrypt directory (inside the container) on container start up... Why does it only work the first time? :thinking:

jaysee260 avatar Dec 22 '24 04:12 jaysee260

I was able to "solve the issue" by simply using jlesage/docker-nginx-proxy-manager. Apparently the only real differences are using ports 8080, 8181, and 4443, GUID/UID support that predates the main branch, and nginx runs as non root on non privileged ports.

mr-prez avatar Jan 01 '25 05:01 mr-prez

Line 13 of this script is what changes the permissions of the letsencrypt directory (inside the container) on container start up... Why does it only work the first time? 🤔

So, in order to get this to work for me, I had to edit the file referenced above and comment out the chmod. Once I did that, it came up normally and I was able to renew a cert without issue.

dshizntpdt avatar Jan 24 '25 23:01 dshizntpdt

Was this resolved? I changed the symbolic links "ln -s" to hard links "ln". It starts now and not sure if it will work through a cert renewal.
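One caveat with hard links, sketched here under the assumption that a renewal writes a new numbered file into archive/ (as the cert1.pem naming suggests) instead of overwriting the old one: a hard link keeps pointing at the old file's data, while a symlink can be re-pointed at the new file. The scratch paths and file contents below are placeholders.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir -p "$tmp/archive" "$tmp/live"
echo "old cert" > "$tmp/archive/cert1.pem"

# Hard link: a second directory entry for the same file data.
ln "$tmp/archive/cert1.pem" "$tmp/live/cert.pem"

# Simulate a renewal that writes a new numbered file next to the old one.
echo "new cert" > "$tmp/archive/cert2.pem"

# The hard link still returns the old content; nothing re-points it.
hard_content=$(cat "$tmp/live/cert.pem")
echo "via hard link: $hard_content"

# A symlink, by contrast, can be switched to the new file.
ln -sf ../archive/cert2.pem "$tmp/live/cert.pem"
sym_content=$(cat "$tmp/live/cert.pem")
echo "via symlink:   $sym_content"
```

So even if the container starts with hard links in place, it seems worth checking the served certificate after the next renewal.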

dmorsberger avatar Mar 29 '25 19:03 dmorsberger

Had the same problem after running docker system prune and the only way to fix it was

@sonpeter88's suggestion to remove the symlinks and copy the *.pem files from ../../archive, but as pointed out earlier, that solution is unstable, and causes other issues.

mateus-werneck avatar Jun 11 '25 12:06 mateus-werneck

I was able to get around the error by modifying and overriding /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh

Make a local copy by extracting from the Docker container
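For the extraction step, a minimal sketch using docker cp. The helper name extract_ownership_script is made up for illustration, and its defaults are assumptions: npm-app-1 is the container name from the logs earlier in this thread, and ./my-30-ownership.sh is the file name used in the compose volumes of this comment. As far as I know docker cp also works on a stopped container, which helps when the container refuses to start.

```shell
#!/bin/sh
# Hypothetical helper (not part of NPM): copy the stock script out of a container.
# Defaults are assumptions; substitute your own container name and destination.
extract_ownership_script() {
  container=${1:-npm-app-1}
  dest=${2:-./my-30-ownership.sh}
  docker cp "$container:/etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh" "$dest"
}

# Usage (needs Docker and an existing container):
#   extract_ownership_script npm-app-1 ./my-30-ownership.sh
```

After copying, edit the local file and bind-mount it over the original path.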

30-ownership.sh: Use the 'find' command to skip symbolic links when changing ownership

# npm user and group
chown -R "$PUID:$PGID" /data
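# Skip symlinks; chown each remaining entry on its own (no -R, which would recurse back into the symlinks)
find /etc/letsencrypt -not -type l -exec chown "$PUID:$PGID" {} \;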
chown -R "$PUID:$PGID" /run/nginx
chown -R "$PUID:$PGID" /tmp/nginx
chown -R "$PUID:$PGID" /var/cache/nginx
chown -R "$PUID:$PGID" /var/lib/logrotate
chown -R "$PUID:$PGID" /var/lib/nginx
chown -R "$PUID:$PGID" /var/log/nginx

docker-compose.yaml: Override 30-ownership.sh inside the container with a bind mount

volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt
      - ./my-30-ownership.sh:/etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
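To see what a find-based ownership pass does before mounting it into the container, here is a sketch against a scratch tree. The files are placeholders, and PUID/PGID are set to the current user so it runs without root. It applies chown to each non-symlink entry individually and leaves symlinks alone. The sketch deliberately omits -R: as I understand GNU chown, -R on the top-level directory would recurse and act on the symlinks themselves, which is the operation that fails in the logs.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir -p "$tmp/archive/npm-1" "$tmp/live/npm-1"
echo "dummy" > "$tmp/archive/npm-1/cert1.pem"
ln -s ../../archive/npm-1/cert1.pem "$tmp/live/npm-1/cert.pem"

# Stand-ins for the container's PUID/PGID; the current user keeps this runnable without root.
PUID=$(id -u)
PGID=$(id -g)

# Count what would be chowned (everything except symlinks) and what would be skipped.
touched=$(find "$tmp" -not -type l | wc -l | tr -d ' ')
skipped=$(find "$tmp" -type l | wc -l | tr -d ' ')
echo "would chown $touched entries, skip $skipped symlink(s)"

# Apply it: one chown per non-symlink entry, no recursion.
find "$tmp" -not -type l -exec chown "$PUID:$PGID" {} \;
status=$?
echo "find exit status: $status"
```

One design note: with the `\;` form, find's exit status does not reflect a failing chown, so errors would be printed without stopping the script. Ending the -exec with `+` instead makes find report such failures.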

dmorsberger avatar Jun 11 '25 13:06 dmorsberger

Any chance of this getting merged onto the main branch? I have no idea how to do what you described haha

AlvaroMartinezB avatar Jun 13 '25 23:06 AlvaroMartinezB

IT HAS BEEN FIXED!!! Just updated to v2.12.6, because I saw in the changelog that they skipped the Certbot thing... I just updated and my container was able to start again and it's working. Thank you to whoever fixed it!!!

AlvaroMartinezB avatar Aug 01 '25 20:08 AlvaroMartinezB