Wait or retry starting a container waiting for NFS volume to be mounted

Open hadim opened this issue 4 years ago • 5 comments

I have a bunch of services that depend on multiple Docker-compose-declared NFS volumes. All volumes depend on another machine running a single NFS server.

After a power failure, both machines (the one running the NFS server and the ones running Docker Compose services) restart, and the machine running the NFS server is much slower to start than the others.

The Docker daemon starts correctly on reboot, but most services fail to start because the NFS server is still down. This means that I need to restart them manually once the NFS server is up.

I was wondering whether there is a setting that retries failed services, or another option that waits for the NFS volumes to be mounted. Or maybe a depends option that will depend on a volume to be ready (instead of a service).

hadim avatar Jul 19 '20 16:07 hadim

I'm currently handling the same issue with a 'not so clean' solution, a retry bash script:

while [ "$(docker container inspect -f '{{.State.Status}}' "$container_name")" != "running" ]; do
    # Try a docker compose up -d here; it will fail if the NFS share volume is not available
    sleep 120
done

A 'depends' option on a volume would be great!
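A bounded variant of the retry loop above might look like the following. This is only a sketch: the `retry` helper and its argument order are hypothetical names introduced here, not part of Docker or Compose, and the attempt count and delay are arbitrary.

```shell
#!/usr/bin/env bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds, sleeping
# DELAY_SECONDS between attempts, and gives up after MAX_ATTEMPTS.
retry() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example (assumes a compose project in the current directory):
#   retry 30 120 docker compose up -d
```

Unlike the original loop, this stops after a fixed number of attempts, so a permanently unavailable NFS server does not leave the script running forever.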

Kianda avatar Aug 03 '20 09:08 Kianda

Hello, I'm having the same issue: my Portainer container doesn't start because my volumes are not ready yet. How do you solve this?

gsi-yisel avatar Sep 01 '20 21:09 gsi-yisel

Hello, a year later, I'm having the same issue when rebooting the system. The containers using NFS mounts fail to start, exit with error 137, and do not restart despite the restart: unless-stopped option. Does anyone have best-practice advice to share?

mickaeltardy avatar Oct 30 '21 16:10 mickaeltardy

I ran into the same issue and thought a Docker feature was also the way to go, but ended up realizing that the best solution is in systemd (imo). Look in the output of: systemctl list-units | grep mount (or systemctl list-units | grep /mnt - depending where your stuff is mounted) to get the name of the unit of your NFS share (something.mount).

Then edit Docker's systemd config: /etc/systemd/system/multi-user.target.wants/docker.service

And append the mount(s)'s unit name(s) to the After and Wants lines.

For example:

$ systemctl list-units | grep mnt
  mnt-docker_persistent_highperf.mount                                                                        loaded active mounted   /mnt/docker_persistent_highperf
  mnt-docker_persistent_volumes.mount                                                                         loaded active mounted   /mnt/docker_persistent_volumes
  mnt-media.mount                                                                                             loaded active mounted   /mnt/media
# I'll show only the first 5 lines of the file here but don't delete anything from the file!
$ head -5 /etc/systemd/system/multi-user.target.wants/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service mnt-docker_persistent_highperf.mount mnt-docker_persistent_volumes.mount mnt-media.mount
Wants=network-online.target mnt-docker_persistent_highperf.mount mnt-docker_persistent_volumes.mount mnt-media.mount

Then run: $ sudo systemctl daemon-reload
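The same After/Wants ordering can probably also be expressed as a systemd drop-in instead of editing the unit file in place. The file under multi-user.target.wants is typically a symlink to the packaged unit, so direct edits may be lost on a package upgrade. The sketch below only prints the drop-in text; the `make_dropin` helper and the `nfs-mounts.conf` file name are made up for illustration, and the mount unit names are the ones from the example above.

```shell
#!/usr/bin/env bash
# make_dropin UNIT [UNIT...]
# Hypothetical helper: prints a [Unit] drop-in that orders a service
# after the given mount units and pulls them in via Wants=.
# In a drop-in, After= and Wants= add to the unit's existing lists.
make_dropin() {
    printf '[Unit]\nAfter=%s\nWants=%s\n' "$*" "$*"
}

# Example (the drop-in file name is an arbitrary choice):
#   sudo mkdir -p /etc/systemd/system/docker.service.d
#   make_dropin mnt-docker_persistent_volumes.mount mnt-media.mount \
#       | sudo tee /etc/systemd/system/docker.service.d/nfs-mounts.conf
#   sudo systemctl daemon-reload
```

Running `sudo systemctl edit docker.service` and pasting the same three lines should have the same effect; it writes to override.conf in the same docker.service.d directory.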

alxandr3 avatar Mar 03 '22 22:03 alxandr3

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 21 '22 10:09 stale[bot]

Same with CIFS. Will there be any fix? I'm using restart: unless-stopped.

deinok avatar Sep 23 '22 18:09 deinok

Docker Compose is not involved when containers are restarted after a reboot; that part of the container lifecycle is handled by the Docker engine. So Compose can't help make your application resilient in this scenario.

> maybe a depends option that will depend on a volume to be ready

Compose relies on the Docker engine API, and there is no such thing as a "health" flag for a volume.

Using system configuration to ensure that the NFS server (or comparable required resources) is healthy before the Docker daemon starts seems like a clean workaround. Otherwise, please report this scenario to https://github.com/moby/moby; maybe the engine's volume plugin API could be extended to add support for it.
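One way to enforce this at the system level, sketched here under assumptions, is a small script that blocks until the required paths are actually mounted and is run before the Docker daemon or before docker compose up. The `wait_for_mounts` function and the `MOUNT_CHECK` override are hypothetical names; the default check uses `mountpoint -q` from util-linux.

```shell
#!/usr/bin/env bash
# wait_for_mounts TIMEOUT_SECONDS PATH [PATH...]
# Hypothetical helper: returns 0 once every PATH is a mount point,
# or 1 if TIMEOUT_SECONDS (shared across all paths) elapse first.
# MOUNT_CHECK can replace the default `mountpoint -q` check (for testing).
wait_for_mounts() {
    local timeout="$1"
    shift
    local waited=0
    local path
    for path in "$@"; do
        # Unquoted on purpose so "mountpoint -q" splits into command + flag.
        until ${MOUNT_CHECK:-mountpoint -q} "$path"; do
            if [ "$waited" -ge "$timeout" ]; then
                echo "timed out waiting for $path" >&2
                return 1
            fi
            sleep 1
            waited=$((waited + 1))
        done
    done
}

# Example (the path is taken from the thread; adjust to your mounts):
#   wait_for_mounts 600 /mnt/media && docker compose up -d
```

The timeout keeps the boot from hanging indefinitely if the NFS server never comes up.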

ndeloof avatar May 03 '23 12:05 ndeloof