`stack deploy --detach=false` will hang forever if a service is completed.
Description
Deploying a stack with --detach=false, which waits until all services converge, blocks forever if a service has already completed and is not configured to restart.
Reproduce
Consider the following compose.yaml file (modified from #4907):
services:
  a_service:
    image: nginx:alpine
    healthcheck:
      test: ["CMD", "sh", "-c", "if [ ! -f \"/count\" ] ; then ctr=0; else ctr=`cat /count`; fi; ctr=`expr $${ctr} + 1`; echo \"$${ctr}\" > /count; if [ \"$$ctr\" -gt 4 ] ; then exit 0; else exit 1; fi"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 60s
    deploy:
      replicas: 3
  b_service:
    image: hello-world
    deploy:
      restart_policy:
        condition: on-failure
  c_service:
    image: nginx:alpine
Deploy this stack with:
docker stack deploy -c compose.yaml test --detach=false
The b_service exits immediately with exit code 0 and is not restarted because of its restart_policy. However, the deploy command still waits for b_service to converge, which never happens, so it blocks forever.
Expected behavior
I expect that the service is either considered completed, so the deploy keeps going, or that the command fails with an appropriate error message.
docker version
Client:
Version: 27.0.3
API version: 1.46
Go version: go1.21.11
Git commit: 7d4bcd8
Built: Sat Jun 29 00:01:25 2024
OS/Arch: linux/amd64
Context: default
Server: Docker Desktop
Engine:
Version: 27.0.3
API version: 1.46 (minimum version 1.24)
Go version: go1.21.11
Git commit: 662f78c
Built: Sat Jun 29 00:02:50 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.7.18
GitCommit: ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
runc:
Version: 1.1.13
GitCommit: v1.1.13-0-g58aa920
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client:
Version: 27.0.3
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.15.1-desktop.1
Path: /usr/local/lib/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.28.1-desktop.1
Path: /usr/local/lib/docker/cli-plugins/docker-compose
debug: Get a shell into any image or container (Docker Inc.)
Version: 0.0.32
Path: /usr/local/lib/docker/cli-plugins/docker-debug
dev: Docker Dev Environments (Docker Inc.)
Version: v0.1.2
Path: /usr/local/lib/docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.25
Path: /usr/local/lib/docker/cli-plugins/docker-extension
feedback: Provide feedback, right in your terminal! (Docker Inc.)
Version: v1.0.5
Path: /usr/local/lib/docker/cli-plugins/docker-feedback
init: Creates Docker-related starter files for your project (Docker Inc.)
Version: v1.3.0
Path: /usr/local/lib/docker/cli-plugins/docker-init
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /usr/local/lib/docker/cli-plugins/docker-sbom
scout: Docker Scout (Docker Inc.)
Version: v1.10.0
Path: /usr/local/lib/docker/cli-plugins/docker-scout
Server:
Containers: 29
Running: 16
Paused: 0
Stopped: 13
Images: 37
Server Version: 27.0.3
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: active
NodeID: kl8ei4nj5x6lxjgklexl795fy
Is Manager: true
ClusterID: xvjz7f3oio01p8lbsv2nqb9df
Managers: 1
Nodes: 1
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 192.168.65.3
Manager Addresses:
192.168.65.3:2377
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
runc version: v1.1.13-0-g58aa920
init version: de40ad0
Security Options:
seccomp
Profile: unconfined
Kernel Version: 5.15.153.1-microsoft-standard-WSL2
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 12
Total Memory: 15.62GiB
Name: docker-desktop
ID: 4de13e93-c266-468b-b309-5af40da2a80f
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Username: scadev
Labels:
com.docker.desktop.address=unix:///var/run/docker-cli.sock
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false
WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support
WARNING: daemon is not using the default seccomp profile
Additional Info
I encountered this issue while using an init service, which only exists to configure another service and then exit. A fix for this issue would be to set the deploy mode to replicated-job or global-job. Unfortunately, these options are not documented here: https://docs.docker.com/compose/compose-file/deploy/#mode.
It's documented now on the Docker website.
There is a solution for this situation: you can change the deploy mode to replicated-job, in which the service stops after it completes its tasks.
for more information check this: https://docs.docker.com/reference/compose-file/deploy/#mode
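For illustration, a minimal sketch of what that change could look like for b_service from the compose.yaml in the issue description. Only the mode line is new. This is untested here, and it is an assumption that the --detach=false wait then returns once the job completes.

```yaml
services:
  b_service:
    image: hello-world
    deploy:
      # Assumption: run the one-shot container as a Swarm job instead of a
      # long-running replicated service, so it is expected to exit.
      mode: replicated-job
      restart_policy:
        # Unchanged from the original file: only retry when the task fails.
        condition: on-failure
```

The other services in the stack would stay as they are, since only the one-shot service needs the job mode.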
replicated-job is unusable because of bugs: https://github.com/moby/moby/issues/42741, https://github.com/moby/moby/issues/42742.
I ran into this issue today. I have a one-time job service that creates resources via an API using a Docker image — it’s a really convenient setup. However, due to this issue, I can’t use it properly and need to find workarounds.
Having the option to tell the stack not to wait for a specific service would be very helpful.
I have also encountered the same issue: with --detach=false, docker stack deploy hangs forever if the stack is already deployed and running.
@ambroslins @thaJeztah @vvoland