swarmkit
docker service update constraint-add <svc> issue
With a Swarm cluster consisting of one manager and 2 worker nodes, constraints to place service replicas on both the manager and a worker can both be added. But how could a service replica be allocated to a node if it is constrained to be placed only on a manager and, at the same time, only on a worker? As an example:
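For reference, the roles of the nodes in such a cluster can be checked before adding any constraints (a minimal sketch; the hostname is taken from the listings below):
# list all nodes and their manager status
docker node ls
# show the role of a particular node
docker node inspect --format '{{ .Spec.Role }}' ip-10-0-0-58.ec2.internal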
- Start with 2 replicas for a service 'mysql' without any constraints. Replicas are allocated without any constraint.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
bd4aw5mxijamjk94emr7uokd0 mysql.1 mysql:latest ip-10-0-0-58.ec2.internal Running Running 8 minutes ago
aq92faql778zbepzl7gldktne mysql.2 mysql:latest ip-10-0-0-140.ec2.internal Running Running 8 minutes ago
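The service itself was presumably created with something like the following (a sketch based on the image, replica count, and environment shown in the docker service inspect output further below):
# create the service with 2 replicas and no placement constraints
docker service create --name mysql --replicas 2 -e MYSQL_ROOT_PASSWORD=mysql mysql:latest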
- Scale the service to 3 replicas.
- Add a constraint to place replicas only on 'manager'.
core@ip-10-0-0-238 ~ $ docker service update --constraint-add 'node.role==manager' mysql
All service replicas get placed on 'manager'.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
a8sxircmcl0068owwb14719yu mysql.1 mysql:latest ip-10-0-0-238.ec2.internal Running Running 10 seconds ago
al4cnixheuy7ww07w2b7hudfc mysql.2 mysql:latest ip-10-0-0-238.ec2.internal Running Running 36 seconds ago
8y3lm96begonntdr2of2104kl mysql.3 mysql:latest ip-10-0-0-238.ec2.internal Running Running 23 seconds ago
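To confirm the constraint actually landed in the service spec, the placement section can be printed directly (a sketch using a standard Go-template path into the inspect output):
# print just the placement constraints from the service spec
docker service inspect --format '{{ json .Spec.TaskTemplate.Placement.Constraints }}' mysql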
- Scale to 10 replicas; all replicas are on the 'manager' node, as expected.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
a8sxircmcl0068owwb14719yu mysql.1 mysql:latest ip-10-0-0-238.ec2.internal Running Running 2 minutes ago
al4cnixheuy7ww07w2b7hudfc mysql.2 mysql:latest ip-10-0-0-238.ec2.internal Running Running 2 minutes ago
8y3lm96begonntdr2of2104kl mysql.3 mysql:latest ip-10-0-0-238.ec2.internal Running Running 2 minutes ago
8wvcmcap3ra2f6fd7lk9w29nb mysql.6 mysql:latest ip-10-0-0-238.ec2.internal Running Running 20 seconds ago
4m7bzl1ra6km6mabb8bhm4e1t mysql.7 mysql:latest ip-10-0-0-238.ec2.internal Running Running 12 seconds ago
56dlt2jmhi91cc5kum4f9thwi mysql.8 mysql:latest ip-10-0-0-238.ec2.internal Running Running 12 seconds ago
6ha3b20l2dlufk659htbiwqas mysql.9 mysql:latest ip-10-0-0-238.ec2.internal Running Running 12 seconds ago
c4ddz2aw9jfued1665zou312m mysql.10 mysql:latest ip-10-0-0-238.ec2.internal Running Running 21 seconds ago
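The scaling step above is not shown; it was presumably done with one of the following equivalent commands (a sketch):
docker service scale mysql=10
# or
docker service update --replicas 10 mysql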
- Add a constraint to place service replicas only on 'worker'.
docker service update --constraint-add 'node.role==worker' mysql
The expected result is that no replica should be running, since a node cannot be both a 'manager' and a 'worker'. Instead, some of the replicas are listed as "Allocated" without any node to be placed on, and some of the replicas are still running on the 'manager'.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
2vrcbtpn5bz3r86rlj2gneffm mysql.1 mysql:latest Running Allocated 3 minutes ago
al4cnixheuy7ww07w2b7hudfc mysql.2 mysql:latest ip-10-0-0-238.ec2.internal Running Running 6 minutes ago
8y3lm96begonntdr2of2104kl mysql.3 mysql:latest ip-10-0-0-238.ec2.internal Running Running 6 minutes ago
3u8yt7oqe6kgxie3pjx54bdgr mysql.4 mysql:latest Running Allocated 3 minutes ago
5e1okgodw0zktn2hilmhr5qa4 mysql.5 mysql:latest Running Allocated 3 minutes ago
47xgfvmzcskomx67n0ji1wfor mysql.6 mysql:latest Running Allocated 3 minutes ago
7ziykb2o0p73hdxc8ie4pu8e2 mysql.7 mysql:latest Running Allocated 3 minutes ago
1to2xdaw3zv6qlr2j60fn1s76 mysql.8 mysql:latest Running Allocated 2 minutes ago
cijq7pg3l6kvvp4mdcmuc0ci5 mysql.9 mysql:latest Running Allocated 2 minutes ago
a94v71crwlv60mt6giocg0tv9 mysql.10 mysql:latest Running Allocated 3 minutes ago
core@ip-10-0-0-238 ~ $
Assuming the other two replicas would also have shut down and been listed as "Allocated" after a few more minutes: if both constraints are removed, all replicas get placed and are Running, distributed across the nodes in the cluster.
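The removal itself was presumably done with something like the following (a sketch; --constraint-rm can be repeated to drop several constraints at once):
docker service update --constraint-rm 'node.role==manager' --constraint-rm 'node.role==worker' mysql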
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
34rflltueyekp4httotm7x5ur mysql.1 mysql:latest ip-10-0-0-140.ec2.internal Running Running 7 minutes ago
7defu1n1vgkp9rwzogwjwqvpf mysql.2 mysql:latest ip-10-0-0-238.ec2.internal Running Running 44 seconds ago
58mzqk4zof6x77jsa7v8h23w4 mysql.3 mysql:latest ip-10-0-0-140.ec2.internal Running Running 2 minutes ago
da5y55an9la5ah0l2ounimh2n mysql.4 mysql:latest ip-10-0-0-140.ec2.internal Running Running 5 minutes ago
clao5lkper1faksuo8uk24dje mysql.5 mysql:latest ip-10-0-0-58.ec2.internal Running Running 3 minutes ago
dk6sc736kprvphexbmgpqjeso mysql.6 mysql:latest ip-10-0-0-58.ec2.internal Running Running 4 minutes ago
ecnycghcn0tbj8mn1252tvrrz mysql.7 mysql:latest ip-10-0-0-238.ec2.internal Running Running 2 minutes ago
0bbsfgxzkmrh23avr6o35s4qz mysql.8 mysql:latest ip-10-0-0-58.ec2.internal Running Running 8 minutes ago
bpgkc3fxsl4ecm9ezq9q17aii mysql.10 mysql:latest ip-10-0-0-238.ec2.internal Running Running 58 seconds ago
On which node are the service replicas "Allocated" as listed but not Running if the only nodes are the manager and the worker role nodes?
On which node are the service replicas "Allocated" as listed but not Running
@dvohra when a task from a replicated service is in the Allocated state, it has not been Assigned to any node, so the NODE field in docker service ps is empty. These tasks are in the hands of the manager, not on any node.
What's your UpdateConfig from docker service inspect mysql --pretty? I see the update continues while the tasks are stuck at Allocated. It seems you allow the update to continue regardless of whether it succeeds.
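A quick way to answer both questions without the full inspect dump would be something like this (a sketch; the template paths match the JSON shown later in this thread):
# show the update configuration from the service spec
docker service inspect --format '{{ json .Spec.UpdateConfig }}' mysql
# show the state of the in-progress update
docker service inspect --format '{{ json .UpdateStatus }}' mysql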
With both constraints added, the result is not consistent. Earlier, all but 2 of the 10 replicas stopped running within 5 minutes. When tested again, starting with 10 replicas, all but two replicas are still running even after more than 10 minutes.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
314xi0et7v7q306dxa2dav8kr mysql.1 mysql:latest Running Allocated 9 minutes ago
4gtdsi8n5lb42tzedm1ltiyb6 mysql.2 mysql:latest Running Allocated 8 minutes ago
csjvs4ci8km86fewrq723nkf3 mysql.3 mysql:latest ip-10-0-0-140.ec2.internal Running Running about an hour ago
da5y55an9la5ah0l2ounimh2n mysql.4 mysql:latest ip-10-0-0-140.ec2.internal Running Running 2 hours ago
6g4v90aoo9r0itzzu7b7nvoyt mysql.5 mysql:latest ip-10-0-0-238.ec2.internal Running Running about an hour ago
dk6sc736kprvphexbmgpqjeso mysql.6 mysql:latest ip-10-0-0-58.ec2.internal Running Running about an hour ago
ecnycghcn0tbj8mn1252tvrrz mysql.7 mysql:latest ip-10-0-0-238.ec2.internal Running Running about an hour ago
bsxi3pqwhnjcdkofkgtqmif69 mysql.8 mysql:latest ip-10-0-0-58.ec2.internal Running Running about an hour ago
e07fajyd1ogqvin1o8cg1v9tq mysql.9 mysql:latest ip-10-0-0-58.ec2.internal Running Running about an hour ago
7ub4vw46afmgn5suvm9wcma6y mysql.10 mysql:latest ip-10-0-0-238.ec2.internal Running Running about an hour ago
core@ip-10-0-0-238 ~ $ docker service inspect mysql
[
{
"ID": "2jvr9t2tl7ovxa6o5zt7jhh8m",
"Version": {
"Index": 2320
},
"CreatedAt": "2017-02-01T18:10:28.490900302Z",
"UpdatedAt": "2017-02-01T21:30:48.294171833Z",
"Spec": {
"Name": "mysql",
"TaskTemplate": {
"ContainerSpec": {
"Image": "mysql:latest",
"Env": [
"MYSQL_ROOT_PASSWORD=mysql"
]
},
"Resources": {
"Limits": {},
"Reservations": {}
},
"RestartPolicy": {
"Condition": "any",
"MaxAttempts": 0
},
"Placement": {
"Constraints": [
"node.role==worker",
"node.role==manager"
]
}
},
"Mode": {
"Replicated": {
"Replicas": 10
}
},
"UpdateConfig": {
"Parallelism": 1,
"Delay": 10000000000,
"FailureAction": "pause"
},
"EndpointSpec": {
"Mode": "vip"
}
},
"Endpoint": {
"Spec": {}
},
"UpdateStatus": {
"State": "updating",
"StartedAt": "2017-02-01T21:30:48.294165833Z",
"CompletedAt": "1970-01-01T00:00:00Z",
"Message": "update in progress"
}
}
]
core@ip-10-0-0-238 ~ $
Still, all but two replicas are running even after more than 20 minutes. How could replicas be running with the placement constraints not being applied?
"Constraints": [
"node.role==worker",
"node.role==manager"
]
Is some AND or OR logic used for placement with multiple constraints?
It's because the update has not finished.
"UpdateStatus": {
"State": "updating",
"StartedAt": "2017-02-01T21:30:48.294165833Z",
"CompletedAt": "1970-01-01T00:00:00Z",
"Message": "update in progress"
}
I think it is better to fail this update with a timeout mechanism instead of letting it get stuck at "updating". What's the docker version on your nodes?
cc @aaronlehmann.
core@ip-10-0-0-238 ~ $ docker version
Client:
Version: 1.12.6
API version: 1.24
Go version: go1.6.3
Git commit: d5236f0
Built: Tue Jan 31 07:56:17 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Go version: go1.6.3
Git commit: d5236f0
Built: Tue Jan 31 07:56:17 2017
OS/Arch: linux/amd64
How are multiple constraints applied?
In my test on current Docker master (1.14.0-dev), the update only takes down Parallelism (defaults to 1) replicas, which protects the service from failing.
Multiple constraints apply as "&&". In your case it is "node.role==worker && node.role==manager". We might extend it later though.
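For comparison, a pair of constraints that a single node can satisfy would combine a role with a node label rather than two different roles. A sketch, where the db=true label is hypothetical and the hostname is one of the workers from the listings above:
# label one worker so that two constraints can be met at the same time
docker node update --label-add db=true ip-10-0-0-58.ec2.internal
# leave only node.role==worker && node.labels.db==true on the service
docker service update \
  --constraint-rm 'node.role==manager' \
  --constraint-add 'node.labels.db==true' \
  mysql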
As a node's role cannot be both manager and worker, replicas should not be running with both roles applied as constraints.
@dvohra that's correct. If you put these constraints on service create, no instance would advance to running.
With Parallelism set to 1, two replicas would be made non-running. But in an earlier run all but two (8 out of 10) replicas were made non-running.
Shouldn't update behave the same way each time? At least have some consistency.
If you put these constraints on service create, no instance would advance to running.
Got the result indicated, but the replicas stay Allocated. There is no provision to set a timeout.
I suspect this may be fixed in 1.13.0 by https://github.com/docker/swarmkit/pull/1612
Actually that's probably not the case - that PR is about handling updates to nodes that cause constraints to no longer be met.
This issue is different; it is about service constraints, which are node-role based.
See also https://github.com/docker/swarmkit/issues/1720, which has some related discussion about how updates should behave when they can't move forward.
Shouldn't update behave the same way each time? At least have some consistency.
Update is different from the initial service create. The reason is that update should try to protect the service from failure; in the initial service create, there is nothing to protect.
The orchestrator is responsible for creating X tasks according to the service specification. The scheduler tries to find nodes to host the tasks. If no node can satisfy the constraints, the tasks are stuck in the Allocated state.
On the other hand, in a service update we don't want to kill your services. So if the update encounters failures, it should pause or proceed based on the FailureAction configuration.
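For reference, the UpdateConfig values shown in the inspect output above map directly to CLI flags; a sketch reproducing them explicitly:
# pause the update on failure, one task at a time, 10s between tasks
docker service update \
  --update-parallelism 1 \
  --update-delay 10s \
  --update-failure-action pause \
  mysql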
better to fail this update with a timeout mechanism
Is a timeout provided for service create and service update? I did not find any.
service create doesn't need a timeout because it doesn't need to stop. I think service update might need one.
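Later Docker releases (1.13 and newer, not the 1.12.6 used in this thread) added flags that approximate such a timeout; a hedged sketch (the monitor window applies to tasks after they start, so it may not cover tasks that never leave the Pending/Allocated state):
# treat a task as failed if it does not stay up for 30s, and pause the
# update once more than 20% of the updated tasks have failed
docker service update \
  --update-monitor 30s \
  --update-max-failure-ratio 0.2 \
  --update-failure-action pause \
  mysql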
~$ docker service inspect redis --pretty
ID: yg1ar0a3s1j71i3gd0fx869h7
Name: redis
Service Mode: Replicated
Replicas: 10
UpdateStatus:
State: updating
Started: 3 hours
Message: update in progress
Placement:Contraints: [node.role==manager node.role==worker]
UpdateConfig:
Parallelism: 1
On failure: pause
Max failure ratio: 0
ContainerSpec:
Image: redis:3.0.6@sha256:6a692a76c2081888b589e26e6ec835743119fe453d67ecf03df7de5b73d69842
Resources:
Networks: ovnet
Endpoint Mode: vip
~$ docker service ps redis | grep -i running
iogkv5q6628s redis.1 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
vnkxmb39gn6s redis.2 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
rv3nr66ctsr5 redis.3 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
9v4njcoy6tpr redis.4 redis:3.0.6 ip-172-19-147-51 Running Running 3 hours ago
vuqoq5grktbw redis.5 redis:3.0.6 Running Pending 3 hours ago
bszwzjx0492h redis.6 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
ztvbdpnf39k0 redis.7 redis:3.0.6 ip-172-19-147-51 Running Running 3 hours ago
siqa2jipigrf redis.8 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
fdzz67joia26 redis.9 redis:3.0.6 ip-172-19-147-51 Running Running 3 hours ago
0rapd9ghcl6d redis.10 redis:3.0.6 ip-172-19-147-51 Running Running 3 hours ago
Yes, thanks. update should probably include a timeout and a rollback option.
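For the rollback part, Docker 1.13 added a manual rollback to the previous service spec (not available on the 1.12.6 nodes above); a sketch:
# revert the service to its previous definition
docker service update --rollback redis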
@aaronlehmann @aluzzardi What do you think of an update that is stuck at Pending (previously Allocated)? I think we should fail the update, either with a number of retries or a timeout. I think a timeout is more generally applicable and user friendly.
I think having an optional timeout is a good idea. In many cases, failing the update is the right thing to do, but in others the goal may be to wait for it to converge.
@dongluochen
Keeping replicas in Allocated at least for a while does have a purpose, as the replicas start Running when the constraints are removed. If a replica is failed, it won't be able to run when a constraint is modified. Maybe a timeout is a better option, but it should be long enough that, while constraints are being added or removed with the objective of eventually assigning the replicas and making them run, the replicas do not fail.
@dvohra When constraints are changed, it starts a new update which overwrites the previous update.
@aaronlehmann Agree with the optional timeout setting.
@dongluochen The Docker version I use is 19.03.8. I have a service deployed with swarm with 3 replicas. The cluster has three nodes: one manager and two workers. Each node is set with a label, and the label values are all different. When I update the service (update the image), I use the parameter --constraint-add 'node.labels.xx==xx'. My goal is to have one specified node of the three use the new image, so that the new and old services coexist. But when I do this, I find that all nodes use the new image and the constraint does not take effect.
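For what it's worth, a placement constraint is part of the service spec and applies to every replica, so it cannot keep only some replicas on the old image. One common way to get old and new images coexisting is to run the new image as a separate service constrained to a labeled node; a sketch where the canary label, service name, node name, and image tag are all hypothetical:
# label the node that should run the new image
docker node update --label-add canary=true <node-name>
# run the new image as its own service pinned to that node
docker service create --name myapp-canary --replicas 1 \
  --constraint 'node.labels.canary==true' \
  myapp:new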