swarmkit
docker service update constraint-add <svc> issue
With a Swarm cluster consisting of one manager and 2 worker nodes, constraints to place service replicas on both the manager and a worker can both be added. But how could a service replica be allocated to a node if it is constrained to be placed only on a manager and, at the same time, only on a worker? As an example:
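For reference, the roles of the nodes in such a cluster can be checked before adding any constraints (a minimal sketch; the hostname is taken from the listings below):
# list all nodes and their manager status
docker node ls
# show the role of a particular node
docker node inspect --format '{{ .Spec.Role }}' ip-10-0-0-58.ec2.internal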
- Start with 2 replicas for a service 'mysql' without any constraints. Replicas are allocated without any constraint.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
bd4aw5mxijamjk94emr7uokd0 mysql.1 mysql:latest ip-10-0-0-58.ec2.internal Running Running 8 minutes ago
aq92faql778zbepzl7gldktne mysql.2 mysql:latest ip-10-0-0-140.ec2.internal Running Running 8 minutes ago
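The service itself was presumably created with something like the following (a sketch based on the image, replica count, and environment shown in the docker service inspect output further below):
# create the service with 2 replicas and no placement constraints
docker service create --name mysql --replicas 2 -e MYSQL_ROOT_PASSWORD=mysql mysql:latest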
- Scale the service to 3 replicas.
- Add a constraint to place replicas only on 'manager'.
core@ip-10-0-0-238 ~ $ docker service update --constraint-add 'node.role==manager' mysql
All service replicas get placed on 'manager'.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
a8sxircmcl0068owwb14719yu mysql.1 mysql:latest ip-10-0-0-238.ec2.internal Running Running 10 seconds ago
al4cnixheuy7ww07w2b7hudfc mysql.2 mysql:latest ip-10-0-0-238.ec2.internal Running Running 36 seconds ago
8y3lm96begonntdr2of2104kl mysql.3 mysql:latest ip-10-0-0-238.ec2.internal Running Running 23 seconds ago
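To confirm the constraint actually landed in the service spec, the placement section can be printed directly (a sketch using a standard Go-template path into the inspect output):
# print just the placement constraints from the service spec
docker service inspect --format '{{ json .Spec.TaskTemplate.Placement.Constraints }}' mysql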
- Scale to 10 replicas; all replicas are on the 'manager' node, as expected.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
a8sxircmcl0068owwb14719yu mysql.1 mysql:latest ip-10-0-0-238.ec2.internal Running Running 2 minutes ago
al4cnixheuy7ww07w2b7hudfc mysql.2 mysql:latest ip-10-0-0-238.ec2.internal Running Running 2 minutes ago
8y3lm96begonntdr2of2104kl mysql.3 mysql:latest ip-10-0-0-238.ec2.internal Running Running 2 minutes ago
8wvcmcap3ra2f6fd7lk9w29nb mysql.6 mysql:latest ip-10-0-0-238.ec2.internal Running Running 20 seconds ago
4m7bzl1ra6km6mabb8bhm4e1t mysql.7 mysql:latest ip-10-0-0-238.ec2.internal Running Running 12 seconds ago
56dlt2jmhi91cc5kum4f9thwi mysql.8 mysql:latest ip-10-0-0-238.ec2.internal Running Running 12 seconds ago
6ha3b20l2dlufk659htbiwqas mysql.9 mysql:latest ip-10-0-0-238.ec2.internal Running Running 12 seconds ago
c4ddz2aw9jfued1665zou312m mysql.10 mysql:latest ip-10-0-0-238.ec2.internal Running Running 21 seconds ago
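The scaling step above is not shown; it was presumably done with one of the following equivalent commands (a sketch):
docker service scale mysql=10
# or
docker service update --replicas 10 mysql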
- Add a constraint to place service replicas only on 'worker'.
docker service update --constraint-add 'node.role==worker' mysql
The expected result is that no replica should be running, since a node cannot be both a 'manager' and a 'worker'. Instead, some of the replicas are listed as "Allocated" without any node to be placed on, and some of the replicas are still running on the 'manager'.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
2vrcbtpn5bz3r86rlj2gneffm mysql.1 mysql:latest Running Allocated 3 minutes ago
al4cnixheuy7ww07w2b7hudfc mysql.2 mysql:latest ip-10-0-0-238.ec2.internal Running Running 6 minutes ago
8y3lm96begonntdr2of2104kl mysql.3 mysql:latest ip-10-0-0-238.ec2.internal Running Running 6 minutes ago
3u8yt7oqe6kgxie3pjx54bdgr mysql.4 mysql:latest Running Allocated 3 minutes ago
5e1okgodw0zktn2hilmhr5qa4 mysql.5 mysql:latest Running Allocated 3 minutes ago
47xgfvmzcskomx67n0ji1wfor mysql.6 mysql:latest Running Allocated 3 minutes ago
7ziykb2o0p73hdxc8ie4pu8e2 mysql.7 mysql:latest Running Allocated 3 minutes ago
1to2xdaw3zv6qlr2j60fn1s76 mysql.8 mysql:latest Running Allocated 2 minutes ago
cijq7pg3l6kvvp4mdcmuc0ci5 mysql.9 mysql:latest Running Allocated 2 minutes ago
a94v71crwlv60mt6giocg0tv9 mysql.10 mysql:latest Running Allocated 3 minutes ago
core@ip-10-0-0-238 ~ $
Assuming the other two replicas would also have shut down and been listed as "Allocated" after a few more minutes: if both constraints are removed, all replicas get placed and are Running, distributed across the nodes in the cluster.
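The removal itself was presumably done with something like the following (a sketch; --constraint-rm can be repeated to drop several constraints at once):
docker service update --constraint-rm 'node.role==manager' --constraint-rm 'node.role==worker' mysql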
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
34rflltueyekp4httotm7x5ur mysql.1 mysql:latest ip-10-0-0-140.ec2.internal Running Running 7 minutes ago
7defu1n1vgkp9rwzogwjwqvpf mysql.2 mysql:latest ip-10-0-0-238.ec2.internal Running Running 44 seconds ago
58mzqk4zof6x77jsa7v8h23w4 mysql.3 mysql:latest ip-10-0-0-140.ec2.internal Running Running 2 minutes ago
da5y55an9la5ah0l2ounimh2n mysql.4 mysql:latest ip-10-0-0-140.ec2.internal Running Running 5 minutes ago
clao5lkper1faksuo8uk24dje mysql.5 mysql:latest ip-10-0-0-58.ec2.internal Running Running 3 minutes ago
dk6sc736kprvphexbmgpqjeso mysql.6 mysql:latest ip-10-0-0-58.ec2.internal Running Running 4 minutes ago
ecnycghcn0tbj8mn1252tvrrz mysql.7 mysql:latest ip-10-0-0-238.ec2.internal Running Running 2 minutes ago
0bbsfgxzkmrh23avr6o35s4qz mysql.8 mysql:latest ip-10-0-0-58.ec2.internal Running Running 8 minutes ago
bpgkc3fxsl4ecm9ezq9q17aii mysql.10 mysql:latest ip-10-0-0-238.ec2.internal Running Running 58 seconds ago
On which node are the service replicas "Allocated" as listed but not Running if the only nodes are the manager and the worker role nodes?
On which node are the service replicas "Allocated" as listed but not Running
@dvohra when a task from a replicated service is in the Allocated state, it has not been Assigned to any node, so the NODE field in docker service ps is empty. These tasks are in the hands of the manager, not on any node.
What's your UpdateConfig from docker service inspect mysql --pretty? I see the update continues while the tasks are stuck at Allocated. It seems you allow the update to continue regardless of whether it succeeds.
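A quick way to answer both questions without the full inspect dump would be something like this (a sketch; the template paths match the JSON shown later in this thread):
# show the update configuration from the service spec
docker service inspect --format '{{ json .Spec.UpdateConfig }}' mysql
# show the state of the in-progress update
docker service inspect --format '{{ json .UpdateStatus }}' mysql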
With both constraints added, the result is not consistent. Earlier, all but 2 of the 10 replicas stopped running within 5 minutes. When tested again, starting with 10 replicas, all but two replicas are still running even after more than 10 minutes.
core@ip-10-0-0-238 ~ $ docker service ps -f desired-state=running mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
314xi0et7v7q306dxa2dav8kr mysql.1 mysql:latest Running Allocated 9 minutes ago
4gtdsi8n5lb42tzedm1ltiyb6 mysql.2 mysql:latest Running Allocated 8 minutes ago
csjvs4ci8km86fewrq723nkf3 mysql.3 mysql:latest ip-10-0-0-140.ec2.internal Running Running about an hour ago
da5y55an9la5ah0l2ounimh2n mysql.4 mysql:latest ip-10-0-0-140.ec2.internal Running Running 2 hours ago
6g4v90aoo9r0itzzu7b7nvoyt mysql.5 mysql:latest ip-10-0-0-238.ec2.internal Running Running about an hour ago
dk6sc736kprvphexbmgpqjeso mysql.6 mysql:latest ip-10-0-0-58.ec2.internal Running Running about an hour ago
ecnycghcn0tbj8mn1252tvrrz mysql.7 mysql:latest ip-10-0-0-238.ec2.internal Running Running about an hour ago
bsxi3pqwhnjcdkofkgtqmif69 mysql.8 mysql:latest ip-10-0-0-58.ec2.internal Running Running about an hour ago
e07fajyd1ogqvin1o8cg1v9tq mysql.9 mysql:latest ip-10-0-0-58.ec2.internal Running Running about an hour ago
7ub4vw46afmgn5suvm9wcma6y mysql.10 mysql:latest ip-10-0-0-238.ec2.internal Running Running about an hour ago
core@ip-10-0-0-238 ~ $ docker service inspect mysql
[
{
"ID": "2jvr9t2tl7ovxa6o5zt7jhh8m",
"Version": {
"Index": 2320
},
"CreatedAt": "2017-02-01T18:10:28.490900302Z",
"UpdatedAt": "2017-02-01T21:30:48.294171833Z",
"Spec": {
"Name": "mysql",
"TaskTemplate": {
"ContainerSpec": {
"Image": "mysql:latest",
"Env": [
"MYSQL_ROOT_PASSWORD=mysql"
]
},
"Resources": {
"Limits": {},
"Reservations": {}
},
"RestartPolicy": {
"Condition": "any",
"MaxAttempts": 0
},
"Placement": {
"Constraints": [
"node.role==worker",
"node.role==manager"
]
}
},
"Mode": {
"Replicated": {
"Replicas": 10
}
},
"UpdateConfig": {
"Parallelism": 1,
"Delay": 10000000000,
"FailureAction": "pause"
},
"EndpointSpec": {
"Mode": "vip"
}
},
"Endpoint": {
"Spec": {}
},
"UpdateStatus": {
"State": "updating",
"StartedAt": "2017-02-01T21:30:48.294165833Z",
"CompletedAt": "1970-01-01T00:00:00Z",
"Message": "update in progress"
}
}
]
core@ip-10-0-0-238 ~ $
Still, all but two replicas are running even after more than 20 minutes. How could replicas be running with the placement constraints not being applied?
"Constraints": [
"node.role==worker",
"node.role==manager"
]
Is some AND or OR logic used for placement with multiple constraints?
It's because the update has not finished.
"UpdateStatus": {
"State": "updating",
"StartedAt": "2017-02-01T21:30:48.294165833Z",
"CompletedAt": "1970-01-01T00:00:00Z",
"Message": "update in progress"
}
I think it is better to fail this update with a timeout mechanism instead of letting it get stuck at "updating". What's the docker version on your nodes?
cc @aaronlehmann.
core@ip-10-0-0-238 ~ $ docker version
Client:
Version: 1.12.6
API version: 1.24
Go version: go1.6.3
Git commit: d5236f0
Built: Tue Jan 31 07:56:17 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Go version: go1.6.3
Git commit: d5236f0
Built: Tue Jan 31 07:56:17 2017
OS/Arch: linux/amd64
How are multiple constraints applied?
In my test on current Docker master (1.14.0-dev), the update only takes down Parallelism (defaults to 1) replicas, which protects the service from failing.
Multiple constraints apply as "&&". In your case it is "node.role==worker && node.role==manager". We might extend it later though.
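For comparison, a pair of constraints that a single node can satisfy would combine a role with a node label rather than two different roles. A sketch, where the db=true label is hypothetical and the hostname is one of the workers from the listings above:
# label one worker so that two constraints can be met at the same time
docker node update --label-add db=true ip-10-0-0-58.ec2.internal
# leave only node.role==worker && node.labels.db==true on the service
docker service update \
  --constraint-rm 'node.role==manager' \
  --constraint-add 'node.labels.db==true' \
  mysql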
As a node's role cannot be both manager and worker, replicas should not be running with both roles applied as constraints.
@dvohra that's correct. If you put these constraints on service create, no instance would advance to running.
With Parallelism set to 1, two replicas would be made non-running. But in an earlier run all but two (8 out of 10) replicas were made non-running.
Shouldn't update behave the same way each time? At least have some consistency.
If you put these constraints on service create, no instance would advance to running.
Got the result indicated, but the replicas stay Allocated. There is no provision to set a timeout.
I suspect this may be fixed in 1.13.0 by https://github.com/docker/swarmkit/pull/1612
Actually that's probably not the case - that PR is about handling updates to nodes that cause constraints to no longer be met.
This issue is different; it is about service constraints, which are node-role based.
See also https://github.com/docker/swarmkit/issues/1720, which has some related discussion about how updates should behave when they can't move forward.
Shouldn't update behave the same way each time? At least have some consistency.
Update is different from the initial service create. The reason is that update should try to protect the service from failure; in the initial service create, there is nothing to protect.
The orchestrator is responsible for creating X tasks according to the service specification. The scheduler tries to find nodes to host the tasks. If no node can satisfy the constraints, the tasks are stuck in the Allocated state.
On the other hand, in a service update we don't want to kill your services. So if the update encounters failures, it should pause or proceed based on the FailureAction configuration.
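For reference, the UpdateConfig values shown in the inspect output above map directly to CLI flags; a sketch reproducing them explicitly:
# pause the update on failure, one task at a time, 10s between tasks
docker service update \
  --update-parallelism 1 \
  --update-delay 10s \
  --update-failure-action pause \
  mysql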
better to fail this update with a timeout mechanism
Is a timeout provided for service create and service update? I did not find any.
service create doesn't need a timeout because it doesn't need to stop. I think service update might need one.
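Later Docker releases (1.13 and newer, not the 1.12.6 used in this thread) added flags that approximate such a timeout; a hedged sketch (the monitor window applies to tasks after they start, so it may not cover tasks that never leave the Pending/Allocated state):
# treat a task as failed if it does not stay up for 30s, and pause the
# update once more than 20% of the updated tasks have failed
docker service update \
  --update-monitor 30s \
  --update-max-failure-ratio 0.2 \
  --update-failure-action pause \
  mysql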
~$ docker service inspect redis --pretty
ID: yg1ar0a3s1j71i3gd0fx869h7
Name: redis
Service Mode: Replicated
Replicas: 10
UpdateStatus:
State: updating
Started: 3 hours
Message: update in progress
Placement:Contraints: [node.role==manager node.role==worker]
UpdateConfig:
Parallelism: 1
On failure: pause
Max failure ratio: 0
ContainerSpec:
Image: redis:3.0.6@sha256:6a692a76c2081888b589e26e6ec835743119fe453d67ecf03df7de5b73d69842
Resources:
Networks: ovnet
Endpoint Mode: vip
~$ docker service ps redis | grep -i running
iogkv5q6628s redis.1 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
vnkxmb39gn6s redis.2 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
rv3nr66ctsr5 redis.3 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
9v4njcoy6tpr redis.4 redis:3.0.6 ip-172-19-147-51 Running Running 3 hours ago
vuqoq5grktbw redis.5 redis:3.0.6 Running Pending 3 hours ago
bszwzjx0492h redis.6 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
ztvbdpnf39k0 redis.7 redis:3.0.6 ip-172-19-147-51 Running Running 3 hours ago
siqa2jipigrf redis.8 redis:3.0.6 ip-172-19-241-145 Running Running 3 hours ago
fdzz67joia26 redis.9 redis:3.0.6 ip-172-19-147-51 Running Running 3 hours ago
0rapd9ghcl6d redis.10 redis:3.0.6 ip-172-19-147-51 Running Running 3 hours ago
Yes, thanks. update should probably include a timeout and a rollback option.
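For the rollback part, Docker 1.13 added a manual rollback to the previous service spec (not available on the 1.12.6 nodes above); a sketch:
# revert the service to its previous definition
docker service update --rollback redis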
@aaronlehmann @aluzzardi What do you think of an update that is stuck at Pending (previously Allocated)? I think we should fail the update, either with a number of retries or a timeout. I think a timeout is more generally applicable and user friendly.
I think having an optional timeout is a good idea. In many cases, failing the update is the right thing to do, but in others the goal may be to wait for it to converge.
@dongluochen
Keeping replicas in Allocated at least for a while does have a purpose, as the replicas start Running when the constraints are removed. If a replica is failed, it won't be able to run when a constraint is modified. Maybe a timeout is a better option, but it should be long enough that, while constraints are being added or removed with the objective of eventually assigning the replicas and making them run, the replicas do not fail.
@dvohra When constraints are changed, it starts a new update which overwrites the previous update.
@aaronlehmann Agree with the optional timeout setting.
@dongluochen The Docker version I use is 19.03.8. I have a service deployed with swarm with 3 replicas. The cluster has three nodes: one manager and two workers. Each node is set with a label, and the label values are all different. When I update the service (update the image), I use the parameter --constraint-add 'node.labels.xx==xx'. My goal is to have one specified node of the three use the new image, so that the new and old services coexist. But when I do this, I find that all nodes use the new image and the constraint does not take effect.
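For what it's worth, a placement constraint is part of the service spec and applies to every replica, so it cannot keep only some replicas on the old image. One common way to get old and new images coexisting is to run the new image as a separate service constrained to a labeled node; a sketch where the canary label, service name, node name, and image tag are all hypothetical:
# label the node that should run the new image
docker node update --label-add canary=true <node-name>
# run the new image as its own service pinned to that node
docker service create --name myapp-canary --replicas 1 \
  --constraint 'node.labels.canary==true' \
  myapp:new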