yet-another-docker-plugin
Support for swarm mode in 1.12?
Do you think it is possible, and likely, that this plugin will support the swarm mode coming with Docker 1.12?
Checking the remote api 1.24 for services, I'd say that it would be entirely possible to create a service with a single task.
It would be great to offer possible features of swarm in the cloud / image configuration. For example:
- Resource reservation, resource limit per service/image
- Placement constraints per service/image, maybe even on a job basis (I have no clue if this is possible in jenkins )
Yes, anything! But I'm stuck in docker-java upstream with integration tests :(
My usual process is to update docker-java with the APIs, then define how it should work in the plugin and implement it.
Could you provide ideas on how the configuration should look and work in Jenkins?
Having configuration on a per-job basis would be very useful; it should be very simple, like in https://github.com/jenkinsci/docker-plugin/pull/383
I skimmed through the documentation of the remote API. Most of the current configuration won't need to be changed. The changes that would likely be needed:
Cloud configuration:
- Mode setting (swarm mode 1.12 or single host mode [or swarm standalone])
- Master: Possibly: Specify more than one master for a single cloud? (Not sure if needed in the plugin itself)
Image configuration
- Privileged flag is not supported yet and needs to be disabled (see also this comment)
- Add fields for cpu, memory reservation
- Add fields for cpu, memory limit
- Add field for placement constraints
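As a rough illustration of how such image settings could map onto a swarm service, here is a minimal Python sketch that builds a request body for `POST /services/create`. The function name and parameters are hypothetical, and the field names (`TaskTemplate`, `Resources`, `Placement`, `Mode`) follow my reading of the Remote API 1.24 documentation, so they should be checked against the API version actually in use.

```python
from typing import Any, Dict, List, Optional


def build_service_spec(
    name: str,
    image: str,
    cpu_limit: Optional[float] = None,
    memory_limit_bytes: Optional[int] = None,
    cpu_reservation: Optional[float] = None,
    memory_reservation_bytes: Optional[int] = None,
    constraints: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a service spec with exactly one replica (hypothetical helper)."""

    def resources(cpus: Optional[float], memory: Optional[int]) -> Dict[str, int]:
        block: Dict[str, int] = {}
        if cpus is not None:
            # CPU amounts are expressed in billionths of a CPU.
            block["NanoCPUs"] = int(round(cpus * 1_000_000_000))
        if memory is not None:
            block["MemoryBytes"] = memory
        return block

    task_template: Dict[str, Any] = {"ContainerSpec": {"Image": image}}

    resource_spec: Dict[str, Any] = {}
    limits = resources(cpu_limit, memory_limit_bytes)
    reservations = resources(cpu_reservation, memory_reservation_bytes)
    if limits:
        resource_spec["Limits"] = limits
    if reservations:
        resource_spec["Reservations"] = reservations
    if resource_spec:
        task_template["Resources"] = resource_spec

    if constraints:
        # Example constraint string: "node.labels.type == build"
        task_template["Placement"] = {"Constraints": list(constraints)}

    return {
        "Name": name,
        "TaskTemplate": task_template,
        # One task per service, since scaling is not used here.
        "Mode": {"Replicated": {"Replicas": 1}},
    }
```

Only the fields that were actually configured end up in the body, so a template without limits or constraints produces the smallest possible spec.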
I could imagine the following override settings for the job-based configuration, but I'd think it would be good to think this over first:
- Override cpu, memory reservation/limit
- Override argument, and possibly command to execute
Also, from what I've read in the documentation, exposing ports might be more difficult than in the current version: from what I see, they need to be exposed explicitly (there is no nice "Publish All" flag). So this would require a random port generator and possibly a retry if the port conflicts with an existing service.
As an inspiration, the Kubernetes Plugin looks very similar to the likely solution:

As an inspiration, the Kubernetes Plugin looks very similar to the likely solution:
That looks the same :/
Master: Possibly: Specify more than one master for a single cloud? (Not sure if needed in the plugin itself)
It may make sense for the case where DockerClient throws an exception, but the net-split case would remain an open question.
Image configuration
Are they the same as for standard Docker? I can sync the API to the latest create/stop/remove features.
- Override cpu, memory reservation/limit
- Override argument, and possibly command to execute
It could be possible to extend JobProperty #72 in the future and require constraints.
As an inspiration, the Kubernetes Plugin looks very similar to the likely solution:
They have only reservations, limits, etc., and that will be solved by syncing the create command to the latest features as soon as https://github.com/docker-java/docker-java/pull/673 is added to docker-java.
@padyx can the standard Docker client work with a Docker engine that is in swarm mode?
can the standard Docker client work with a Docker engine that is in swarm mode?
Yes, but any containers started via the regular container API (/containers/create) will be created on that specific host - and not in the swarm. So we do need to call a different API if swarm mode is selected.
Image configuration
Not all of the features that we can currently configure for regular containers are available in swarm mode. Privileged and SHM-Size are two of the features that don't work with Docker 1.12 in swarm mode. I haven't made a full comparison yet.
I've seen a few related and promising looking pull requests over at docker-java (docker-java/docker-java#686, docker-java/docker-java#678, docker-java/docker-java#673). How long - if at all - do you think it will take to take advantage of those in this plugin? Is it on anyone's priority list?
@avandorp unfortunately I do both projects in my free time; in docker-java they are stuck because of integration tests. In docker-plugin it is a bit unclear how best to design the classes (I can add an additional checkbox in Cloud and Template, or subclass the classes, depending on the architecture design).
@KostyaSha Can we assist you in some way to get this moving? Help fix integration tests in docker-java, help with architecture sketches here, or something else?
Yes, sure. But swarm mode is not needed for this plugin as jenkins should have exact mapping. I think swarm cli is the only useful thing for orchestration.
But I may be mistaken... open for discussion.
But swarm mode is not needed for this plugin as jenkins should have exact mapping. I think swarm cli is the only useful thing for orchestration.
Could you elaborate on what you mean with this? I don't quite follow.
One of the swarm-mode features is scaling, but with Jenkins you can't use it without pre-creating Cloud objects on the Jenkins side. So it would mean creating a lot of single services, one for every job, which looks weird.
I see - my assumption for a possible solution was to use the Docker Remote API for services and to adapt the plugin to:
- For each starting job (cloud node provision): Spawn a swarm service with 1 task (the jenkins slave task)
- For each terminating job (cloud node unprovision): Stop and remove the swarm service
This would lead to creating and destroying services without taking advantage of scaling.
Is this what you thought, or do you see another option to support running jenkins jobs on Docker Swarms (with Swarm mode)? Or would you have increased the scaling of the Swarm service and connected to the "free" task created by the scaling?
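The per-job lifecycle described above (create a one-task service when a node is provisioned, remove it when the node is unprovisioned) can be sketched in Python as a context manager. `InMemorySwarm` is a hypothetical stand-in used only to make the sketch runnable; it is not a real Docker client, and the spec it receives is deliberately simplified.

```python
import itertools
from contextlib import contextmanager
from typing import Dict, Iterator


class InMemorySwarm:
    """Hypothetical stand-in for a swarm manager; it only tracks services."""

    def __init__(self) -> None:
        self.services: Dict[str, dict] = {}
        self._ids = itertools.count(1)

    def create_service(self, spec: dict) -> str:
        service_id = f"svc-{next(self._ids)}"
        self.services[service_id] = spec
        return service_id

    def remove_service(self, service_id: str) -> None:
        del self.services[service_id]


@contextmanager
def ephemeral_agent(swarm: InMemorySwarm, image: str, job_name: str) -> Iterator[str]:
    """Provision one service with one task for a job, then always clean up."""
    spec = {"Name": f"jenkins-{job_name}", "Image": image, "Replicas": 1}
    service_id = swarm.create_service(spec)
    try:
        yield service_id  # the job runs while the service exists
    finally:
        # Unprovision even if the job failed, so no service is left behind.
        swarm.remove_service(service_id)
```

With this shape, each job run owns exactly one service, and the number of services on the swarm equals the number of running jobs.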
So you will have a lot of similar services?
Or would you have increased the scaling of the Swarm service and connected to the "free" task created by the scaling?
It may be possible by listening to Docker events, but it would be too difficult, I think. In any case we can create experimental provisioning implementations and try!
So you will have a lot of similar services?
Yes, we'd have a lot of similar services if implemented that way, because we'd not care about scaling.
We checked the documentation and experimented with the remote API and our conclusions are:
- Using a single service per Docker image would not work:
  - The swarm does not return the id of the created task from the /services/.../update endpoint, leaving us with no option to identify which task was just created. Unless only a single Jenkins were to control the swarm, in which case we could theoretically compare the task list before and after the operation to identify the new task. But that would be a very unstable implementation.
- Using a single service per job run seems to work:
  - Use a GET request to /services to list all services, and identify already reserved ports
  - Generate random ports in the ephemeral range for all ports that need bindings
  - Use a POST request to /services/create to start a single service (replicas=1)
  - (If necessary and the port got taken in the meantime, repeat the steps above)
  - Use a GET request to /services/<serviceid>/ to identify the created task and connect to it
  - After the job completes: Use a DELETE request to /services/<serviceid> to remove the service
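The port selection and retry steps for the "single service per job run" approach can be sketched in Python as follows. This is only a sketch under assumptions: the `Endpoint.Ports[].PublishedPort` layout follows my reading of the service list response in the Remote API 1.24 docs, the range 49152-65535 is the IANA dynamic range taken as "ephemeral", and `PortConflict`, `list_services`, and `create_service` are hypothetical names for whatever the real client would provide.

```python
import random
from typing import Any, Callable, Dict, Iterable, List, Optional, Set

# Assumption: the IANA dynamic/private range is used as the "ephemeral range".
EPHEMERAL_RANGE = range(49152, 65536)


class PortConflict(Exception):
    """Hypothetical error raised when a published port was taken meanwhile."""


def published_ports(services: Iterable[Dict[str, Any]]) -> Set[int]:
    """Collect the ports already published by the listed services."""
    taken: Set[int] = set()
    for service in services:
        endpoint = service.get("Endpoint") or {}
        for port in endpoint.get("Ports") or []:
            if port.get("PublishedPort"):
                taken.add(port["PublishedPort"])
    return taken


def pick_free_ports(taken: Set[int], count: int,
                    rng: Optional[random.Random] = None) -> List[int]:
    """Pick `count` distinct random ports that are not in `taken`."""
    if rng is None:
        rng = random.Random()
    candidates = [p for p in EPHEMERAL_RANGE if p not in taken]
    if count > len(candidates):
        raise ValueError("not enough free ports in the ephemeral range")
    return rng.sample(candidates, count)


def create_with_retry(
    list_services: Callable[[], List[Dict[str, Any]]],
    create_service: Callable[[List[Dict[str, Any]]], str],
    target_ports: List[int],
    attempts: int = 5,
) -> str:
    """List services, choose ports, create the service; retry on conflict."""
    for _ in range(attempts):
        taken = published_ports(list_services())
        chosen = pick_free_ports(taken, len(target_ports))
        bindings = [
            {"Protocol": "tcp", "TargetPort": target, "PublishedPort": published}
            for target, published in zip(target_ports, chosen)
        ]
        try:
            return create_service(bindings)  # returns the new service id
        except PortConflict:
            continue  # another client grabbed a port; list and pick again
    raise RuntimeError("could not find free ports after retrying")
```

The list-then-create sequence is inherently racy when several clients share a swarm, which is why the conflict is handled by repeating the whole sequence rather than by trying to lock anything.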
We'd suggest the "single service per job" implementation. What is your opinion?
We'd suggest the "single service per job" implementation. What is your opinion?
Looks similar to existing logic. Now the question will be how code could be refactored... and how generic swarm could fit...
Without knowing at all how the plugin is structured today - that sounds like a strategy pattern. There'd be one strategy for normal use and one strategy for the swarm, depending which configuration was chosen.
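To make the strategy idea concrete, here is a minimal Python sketch. All class and function names are hypothetical and say nothing about how the plugin is actually structured; the methods only return a description of the Remote API call they would make instead of performing it.

```python
from abc import ABC, abstractmethod


class ProvisioningStrategy(ABC):
    """Hypothetical interface: how a build node is started and removed."""

    @abstractmethod
    def provision(self, image: str) -> str:
        ...

    @abstractmethod
    def unprovision(self, handle: str) -> str:
        ...


class ContainerStrategy(ProvisioningStrategy):
    """Regular mode: plain containers on the host the client talks to."""

    def provision(self, image: str) -> str:
        return f"POST /containers/create (image={image})"

    def unprovision(self, handle: str) -> str:
        return f"DELETE /containers/{handle}"


class SwarmServiceStrategy(ProvisioningStrategy):
    """Swarm mode: one service with a single task per job run."""

    def provision(self, image: str) -> str:
        return f"POST /services/create (image={image}, replicas=1)"

    def unprovision(self, handle: str) -> str:
        return f"DELETE /services/{handle}"


# The mode names are illustrative; they mirror a cloud-level "mode" setting.
STRATEGIES = {
    "single-host": ContainerStrategy,
    "swarm-mode": SwarmServiceStrategy,
}


def strategy_for(mode: str) -> ProvisioningStrategy:
    """Select the strategy from the configured mode."""
    try:
        return STRATEGIES[mode]()
    except KeyError:
        raise ValueError(f"unknown mode: {mode}") from None
```

The rest of the provisioning code would then depend only on `ProvisioningStrategy`, so adding swarm mode would not have to touch the existing container path.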
+1 supporting swarm mode would be great!
A small note related to this topic: I'm thinking about how best to implement two-level provisioning in Jenkins.
Swarm mode itself is not suitable for Jenkins. Classical swarm is the best choice. It exposes an API that could be used for balanced runs of slave containers and for building images. Swarm mode is mostly for running apps: run X containers, restart them. None of that is possible for Jenkins builds.
@KostyaSha From the Jenkins build perspective we can always spin up a new container for a build with replicas = 1. Swarm mode will then load balance internally and start the container on some host. People who already use swarm mode would have to run another classical swarm setup just for Jenkins builds. That would mean the overhead of maintaining two clusters. Supporting swarm mode would be very helpful.
+1
I would also like to see swarm mode support. I have raised a separate issue about how connecting the Cloud URL to a load balancer fails catastrophically, apparently because the launched container cannot be located on subsequent calls after create (the load balancer redirects the requests to different nodes). My next thought was that maybe I could connect the Cloud URL to a swarm master, since its internal service discovery knows where all the containers that relate to a service exist (in this case there would only ever be one). But of course, I suspect YADP would need to support the API calls to create swarm services rather than simple Docker containers.
In an enterprise setting not being able to scale to use multiple hosts associated to a single YADP Cloud is a significant problem. Sure we could have multiple Clouds but that doesn't really equal scalability and you are still left with a single point of failure of your singleton host.
@goffinf In the meantime you can switch to docker swarm, which keeps the docker API (non-service based) while still giving you a clustered docker installation. There wouldn't be any need for load balancers, as docker swarm already does that.
@witokondoria, thx for your comment. I might try that, although I am somewhat reluctant to use what is essentially a deprecated product.
I would probably keep the ELB since it allows the use of a CName (R53 recordset alias) and would abstract the physical IP of the swarm master.
@padyx @KostyaSha @adityacs What do you think the prospects are for supporting swarm mode in YADP (in the constrained way outlined in this issue, a single service per job) in the near term?
Certainly in the corporate space, everyone I come across is using a scheduler of one type or another and therefore needs to leverage the service abstraction (nothing says a service can't be a single container stack). So whilst scalability (and resilience) won't necessarily be achieved by starting multiple containers, being able to schedule individual Jenkins slave containers across a cluster of managed nodes still represents a significant improvement from the single point of failure that is the current situation.
This isn't a criticism of the work to-date which I'm sure we all appreciate very much, but I am certainly having a tough time persuading architects and solution designers where I work of the elegance of ephemeral slaves when they discover this limitation.
Like @padyx, I am more than happy to contribute in any way I can; maintaining multiple projects with lots of people asking for change, and trying to separate the high priorities from the nice-to-haves, can be a lonely place :-)
Kind Regards
Fraser.
YADP uses the https://github.com/docker-java/docker-java client for all docker operations. From the changelog (https://github.com/docker-java/docker-java/blob/master/CHANGELOG.md) I see that swarm mode is not yet officially supported in the docker-java client.
@goffinf The comment of adityacs is correct: first there would need to be an implementation of the Swarm APIs in docker-java. Another Java client would be https://github.com/spotify/docker-client, which already supports the Swarm APIs.
The changes themselves are likely not that big - refactoring the plugin to use different strategies would be required though. I have a very rough proof-of-concept Jenkins plugin using the Spotify APIs that successfully launches a service, executes a job and kills the service. (Currently not open sourced)
The major question for me is for @KostyaSha : Since this is your repository (and plugin), would you consider such a Swarm mode at all? If not, we'd probably have to create another plugin.
@padyx I am also in a situation that requires ephemeral build slaves launched across a Docker Swarm Cluster via short lived services. Ultimately, if I cannot find a solution to allow this functionality then I was going to roll my own plugin.
To restate the sentiment that has been expressed here multiple times, I absolutely appreciate what @KostyaSha has done with this plugin. I also understand the time involved to maintain and add features can be quite challenging, especially with multiple endeavors such as career and family taking their toll. So by all means I am not complaining in the slightest and absolutely understand the situation.
Rather, I'd like to figure out a plan like everyone else so that the future state allows for Jenkins to use modern Docker Swarm for Ephemeral Slaves. I know many of us would be more than eager to contribute directly to this project to allow for this capability.
I think it's important to get a definitive answer for when or if this plugin will ultimately support what we are after. If it's not in the cards or may be much longer down the road than we desire, then we either get approval to contribute to this plugin or perhaps band together to create a fork or a new project.
Since this feature is so important to many I could see it adding so much value that it'd be very popular. We all win as a community if we can band together and make this happen. Setting a plan in motion is the next step and I'd be happy to get involved.