docker-workflow-plugin
Remove user from docker run command
Fixes #25, removes the user from the docker run command. The changes from #25 were merged in and tested against the demo container. I also updated the Dockerfile in the demo, in preparation for the release, to support the new required minimum version (Docker client 1.7.0). During testing I found a bug in master with Docker client 1.7.0; the bug was introduced in this commit. The bug is that the redirection causes the exit status to be 1 when it should be 0 in some instances.
To work around that, I removed the output redirection, which restored the previous behavior. See the full commit message for the result of each command. Here is the result of testing the return value of the command (inside the demo container):
root@8a07680574ce:/tmp/files# docker --version
Docker version 1.7.0, build 0baf609
root@8a07680574ce:/tmp/files# docker inspect -f . maven:3.3.9-jdk-8
Error: No such image or container: maven:3.3.9-jdk-8
root@8a07680574ce:/tmp/files# echo $?
1
root@8a07680574ce:/tmp/files# docker inspect -f . maven:3.3.9-jdk-8 >&- 2>&-
root@8a07680574ce:/tmp/files# echo $?
1
# run the job to create the container
root@8a07680574ce:/tmp/files# docker inspect -f . maven:3.3.9-jdk-8
.
root@8a07680574ce:/tmp/files# echo $?
0
root@8a07680574ce:/tmp/files# docker inspect -f . maven:3.3.9-jdk-8 >&- 2>&-
root@8a07680574ce:/tmp/files# echo $?
1
root@8a07680574ce:/tmp/files#
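The failure is presumably because >&- 2>&- closes the descriptors outright, so docker inspect's own write of the '.' fails; redirecting to /dev/null instead should keep the real exit status. An untested sketch of that alternative (the PR itself just drops the redirection):
# Closing stdout/stderr makes docker inspect exit 1 even when the image exists:
docker inspect -f . maven:3.3.9-jdk-8 >&- 2>&-
echo $?   # 1
# Discarding the output instead preserves the real exit status:
docker inspect -f . maven:3.3.9-jdk-8 >/dev/null 2>&1
echo $?   # 0 when the image exists, non-zero when it does not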
This is the error that appeared in the Jenkins job:
[workspace] Running shell script
+ docker pull maven:3.3.9-jdk-8
3.3.9-jdk-8: Pulling from library/maven
Digest: sha256:a5283092002ec9739819c9b392203103346dfc163586ff46c30ead13b60cc436
Status: Image is up to date for maven:3.3.9-jdk-8
[Pipeline] withEnv
[Pipeline] {
[Pipeline] withDockerRegistry
[Pipeline] {
[Pipeline] stage (Build)
Entering stage Build
Proceeding
[Pipeline] sh
[workspace] Running shell script
+ docker inspect -f . maven:3.3.9-jdk-8
[Pipeline] sh
[workspace] Running shell script
+ docker pull localhost/maven:3.3.9-jdk-8
Pulling repository localhost/maven
Error: image maven not found
[Pipeline] }
[Pipeline] // withDockerRegistry
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: script returned exit code 1
Finished: FAILURE
Not sure what the process is here for notifying people of pull requests, so apologies in advance if this isn't the right mechanism.
@reviewbybees
cc: @ndeloof @jglick @svanoort
This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.
@jglick @ndeloof @svanoort @reviewbybees ping
Compared to #25 I can't see the user set when docker exec is run, which will cause issues with file permissions on the workspace.
Good catch, I'll get that added.
@ndeloof I added the --user change. I had some trouble verifying that it is actually doing what I expected (I couldn't directly see the docker exec command), but it seems to pass in the demo container and against some of my existing Jenkins jobs that use the plugin.
I performed an 'id' in the sh step and it was indeed the jenkins user (in my test instance; in the demo container the jenkins user is root). I tried to figure out how to make the 'docker exec' command it was running print to the console log like the docker run command does, but I was unsuccessful.
This may be a long-term thing (and probably a mailing list discussion), but I would like to see whether the core logic of this plugin and the Docker Custom Build Environment plugin can be merged so the results are consistent between the two. I currently use both, and their behavior is similar but different enough to be confusing.
LGTM
@antoniobeyah I would have expected the Docker Custom Build Environment plugin to provide the .inside extension to docker-pipeline and avoid such code duplication :-
Anyway, IMHO the future of docker+jenkins is docker-slaves-plugin.
@ndeloof thanks for the review, are there any additional required reviewers or can you merge it? I don't have merge rights, or else I would have merged based on your approval. I just want to get this off my plate as one less thing to check in on and monitor before someone else introduces conflicting changes to master. ;)
@antoniobeyah need to wait for plugin maintainer (@jglick)
So is this purporting to address JENKINS-34289 or not? And is there a self-contained demo case that can be used to prove that this change fixes some concrete bug? What I am missing from all this is a clear description of what problem is being solved.
I found a bug in master with docker client 1.7.0, the bug was introduced in #54. The bug is that the redirection is causing the exit status to be 1, when it should be 0 in some instances.
To workaround that, I removed the output redirection which restored previous behavior.
This sounds like a separate issue that would better be fixed in its own pull request. (If basic testing of this PR requires that fix, just merge in that branch.)
I performed an 'id' in the sh step and it was indeed the jenkins user (in my test instance, in the demo container the jenkins user is root).
This is an example of something you have apparently tested interactively but which is not reflected in an automated test, or even written manual test instructions.
I would like to see/determine if the core logic of this plugin and the Docker Custom Build Environment plugin can be merged
Given how fundamentally different Pipeline asynchronous (durable) process execution is from freestyle, I do not think this is practical.
@jglick both rely on docker exec in a very comparable way; I don't see any reason they couldn't share a common code base.
@jglick JENKINS-34289 was marked as resolved, so I ignored it for the purposes of this PR. Point taken on needing a clear explanation of why this is needed; I will try to get a test case written up that shows the failure.
I have a limited amount of time I can spend on this- we are running on a forked version at Target that has the changes in it (not ideal). Are you able to provide a bulleted list of items that you require for a merge? So far I have gathered:
- Move the master bug fix to a separate PR
- Provide a test showing the thing this PR is intended to fix
- (breaks backwards compatibility) Remove the 'user' parameter from DockerClient#run
- Move dockerClient.whoAmI() to occur before the invocation
Are there any additional changes you would require?
Move the master bug fix to a separate PR
Well, desirable practice generally, so that the obvious bug fix can get to users while we are discussing the main part of this PR. Can be skipped if you do not have time.
Provide a test showing the thing this PR is intended to fix
Yes, manual test steps if it is impractical to write an automated test (for example because you need to be using a specific user ID on the host).
breaks backwards compatibility
It is fine, DockerClient is really just an internal utility for this plugin.
Move dockerClient.whoAmI() to occur before the invocation
Right, safer.
Are there any additional changes you would require?
No, the test—or just some demonstration that this is fixing a particular bug I can see for myself—is the main thing.
I do not understand why removing the user constraint put on a container for docker.inside is under consideration. The docker.inside mechanism is for injecting a workspace into a container so that it may act upon it. If a container cannot run as the specified user then its image is simply not fit for this purpose. After injecting the workspace into the container to mutate it, I expect that follow-on processes will be able to further mutate it as well. Choosing to forgo the specification of the user that the container should run as breaks this very fundamental expectation.
@carlossg pointed me here from carlossg/docker-maven#16, and yes this would fix the problem but at the cost of a good deal more downstream breakage. What this PR purports to fix, that being #25, seems of dubious value to me. I think it would make more sense to leave the default behavior of specifying the user as-is unless the image specified for docker.inside is specially packaged, say with labels, to support advanced functionality. This could be as simple as having a label such as io.jenkins.pipeline.docker.inside.user=root on the container. This would not, of course, cover the desired contract that the resulting workspace can still be written to by the invoking user but given that docker.inside is already docker-aware I imagine it could perform some nifty post processing (or possibly tricksy user namespace remapping) to have the output workspace owned and writable by the invoking user, as would be expected.
/cc @ndeloof @jglick
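Sketching that opt-in read (the label name is only the hypothetical one suggested above), the plugin could check the image with a plain docker inspect before deciding whether to pass -u to docker run:
# Hypothetical opt-in label; empty/no value -> keep today's behaviour, 'root' -> start the container as root
docker inspect -f '{{ index .Config.Labels "io.jenkins.pipeline.docker.inside.user" }}' maven:3.2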
@dweomer The constraint isn't being removed, it is being moved to the exec command. This will allow the container to start up as expected, without unintended side effects.
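Roughly, at the CLI level (the uid:gid and image are illustrative; 1000:1000 stands in for whatever dockerClient.whoAmI() returns):
# Current master: the jenkins uid:gid is forced when the container starts,
# so the entrypoint never runs as the user the image was built for.
cid=$(docker run -t -d -u 1000:1000 --entrypoint cat maven:3.2)
docker exec "$cid" id                 # uid=1000 gid=1000
# With this PR: the container starts with the image's own user, and the
# jenkins uid:gid is applied only to the exec that runs the build steps.
cid=$(docker run -t -d --entrypoint cat maven:3.2)
docker exec -u 1000:1000 "$cid" id    # build steps still run as uid=1000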
I've been slow on getting this pr updated with the requests, but this is our use case:
We use docker images for CI testing of RPM builds. The RPMs happen to use systemd, which means the entrypoint is 'init', which must be run as root; otherwise the container blows up. All subsequent commands are executed as the jenkins user, and that is supported by 'docker exec --user'.
This behavior is already supported by other plugins, such as Docker Custom Build Environment plugin. We are running a forked version of the docker-pipeline plugin without the docker run --user constraint and it is working great.
This PR only allows for the possibility of using additional containers without needing to custom-tailor them for use within Jenkins. I would also be open to a different mechanism for determining the user to run .inside as, but as it stands it works 'well enough' with the changes in this PR.
@dweomer Will this PR break your environment somehow? If so I may be able to add in some checks to make that less likely, just need some specific details on your concerns.
@jglick As far as the automated test- I don't think I will be able to write one due to the reasons you provided (the user in the host machine is root, my test requires it to be a non-root user). Can you provide more details on how specifically I can provide manual verification steps?
Can you provide more details on how specifically I can provide manual verification steps?
Well, something I can run locally (perhaps inside VirtualBox if host permissions matter) that would let me understand the problem for myself and evaluate how the proposed fix works. Minimally, a demo Pipeline script relying only on images on Docker Hub (can use .build as needed to create derivative images). If I can reproduce it manually, I at least have a shot at adding an automated test as well.
Will this PR break your environment somehow? If so I may be able to add in some checks to make that less likely
If there is any concern about this, I would rather make the new behavior opt-in.
@antoniobeyah wrote:
Will this PR break your environment somehow? If so I may be able to add in some checks to make that less likely, just need some specific details on your concerns.
In my ire related to carlossg/docker-maven#16, I had forgotten about the run/exec distinction. This should not break environments but I feel that you should ensure that by implementing a test case such as:
- pull down well known or sample project
- build it via docker(img).inside as appropriate (for each of golang, maven, and possibly node/npm?)
- wipe out the entire workspace without encountering write permission failures
@antoniobeyah @jglick:
My other qualm with this is that I do not want my build containers to run as root unless I explicitly tell them to via code (as that implies pipeline code, which is something that can be audited). Currently most official images for build systems such as golang and maven run as root by default (which is correct, because these are very low-level base images), and this is currently overridden by docker.inside by default to match the user:group of the process that invokes the in-container build.
In short, defaulting to the image's USER introduces a potential security hole as a side-effect which is not acceptable. Opt-in semantics would address this issue.
@antoniobeyah after re-reading your use case I don't see that you need to make any code changes at all; you just need to specify the user a la docker.image('rpm-with-systemd-tester').inside('--user root:root'). The below example works on both a Jenkins 1.x LTS and a 2.x LTS instance:
timestamps { -> node {
    stage ('DEFAULT') {
        docker.image('maven:3.2').inside('--group-add 999') {
            sh 'id'
        }
    }
    stage ('ROOT') {
        docker.image('maven:3.2').inside('--user root:root') {
            sh 'id'
        }
    }
}}
with output that looks like:
Started by user admin
[Pipeline] timestamps
[Pipeline] {
[Pipeline] node
00:38:05 Running on master in /var/jenkins_home/workspace/maven-3-point-2
[Pipeline] {
[Pipeline] stage
[Pipeline] { (DEFAULT)
[Pipeline] sh
00:38:05 [maven-3-point-2] Running shell script
00:38:05 + docker inspect -f . maven:3.2
00:38:05 .
[Pipeline] withDockerContainer
00:38:05 $ docker run -t -d -u 1000:1000 --group-add 999 -w /var/jenkins_home/workspace/maven-3-point-2 --volumes-from dc8e94e8e3eaa302a91ebb2458fd093ffc9287e3992ee37cbe2e5392361b5532 -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** --entrypoint cat maven:3.2
[Pipeline] {
[Pipeline] sh
00:38:06 [maven-3-point-2] Running shell script
00:38:07 + id
00:38:07 uid=1000 gid=1000 groups=1000,999
[Pipeline] }
00:38:07 $ docker stop 2e1c73c55e2d4b73461f40ba14f4df7fdeece2ecb0075f1ce546cbc6e261ad28
00:38:17 $ docker rm -f 2e1c73c55e2d4b73461f40ba14f4df7fdeece2ecb0075f1ce546cbc6e261ad28
[Pipeline] // withDockerContainer
[Pipeline] }
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (ROOT)
[Pipeline] sh
00:38:17 [maven-3-point-2] Running shell script
00:38:18 + docker inspect -f . maven:3.2
00:38:18 .
[Pipeline] withDockerContainer
00:38:18 $ docker run -t -d -u 1000:1000 --user root:root -w /var/jenkins_home/workspace/maven-3-point-2 --volumes-from dc8e94e8e3eaa302a91ebb2458fd093ffc9287e3992ee37cbe2e5392361b5532 -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** --entrypoint cat maven:3.2
[Pipeline] {
[Pipeline] sh
00:38:19 [maven-3-point-2] Running shell script
00:38:19 + id
00:38:19 uid=0(root) gid=0(root) groups=0(root)
[Pipeline] }
00:38:19 $ docker stop 34e4441d70587203e1b0ad047e4c6363b33588d36f9e7535eb00200255ecf54c
00:38:30 $ docker rm -f 34e4441d70587203e1b0ad047e4c6363b33588d36f9e7535eb00200255ecf54c
[Pipeline] // withDockerContainer
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] End of Pipeline
Finished: SUCCESS
The trick above is that docker run seems to be forgiving of multiple --user specifications and the last one in wins. I don't run RPM based images/systems and so if you are using RedHat's docker fork this may not be supported but it seems worth a try.
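A quick way to check that on a given daemon (output will depend on the Docker version; --entrypoint id just sidesteps the image's entrypoint script):
# If the last --user wins, this prints uid=0(root) despite the earlier -u flag:
docker run --rm -u 1000:1000 --user root:root --entrypoint id maven:3.2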
you just need to specify the user a la
docker.image('rpm-with-systemd-tester').inside('--user root:root')
Interesting, though it would be nicer to have this be an advertised option.
JENKINS-38438 was recently filed, which sounds like it is on the same topic, yet switching the --user from docker-run to docker-exec does not make any difference in that case: either way, the command fails.
Some containers come with pre-set users, etc. The automatic adding of volumes and users can be disruptive and hard for users to troubleshoot.
Can we split a plain docker environment out from the mount everything as a workspace version?
Or should everyone fall back to sh 'docker...'?
@jglick I have rebased the branch and added test cases to demonstrate the use case for this PR. I also moved dockerClient.whoAmI() to occur before the container run; I didn't add any guards to it because whoAmI will throw an exception on error. I ran the tests, but they fail, and they all appear to be set up as 'skip' when run through Maven, so I am assuming there is some trick to getting them to pass. The error I get is below. The latest entrypoint changes on master also cause an issue: the start command (entrypoint) for my container must be '/usr/sbin/init', but that is currently overridden by 'cat' due to entrypoint positioning. I am looking to see if there is anything I can do to prevent that; I will likely move the default arguments to come before the user-defined ones.
The test cases don't exactly illustrate what's happening below; I will attempt to get them closer to the manual testing steps I performed.
The differences:
- --entrypoint is 'cat' in the tests due to latest commits in master
- The tests rely on the fact that they are currently executing as root (dockerClient.whoAmI() will return 0:0), so the tests should work, but they are not portable if the user running the tests is not root
Manual steps:
Successful Case (no --user on docker run)
-> docker run -t -d -v /sys/fs/cgroup:/sys/fs/cgroup --privileged --entrypoint /usr/sbin/init centos/systemd
1520faa10bcc31295a9ee08b8c29bf71b13f246a17296619b6a38bcfcfc3b340
ruby-2.3.1 ~ $
-> docker exec -it 1520faa10bcc31295a9ee08b8c29bf71b13f246a17296619b6a38bcfcfc3b340 bash
[root@1520faa10bcc /]# systemctl status
● 1520faa10bcc
State: running
Jobs: 0 queued
Failed: 0 units
Since: Sat 2016-09-24 18:43:08 UTC; 13s ago
CGroup: /
├─ 1 /usr/sbin/init
├─17 bash
├─30 systemctl status
├─31 systemctl status
└─system.slice
└─systemd-journald.service
└─16 /usr/lib/systemd/systemd-journald
[root@1520faa10bcc /]#
Failing Case (arbitrary --user on docker run)
-> docker run -t -d -v /sys/fs/cgroup:/sys/fs/cgroup --privileged --entrypoint /usr/sbin/init --user 1234:1234 centos/systemd
68c891f1058439740d6fb674adf66f1ee8a46dff185dde11069a90ca93c1cb4c
ruby-2.3.1 ~ $
-> docker exec -it --user 0 68c891f1058439740d6fb674adf66f1ee8a46dff185dde11069a90ca93c1cb4c bash
[root@68c891f10584 /]# systemctl status
Failed to get D-Bus connection: Operation not permitted
[root@68c891f10584 /]#
=== Starting withExecAsUser(org.jenkinsci.plugins.docker.workflow.WithContainerStepTest)
[prj #1] Started
Sep 24, 2016 1:32:45 PM hudson.ExtensionFinder$GuiceFinder$FaultTolerantScope$1 error
WARNING: Failed to instantiate optional component org.jenkinsci.plugins.workflow.steps.scm.SubversionStep$DescriptorImpl; skipping
[prj #1] [Pipeline] node
[prj #1] Running on master in /var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj
[prj #1] [Pipeline] {
[prj #1] [Pipeline] withDockerContainer
[prj #1] $ docker run -t -d -v /sys/fs/cgroup:/sys/fs/cgroup:ro --privileged -w /var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj -v /var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj:/var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj:rw -v /var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj@tmp:/var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj@tmp:rw -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** --entrypoint cat centos/systemd
[prj #1] [Pipeline] {
[prj #1] [Pipeline] sh
[prj #1] [prj] Running shell script
[prj #1] sh: /var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj@tmp/durable-18bae9d0/pid: No such file or directory
[prj #1] sh: /var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj@tmp/durable-18bae9d0/jenkins-log.txt: No such file or directory
[prj #1] sh: /var/folders/wl/kkhnc8fx0zq1lj1_kqd1bszc0000gp/T/junit4999253668053292568/junit6145851350636653994/workspace/prj@tmp/durable-18bae9d0/jenkins-result.txt: No such file or directory
[prj #1] [Pipeline] }
[prj #1] $ docker stop 323ff2f8568abe5aacea287c263290428a37c90abacbfa60862d6ff82c419f5e
[prj #1] $ docker rm -f 323ff2f8568abe5aacea287c263290428a37c90abacbfa60862d6ff82c419f5e
[prj #1] [Pipeline] // withDockerContainer
Sep 24, 2016 1:33:13 PM org.jenkinsci.plugins.workflow.job.WorkflowRun finish
INFO: prj #1 completed: FAILURE
[prj #1] [Pipeline] }
[prj #1] [Pipeline] // node
[prj #1] [Pipeline] End of Pipeline
[prj #1] ERROR: script returned exit code -2
[prj #1] Finished: FAILURE
@dweomer That sounds like it may fit your use case, but not mine. I want the containers to start up as they were built, not with an arbitrary override. I need the distinction of run vs exec due to us not running our jenkins agents as root (we have a jenkins user).
@docwhat I agree with you, but I think it may be best for you to open your own issue for that discussion. I would like to prevent this PR (which I feel addresses an immediate need of myself and the others who have commented) from getting bogged down in a design discussion. It would certainly stop the assumptions of the plugin from interfering with the container.
@antoniobeyah wrote:
@dweomer That sounds like it may fit your use case, but not mine. I want the containers to start up as they were built, not with an arbitrary override. I need the distinction of run vs exec due to us not running our jenkins agents as root (we have a jenkins user).
Specifically:
I want the containers to start up as they were built, not with an arbitrary override
This tells me that vanilla docker.inside isn't a good fit for your use case and that docker.run combined with explicit sh "docker exec -it $cid some-useful-command" is a better fit.
Also:
That sounds like it may fit your use case, but not mine.
I believe that if it weren't for the advent of the --entrypoint cat (a recent-ish addition, I think) from docker.inside, it would work for you without issue, as my example above shows. This again speaks to vanilla docker.inside being a poor fit for your use case. Have you tried something like:
docker.image('centos/systemd').inside('--privileged --user root:root --volume /sys/fs/cgroup:/sys/fs/cgroup') {
    sh '/usr/sbin/init &'
    sh '# useful work goes here'
}
Well, wait. Of course that will not work because your image's entry-point must needs be PID 1. Sounds like docker.inside simply needs to support overrides for --user and --entrypoint and then with a little work and no custom plugin your scripts will work AND my scripts will continue to work with the already established, herein unmodified, implicit build-time contract. What I am proposing might look like:
docker.image('centos/systemd').user('root').entrypoint('/usr/sbin/init').inside('--privileged --volume /sys/fs/cgroup:/sys/fs/cgroup') {
    sh '# useful work goes here'
}
I appreciate that you do not wish your PR to get bogged down in design discussion but your changes, without an opt-in (aka moot the arbitrary override with a specified one), will result in just about any Jenkinsfile that expects to be able to write to its workspace after executing docker.inside to fail horribly. If, after review, @jglick and the other CloudBees folks decide to break such an important, albeit implicit, contract I will be sorely disappointed but at least it will have gone through said review.
@jglick wrote:
Interesting, though it would be nicer to have this be an advertised option.
I very much agree.
JENKINS-38438 was recently filed, which sounds like it is on the same topic, yet switching the --user from docker-run to docker-exec does not make any difference in that case: either way, the command fails.
JENKINS-38438 is a result of tools like ssh ignoring the value of $HOME and instead forcing de-reference of the user's home directory from /etc/passwd which fails miserably if there is no entry for the runtime UID. One assumes there is an underlying security reason for this behavior from ssh. This hints at missing support, in docker.inside and similar, for automagic setup of an "arbitrary" user, in this default case the user invoking the "build".
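A minimal way to check whether a given image/uid combination is affected (untested sketch; 1000 stands in for the invoking user's uid, and maven:3.2 is just the image used above):
# ssh and friends resolve the home directory from /etc/passwd rather than $HOME,
# so the lookup fails when the runtime uid has no passwd entry in the image.
docker run --rm -u 1000:1000 --entrypoint sh maven:3.2 -c \
  'getent passwd "$(id -u)" || echo "no passwd entry for uid $(id -u)"'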
@antoniobeyah, I wrote:
I appreciate that you do not wish your PR to get bogged down in design discussion but your changes, without an opt-in (aka moot the arbitrary override with a specified one), will result in just about any Jenkinsfile that expects to be able to write to its workspace after executing docker.inside to fail horribly. If, after review, @jglick and the other CloudBees folks decide to break such an important, albeit implicit, contract I will be sorely disappointed but at least it will have gone through said review.
... and after reviewing your code changes, this part of what I wrote is simply wrong (bolded above):
but your changes, without an opt-in (aka moot the arbitrary override with a specified one), will result in just about any Jenkinsfile that expects to be able to write to its workspace after executing docker.inside to fail horribly
As you pointed out, however, you still need support for a custom --entrypoint which seems to fit well with supporting a custom --user.