jenkins
Provisioning hangs for jenkins_jnlp_slave on runit service
Cookbook version
4.1.2 (runit cookbook versions 3.0.0 and 1.8.0)
Chef-client version
12.15.19
Platform Details
Ubuntu Trusty 14.04.5 (AWS image ami-ca5003dd)
Scenario:
Provisioning a new node hangs when the cookbook executes jenkins_jnlp_slave, at ruby_block[wait for jenkins-slave service socket] action run.
Steps to Reproduce:
Create a cookbook recipe that calls the jenkins resource jenkins_jnlp_slave. Ensure that runit is not already installed on the host and that the runit::default recipe is not explicitly called by any cookbook on that host's run list.
Expected Result:
The jenkins resource creates the slave node and continues on
Actual Result:
The jenkins resource hangs forever on ruby_block[wait for jenkins-slave service socket] action run. After killing the chef run, this is the output and stack trace:
Recipe: <Dynamically Defined Resource>
* remote_file[/var/chef/cache/jenkins-cli.jar] action create (up to date)
* remote_file[/var/chef/cache/update-center.json] action create (up to date)
* file[/var/chef/cache/extracted-update-center.json] action create (up to date)
* file[/var/chef/cache/jenkins-key] action create (up to date)
* directory[/mnt/jenkins] action create (up to date)
* group[jenkins] action create (up to date)
* linux_user[jenkins] action create (up to date)
* directory[/mnt/jenkins] action create (up to date)
* remote_file[/mnt/jenkins/slave.jar] action create (up to date)
* runit_service[jenkins-slave] action enable
* ruby_block[restart_service] action nothing (skipped due to action :nothing)
* ruby_block[restart_log_service] action nothing (skipped due to action :nothing)
* directory[/etc/sv/jenkins-slave] action create (up to date)
* template[/etc/sv/jenkins-slave/run] action create (up to date)
* directory[/etc/sv/jenkins-slave/log] action create (up to date)
* directory[/etc/sv/jenkins-slave/log/main] action create (up to date)
* directory[/var/log/jenkins-slave] action create (up to date)
* template[/etc/sv/jenkins-slave/log/config] action create (up to date)
* link[/var/log/jenkins-slave/config] action create (up to date)
* template[/etc/sv/jenkins-slave/log/run] action create (up to date)
* directory[/etc/sv/jenkins-slave/env] action create (up to date)
* ruby_block[zap extra env files for jenkins-slave service] action run (skipped due to only_if)
* template[/etc/sv/jenkins-slave/check] action create (skipped due to only_if)
* template[/etc/sv/jenkins-slave/finish] action create (skipped due to only_if)
* directory[/etc/sv/jenkins-slave/control] action create (up to date)
* link[/etc/init.d/jenkins-slave] action create (up to date)
* file[/etc/sv/jenkins-slave/down] action nothing (skipped due to action :nothing)
* directory[/etc/service] action create (up to date)
* link[/etc/service/jenkins-slave] action create (up to date)
* ruby_block[wait for jenkins-slave service socket] action run^C
================================================================================
Error executing action `run` on resource 'ruby_block[wait for jenkins-slave service socket]'
================================================================================
SystemExit
----------
exit
Cookbook Trace:
---------------
/var/chef/cache/cookbooks/runit/libraries/helpers.rb:79:in `sleep'
/var/chef/cache/cookbooks/runit/libraries/helpers.rb:79:in `wait_for_service'
/var/chef/cache/cookbooks/runit/libraries/provider_runit_service.rb:291:in `block (3 levels) in <class:RunitService>'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:78:in `run_action'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `block (2 levels) in converge'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `each'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `block in converge'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:105:in `converge'
/var/chef/cache/cookbooks/jenkins/libraries/slave_jnlp.rb:69:in `block in <class:JenkinsJnlpSlave>'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:78:in `run_action'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `block (2 levels) in converge'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `each'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `block in converge'
/var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:105:in `converge'
Resource Declaration:
---------------------
# In /var/chef/cache/cookbooks/runit/libraries/provider_runit_service.rb
289: ruby_block "wait for #{new_resource.service_name} service socket" do
290: block do
291: wait_for_service
292: end
293: action :run
294: end
295:
Compiled Resource:
------------------
# Declared in /var/chef/cache/cookbooks/runit/libraries/provider_runit_service.rb:289:in `block in <class:RunitService>'
ruby_block("wait for jenkins-slave service socket") do
action [:run]
retries 0
retry_delay 2
default_guard_interpreter :default
block_name "wait for jenkins-slave service socket"
declared_type :ruby_block
cookbook_name "dd-jenkins"
block #<Proc:0x000000059e3c70@/var/chef/cache/cookbooks/runit/libraries/provider_runit_service.rb:290>
end
Platform:
---------
x86_64-linux
While checking whether the runit service had been started, I noticed that the run was waiting on the service but runit hadn't even been installed yet:
$ sv
The program 'sv' is currently not installed. To run 'sv' please ask your administrator to install the package 'runit'
So, I don't think the include_recipe 'runit' in service_resource is getting called in time to install runit: https://github.com/chef-cookbooks/jenkins/blob/v4.1.2/libraries/slave_jnlp.rb#L232
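The trace shows the run parked in `sleep` inside `wait_for_service`. As a rough illustration of why that never returns here, the sketch below assumes the wait polls for a file that only a running runit supervisor would create, and shows how a deadline would turn the hang into an error. `wait_for_pipe`, its `timeout` and `interval` parameters, and the `supervise/ok` path are hypothetical names for illustration, not the runit cookbook's actual code.

```ruby
require 'tmpdir'

# Hypothetical helper, not part of the runit cookbook: poll until `path`
# is a named pipe, and raise once `timeout` seconds have passed.
def wait_for_pipe(path, timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until File.pipe?(path)
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "timed out after #{timeout}s waiting for #{path}"
    end
    sleep interval
  end
  true
end

# With no supervisor installed, nothing ever creates the pipe. An unbounded
# loop would sleep forever; the bounded version raises instead.
Dir.mktmpdir do |dir|
  missing = File.join(dir, 'supervise', 'ok') # illustrative path
  begin
    wait_for_pipe(missing, timeout: 0.3)
  rescue RuntimeError => e
    puts e.message
  end
end
```

A failure of this kind would at least point at the missing supervisor rather than leaving the run stuck.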
I can confirm that calling include_recipe 'runit' immediately before calling the jenkins_jnlp_slave resource fixes this problem.
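For anyone hitting the same hang, the workaround can be sketched as a recipe in a wrapper cookbook. The resource name 'builder' is hypothetical, and remote_fs is set to /mnt/jenkins only because that is the directory shown in the output above. The wrapper cookbook's metadata.rb would presumably also need to declare `depends 'runit'`.

```ruby
# recipes/default.rb in a wrapper cookbook (illustrative)

# Install runit up front so that `sv` and the supervisor exist before
# jenkins_jnlp_slave enables its runit_service.
include_recipe 'runit'

# 'builder' is a hypothetical slave name; adjust properties as needed.
jenkins_jnlp_slave 'builder' do
  remote_fs '/mnt/jenkins'
end
```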
Related to #531