Provisioning hangs for jenkins_jnlp_slave on runit service

Open sethrosenblum opened this issue 8 years ago • 0 comments

Cookbook version

4.1.2 (runit cookbook versions 3.0.0 and 1.8.0)

Chef-client version

12.15.19

Platform Details

Ubuntu Trusty 14.04.5 (AWS image ami-ca5003dd)

Scenario:

Provisioning a new node hangs when the cookbook executes jenkins_jnlp_slave. The run stops at ruby_block[wait for jenkins-slave service socket] action run and never proceeds.

Steps to Reproduce:

Create a cookbook recipe that calls the jenkins resource jenkins_jnlp_slave. Ensure that runit is not previously installed on the host and that the runit::default recipe is not explicitly called by any cookbook on that host's run list.

Expected Result:

The jenkins resource creates the slave node and the Chef run continues.

Actual Result:

The jenkins resource hangs forever on ruby_block[wait for jenkins-slave service socket] action run. Interrupting the Chef run with Ctrl-C produces this stack trace:

  Recipe: <Dynamically Defined Resource>
    * remote_file[/var/chef/cache/jenkins-cli.jar] action create (up to date)
    * remote_file[/var/chef/cache/update-center.json] action create (up to date)
    * file[/var/chef/cache/extracted-update-center.json] action create (up to date)
    * file[/var/chef/cache/jenkins-key] action create (up to date)
    * directory[/mnt/jenkins] action create (up to date)
    * group[jenkins] action create (up to date)
    * linux_user[jenkins] action create (up to date)
    * directory[/mnt/jenkins] action create (up to date)
    * remote_file[/mnt/jenkins/slave.jar] action create (up to date)
    * runit_service[jenkins-slave] action enable
      * ruby_block[restart_service] action nothing (skipped due to action :nothing)
      * ruby_block[restart_log_service] action nothing (skipped due to action :nothing)
      * directory[/etc/sv/jenkins-slave] action create (up to date)
      * template[/etc/sv/jenkins-slave/run] action create (up to date)
      * directory[/etc/sv/jenkins-slave/log] action create (up to date)
      * directory[/etc/sv/jenkins-slave/log/main] action create (up to date)
      * directory[/var/log/jenkins-slave] action create (up to date)
      * template[/etc/sv/jenkins-slave/log/config] action create (up to date)
      * link[/var/log/jenkins-slave/config] action create (up to date)
      * template[/etc/sv/jenkins-slave/log/run] action create (up to date)
      * directory[/etc/sv/jenkins-slave/env] action create (up to date)
      * ruby_block[zap extra env files for jenkins-slave service] action run (skipped due to only_if)
      * template[/etc/sv/jenkins-slave/check] action create (skipped due to only_if)
      * template[/etc/sv/jenkins-slave/finish] action create (skipped due to only_if)
      * directory[/etc/sv/jenkins-slave/control] action create (up to date)
      * link[/etc/init.d/jenkins-slave] action create (up to date)
      * file[/etc/sv/jenkins-slave/down] action nothing (skipped due to action :nothing)
      * directory[/etc/service] action create (up to date)
      * link[/etc/service/jenkins-slave] action create (up to date)
      * ruby_block[wait for jenkins-slave service socket] action run^C

        ================================================================================
        Error executing action `run` on resource 'ruby_block[wait for jenkins-slave service socket]'
        ================================================================================

        SystemExit
        ----------
        exit

        Cookbook Trace:
        ---------------
        /var/chef/cache/cookbooks/runit/libraries/helpers.rb:79:in `sleep'
        /var/chef/cache/cookbooks/runit/libraries/helpers.rb:79:in `wait_for_service'
        /var/chef/cache/cookbooks/runit/libraries/provider_runit_service.rb:291:in `block (3 levels) in <class:RunitService>'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:78:in `run_action'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `block (2 levels) in converge'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `each'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `block in converge'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:105:in `converge'
        /var/chef/cache/cookbooks/jenkins/libraries/slave_jnlp.rb:69:in `block in <class:JenkinsJnlpSlave>'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:78:in `run_action'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `block (2 levels) in converge'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `each'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:106:in `block in converge'
        /var/chef/cache/cookbooks/compat_resource/files/lib/chef_compat/monkeypatches/chef/runner.rb:105:in `converge'

        Resource Declaration:
        ---------------------
        # In /var/chef/cache/cookbooks/runit/libraries/provider_runit_service.rb

        289:         ruby_block "wait for #{new_resource.service_name} service socket" do
        290:           block do
        291:             wait_for_service
        292:           end
        293:           action :run
        294:         end
        295: 

        Compiled Resource:
        ------------------
        # Declared in /var/chef/cache/cookbooks/runit/libraries/provider_runit_service.rb:289:in `block in <class:RunitService>'

        ruby_block("wait for jenkins-slave service socket") do
          action [:run]
          retries 0
          retry_delay 2
          default_guard_interpreter :default
          block_name "wait for jenkins-slave service socket"
          declared_type :ruby_block
          cookbook_name "dd-jenkins"
          block #<Proc:0x000000059e3c70@/var/chef/cache/cookbooks/runit/libraries/provider_runit_service.rb:290>
        end

        Platform:
        ---------
        x86_64-linux

While checking whether the runit service had been started, I noticed that Chef was waiting on the service even though runit hadn't been installed yet:

$ sv
The program 'sv' is currently not installed. To run 'sv' please ask your administrator to install the package 'runit'

So, I don't think the include_recipe 'runit' in service_resource is getting called in time to install runit: https://github.com/chef-cookbooks/jenkins/blob/v4.1.2/libraries/slave_jnlp.rb#L232

I can confirm that calling include_recipe 'runit' immediately before the jenkins_jnlp_slave resource fixes this problem.
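
The workaround described above might look like the following recipe. This is a minimal sketch rather than a confirmed fix in the cookbook itself. The slave name `builder` is hypothetical, `remote_fs` is set to `/mnt/jenkins` only because that path appears in the log output, and the wrapper cookbook's metadata.rb is assumed to declare dependencies on both `jenkins` and `runit`.

```ruby
# Install runit explicitly so the sv tooling is present on the node
# before the slave's runit_service is enabled.
include_recipe 'runit'

# Hypothetical slave definition; adjust the name and properties as needed.
jenkins_jnlp_slave 'builder' do
  remote_fs '/mnt/jenkins'
end
```

Because include_recipe here is evaluated at the top level of the recipe, the runit resources are added to the resource collection ahead of the slave resource, so they should converge first.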

Related to #531

sethrosenblum, Nov 07 '16 19:11