Retcode not properly forwarded through Skipped services

Open fihuer opened this issue 7 years ago • 4 comments

Hi, here's a bug report about an incorrect return code when a requirement of a skipped service fails.

Cheers, Romain F.

Configuration file

---
services:
  entry_point:
     desc: "Dummy entry point"
     require: [ action_successful_on_node2000 ]
     target: "node2000"
     actions:
        'status':
           cmd: /bin/true

  action_successful_on_node2000:
     desc: "Action only successful on node2000"
     target: "@contains_node2000"
     actions:
         'status':
           cmd: '[[ $HOSTNAME == "node2000" ]]'

Expected behavior

  • milkcheck -n node2000 entry_point status should return 0
  • milkcheck -n node2000 status should return 0
  • milkcheck -n node2001 entry_point status should return 6
  • milkcheck -n node2001 status should return 6

Current behavior

  • [:+1:] milkcheck -n node2000 entry_point status returns 0
  • [:+1:] milkcheck -n node2000 status returns 0
  • [:-1:] milkcheck -n node2001 entry_point status returns 0
Debug output
# milkcheck --debug -c . -n node2001 entry_point status
[11:41:13] DEBUG    - Configuration
nodeps: False
dryrun: False
verbosity: 5
only_nodes: node2001
summary: False
fanout: 128
reverse_actions: ['stop']
debug: True
config_dir: .
status action_successful_on_node2000 on node2001
 > [[ $HOSTNAME == "node2000" ]]
status action_successful_on_node2000 ran in 0.21 s
 > node2001 exited with 1
action_successful_on_node2000 - Action only successful on node2000                                         [  ERROR  ]
entry_point - Dummy entry point                                                                            [ SKIPPED ]
# echo $?
0
  • [:-1:] milkcheck -n node2001 status returns 0
Debug output
# milkcheck --debug -c . -n node2001 status
[11:41:56] DEBUG    - Configuration
nodeps: False
dryrun: False
verbosity: 5
only_nodes: node2001
summary: False
fanout: 128
reverse_actions: ['stop']
debug: True
config_dir: .
status action_successful_on_node2000 on node2001
 > [[ $HOSTNAME == "node2000" ]]
status action_successful_on_node2000 ran in 0.21 s
 > node2001 exited with 1
action_successful_on_node2000 - Action only successful on node2000                                         [  ERROR  ]
entry_point - Dummy entry point                                                                            [ SKIPPED ]
# echo $?
0

Details

# milkcheck --version
milkcheck 1.0
# rpm -q milkcheck
milkcheck-1.0-4.el7.centos.noarch
# uname -a
Linux 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
# facter os
{
  architecture => "x86_64",
  family => "RedHat",
  hardware => "x86_64",
  name => "CentOS",
  release => {
    full => "7.2.1511",
    major => "7",
    minor => "2"
  },
  selinux => {
    enabled => false
  }
}

fihuer avatar Aug 30 '17 10:08 fihuer

A service can only have one status, which is SKIPPED in this case. We cannot set both DEP_ERROR and SKIPPED as its status, so the DEP_ERROR information is lost at this step. The error is not propagated, and so the return code is 0...

The code change is not trivial.
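
To make the mechanism above concrete, here is a toy model in Python. It is not MilkCheck's actual code: the `Status` enum, `final_status` and `retcode_from` are hypothetical names, and the only values taken from this thread are the status labels and the return codes 0 and 6. It only illustrates how collapsing a service's outcome into a single status can drop a dependency failure.

```python
from enum import Enum


class Status(Enum):
    # Status labels as they appear in this thread; the enum itself is hypothetical.
    OK = "OK"
    ERROR = "ERROR"
    DEP_ERROR = "DEP_ERROR"
    SKIPPED = "SKIPPED"


RC_OK = 0
RC_ERROR = 6  # the return code this report expects on failure


def final_status(dep_failed: bool, runs_on_selected_nodes: bool) -> Status:
    """Collapse a service's outcome into one status (toy rule, assumed)."""
    if not runs_on_selected_nodes:
        # SKIPPED wins over DEP_ERROR, so the dependency failure is lost here.
        return Status.SKIPPED
    if dep_failed:
        return Status.DEP_ERROR
    return Status.OK


def retcode_from(status: Status) -> int:
    """Derive the exit code from the single remaining status."""
    return RC_ERROR if status in (Status.ERROR, Status.DEP_ERROR) else RC_OK


# The node2001 case: the requirement failed, but entry_point targets node2000 only.
entry_point = final_status(dep_failed=True, runs_on_selected_nodes=False)
print(entry_point.name, retcode_from(entry_point))
```

In this model the failure would only surface if the service carried the dependency result separately from its own status, which is consistent with the fix being non-trivial.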

degremont avatar Aug 30 '17 12:08 degremont

Reproduced without SKIPPED services:

$ cat conf/bad_retcode/example.yaml 
services:
    foo:
        desc: Simplest service foo
        actions:
            test:
                cmd: /bin/false
    bar:
        desc: Simplest service bar
        require_weak: [ foo ]
        actions:
            test:
                cmd: /bin/true
$ ./scripts/milkcheck -c conf/bad_retcode/ test
test foo ran in 0.01 s
 > localhost exited with 1
foo - Simplest service foo                                                                                 [  ERROR  ]
bar - Simplest service bar                                                                                 [    OK   ]           

$ echo $?
0

We should have 6 instead of 0 here...
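
As a rough sketch of the difference between the observed and the expected exit code (hypothetical helper names, not MilkCheck's API), compare a return code derived only from the last service with one derived from every service that ran. Whether a failed `require_weak` dependency should affect the exit code is a design decision; this only shows what would produce 6 for the output above.

```python
RC_OK = 0
RC_ERROR = 6  # value expected in this report

FAILED = {"ERROR", "DEP_ERROR"}


def retcode_top_only(statuses: dict, top: str) -> int:
    """Exit code based on the top-level service alone (matches the observed 0)."""
    return RC_ERROR if statuses[top] in FAILED else RC_OK


def retcode_any_failure(statuses: dict) -> int:
    """Exit code that reports an error if any service failed (gives the expected 6)."""
    return RC_ERROR if any(s in FAILED for s in statuses.values()) else RC_OK


# Statuses taken from the output above: foo failed, bar succeeded.
statuses = {"foo": "ERROR", "bar": "OK"}
print(retcode_top_only(statuses, "bar"), retcode_any_failure(statuses))
```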

cedeyn avatar Oct 05 '17 12:10 cedeyn

@cedeyn thanks for the report. I think the problem is different. This is not the same issue as above and will probably require a different fix.

Could you confirm the behavior if you add a target=localhost on both services?

degremont avatar Oct 05 '17 14:10 degremont

Here it is:

$ cat conf/bad_retcode/example.yaml
services:
    foo:
        desc: Simplest service foo
        target: localhost
        actions:
            test:
                cmd: /bin/false
    bar:
        desc: Simplest service bar
        target: localhost
        require_weak: [ foo ]
        actions:
            test:
                cmd: /bin/true

Same behaviour:

$ ./scripts/milkcheck -c conf/bad_retcode/ test
test foo ran in 0.96 s
 > localhost exited with 1
foo - Simplest service foo                                               [  ERROR  ]
bar - Simplest service bar                                               [    OK   ]           

$ echo $?
0

cedeyn avatar Oct 05 '17 16:10 cedeyn