community.general

Redfish: Check Service Availability

Open mraineri opened this issue 5 months ago • 5 comments

Summary

I would like to add a command that lets a user check whether a service is available. This would be useful when a user performs an operation that makes the service unavailable for a short period of time (like resetting a BMC).

Issue Type

Feature Idea

Component Name

redfish_info, redfish_command

Additional Information

I'm proposing two things:

First, add a new command to redfish_info that checks whether a service is available. It would perform a GET on the service root: a 200 OK response marks the service as available, while a timeout or any other HTTP response marks it as unavailable.

Sample playbook and response (notice there's no username and password since the service root is available without authentication):

---
- hosts: all
  gather_facts: false
  vars:
    baseuri: BMC_IP
  tasks:
  - name: Get if the service is available
    community.general.redfish_info:
      category: Service
      command: CheckAvailability
      baseuri: "{{ baseuri }}"
    register: redfish_results
  - debug:
      var: redfish_results

TASK [debug] *******************************************************************************************************************************************************
ok: [localhost] => {
    "redfish_results": {
        "ansible_facts": {
            "discovered_interpreter_python": "/usr/bin/python3"
        },
        "changed": false,
        "failed": false,
        "redfish_facts": {
            "service": {
                "ret": true,
                "available": true
            }
        }
    }
}
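
To make the proposed check concrete, here is a minimal Python sketch of the logic described above. It is not the module's actual implementation: the function name `check_availability`, the `opener` parameter, and the use of `urllib` are assumptions for illustration, and a real module would presumably reuse the collection's existing HTTP helpers and certificate handling. The `/redfish/v1/` path is the standard Redfish service root.

```python
import urllib.request


def check_availability(baseuri, timeout=5, opener=urllib.request.urlopen):
    """Hypothetical helper: GET the service root and report availability.

    `opener` is injectable only so the sketch can be exercised without a BMC.
    """
    url = "https://%s/redfish/v1/" % baseuri
    try:
        with opener(url, timeout=timeout) as resp:
            # Only a 200 OK counts as available.
            available = resp.status == 200
    except OSError:
        # Timeouts, refused connections, and urllib's HTTP errors
        # (URLError and HTTPError subclass OSError) all mean "not available".
        available = False
    # Mirrors the shape of the sample response shown above.
    return {"ret": True, "available": available}
```

No credentials are passed because, as noted above, the service root is readable without authentication. BMCs with self-signed certificates would additionally need certificate validation to be configured, which this sketch leaves out.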

The second proposal is to embed this "service available" check as an option on the various power/reset commands for the Manager category in redfish_command. Two new options would control it: wait, a boolean that enables the extra check, and wait_timeout, which dictates how long to monitor for the service to become available before returning. For example:

---
- hosts: all
  gather_facts: false
  vars:
    username: root
    password: root
    baseuri: BMC_IP
    default_uri_timeout: 5
    default_uri_retries: 5
  tasks:
  - name: Reset the manager
    community.general.redfish_command:
      category: Manager
      command: PowerForceRestart
      baseuri: "{{ baseuri }}"
      username: "{{ username }}"
      password: "{{ password }}"
      wait: true
      wait_timeout: 120
    retries: "{{ default_uri_retries }}"
    register: redfish_results
  - debug:
      var: redfish_results
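
The behavior behind wait and wait_timeout could be sketched in Python roughly as follows. This is only an illustration under assumptions: the name `wait_for_service`, the `interval` parameter, and the injectable `clock` and `sleep` arguments are hypothetical, and wait_timeout is assumed to be in seconds, which the proposal does not state.

```python
import time


def wait_for_service(check, wait_timeout=120, interval=5,
                     clock=time.monotonic, sleep=time.sleep):
    """Hypothetical helper: poll check() until it succeeds or time runs out.

    `check` is any zero-argument callable returning True once the service
    root answers again. Returns True if the service came back in time.
    """
    deadline = clock() + wait_timeout  # assumed unit: seconds
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With wait left at false, the command would return as soon as the reset request is accepted, as it does today. One design question the sketch does not settle: a BMC may keep answering for a moment after accepting the reset, so an implementation might need a short delay before the first poll to avoid reporting success too early.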

Code of Conduct

  • [X] I agree to follow the Ansible Code of Conduct

mraineri, Mar 01 '24 13:03