community.libvirt
inventory plugin: add a way to skip dormant domains
SUMMARY
Currently I have some libvirt domains that are meant to be shut down most of the time, which means that running something like ansible -m ping all
will always return errors like the following:
libvirt: Domain Config error : Requested operation is not valid: domain is not running
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: libvirt.libvirtError: Requested operation is not valid: domain is not running
nafta.hq.akdev.xyz | FAILED! => {
    "msg": "Unexpected failure during module execution.",
    "stdout": ""
}
This clutters the output, and it probably makes playbooks take slightly longer, as Ansible tries and fails each task for domains that aren't running.
My proposal would be to add a new option in the config file:
plugin: community.libvirt.libvirt
uri: qemu:///system
ignore_off: True
With such a configuration the plugin would return only the domains that are currently active (it could warn that other domains are being ignored, for awareness).
ISSUE TYPE
- Feature Idea
COMPONENT NAME
community.libvirt.libvirt inventory plugin
ADDITIONAL INFORMATION
If the feature is implemented then users would be able to run playbooks without getting errors whenever they have one or more VMs that are shut down.
$ ansible -m ping all
libvirt: Domain Config error : Requested operation is not valid: domain is not running
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: libvirt.libvirtError: Requested operation is not valid: domain is not running
nafta.hq.akdev.xyz | FAILED! => {
    "msg": "Unexpected failure during module execution.",
    "stdout": ""
}
I am willing to put some time towards making this happen if no one is against it. Alternatively, if there's a way of achieving what I want that I'm unaware of, please let me know.
Hi @akdev1l, thanks for suggesting this idea. It's an interesting one which I'll need to think about and discuss some more, I think.

My first instinct is to suggest that the behaviour of the ping module when using the dynamic inventory should be the same as with other inventories. In my experience, it's normal for ping to fail against a host if it is not contactable. For example, if this was a dynamic inventory from OpenStack or something else which returned a list of hosts, and some of those were off, it would try and then also fail, as you're seeing. The failure also means that those hosts are then taken out of the in-memory inventory and excluded from further tasks.
In terms of whether there's another way to work around it, I guess it's probably worth checking what the purpose of your ping is? Is it to see whether the hosts are fully up and running, or something else?
Depending on what you're trying to do, you could also add ignore_errors: true to the ping task.
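For instance, a minimal sketch of that approach (the task name is just illustrative):

- name: Ping hosts, tolerating powered-off guests
  ansible.builtin.ping:
  ignore_errors: true

The failures are still printed, but they're marked as ignored and the play carries on past them.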
I think, if it were me I would probably have an initial task which got a list of all running VMs and then made sure that tasks were only run when that guest was in running state.
As a rough idea, something like this:
- name: Get list of running VMs
  virt:
    command: list_vms
    state: running
  register: result_running_vms
  run_once: true
  delegate_to: localhost
Then use this for other tasks, e.g.:
- name: Ping hosts
  ping:
  when: inventory_hostname in result_running_vms.list_vms
If that doesn't help, are you able to explain a bit more about what you're trying to achieve? Then we might be able to see if there's a workable solution for you.
Cheers, -c
Hi, thank you for the detailed reply!
I have thought of fetching that list - but that makes me want to not use the inventory plugin (as I could then write my own plugin and fetch the list that way - though I'd lose the guest-exec goodness). I will consider it.
The ping is just an example of a task - any task. My use case may be a bit convoluted but I will try to explain:
I keep a couple of VMs that claim the same PCI device - this means these VMs cannot be on at the same time, and that is expected (think of these as virtualized desktop environments). I have other VMs that are just servers with no external devices.
I would like to have one Ansible inventory that encompasses all my virtual infrastructure - but because those special VMs may be off at any time (which is expected), whenever I run a baseline playbook on all my VMs I get many errors.
Currently I am importing 3 different playbooks, so I guess that's probably why I get more than one error at the beginning.
My problems are really just two:
- The failed connections add latency to each new playbook that I add to my master playbook
- The error report is less useful because I will always have failed hosts
I was thinking about it and maybe my original proposal is a bit narrow - maybe it would be more palatable to add an exclusion list like so?
plugin: community.libvirt.libvirt
uri: qemu:///system
ignore_list:
  - desktopvm-*
Then I could follow a naming convention for my VMs and exclude them appropriately. This could also enable other use cases, such as having two separate Ansible inventories running on the same hypervisor.
Let me know if these ramblings did not make much sense - happy to clarify!
Hi @akdev1l,
I think that the extra task to check for running VMs should still do what you want - you should be able to use the dynamic inventory and then check which of them are running.
Also, if you want a way to exclude certain hosts, then maybe consider using the standard --limit function of Ansible to do that. Note that --limit takes a comma-separated list which can be either hostnames (e.g. --limit server1,server2), groups (e.g. --limit group1,group2), or a combination of both (e.g. --limit server1,group2), and it can also negate hosts (e.g. --limit 'all,!server1').
For example, on my machine with two Fedora VMs, one powered on (fedora-35-on) and one powered off (fedora-35-off):
$ ansible -i inventory.yml -m ping all
libvirt: Domain Config error : Requested operation is not valid: domain is not running
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: libvirt.libvirtError: Requested operation is not valid: domain is not running
fedora-35-off | FAILED! => {
    "msg": "Unexpected failure during module execution.",
    "stdout": ""
}
fedora-35-on | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3.10"
    },
    "changed": false,
    "ping": "pong"
}
Then, with the limit:
$ ansible -i inventory.yml -m ping all --limit fedora-35-on
fedora-35-on | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3.10"
    },
    "changed": false,
    "ping": "pong"
}
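For completeness, the negated form should achieve the same result here (command shown for illustration against the same inventory as above):

$ ansible -i inventory.yml -m ping all --limit 'all,!fedora-35-off'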
Hope that helps!
-c
@csmart mm I think limit kinda works - but it means I have to remember to do that everywhere (in all my playbooks, vs just limiting the scope of the inventory).
I may implement a proof of concept to see how this would work, and post updates here tomorrow.
Hi @akdev1l, just in case it's not clear, using --limit does limit the scope of the inventory - you don't need to change any of your playbooks. However, yes, you do need to know what to limit and remember to do so.
The other method I mentioned, where you look at the running VMs and then have a check, does require you to modify your playbooks to add the when condition. However, instead of adding that check to each task, you could modify the in-memory inventory and add the running VMs from the list into a new host group, then run your roles/plays/tasks based on your new group. This would just require one additional task (to create the group) and you wouldn't need to modify every task.
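As a rough, untested sketch of that idea (the group name running_vms is just an example):

- name: Get list of running VMs
  community.libvirt.virt:
    command: list_vms
    state: running
  register: result_running_vms
  run_once: true
  delegate_to: localhost

- name: Add running VMs to an in-memory group
  ansible.builtin.add_host:
    name: "{{ item }}"
    groups: running_vms
  loop: "{{ result_running_vms.list_vms }}"
  run_once: true

# Subsequent plays can then target the new group instead of 'all':
- hosts: running_vms
  tasks:
    - name: Ping hosts
      ansible.builtin.ping:

Since add_host persists for the rest of the run, later plays and imported playbooks would also see the running_vms group.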
-c
Hi - yes, sorry, I meant that I would have to change my playbooks from hosts: all to hosts: some-pattern (or, well, I can --limit at the top, but then I have to remember every time I read a playbook).
Err - maybe I'm giving the wrong impression here; my use is a bit more complex than one playbook or one Ansible call.
I actually have multiple projects and multiple playbooks - everything is meant to run on the same libvirt cluster (you can see some of my playbooks here).
So for example I have one project to baseline all my virtual servers - conceptually I would like my baseline.yml playbook to apply to hosts: all, as it should apply to all my virtual servers.
But on the same cluster I have another project, plex-server, and this one will connect to the Plex servers and configure Plex Media Server. Conceptually, in this project I would like my inventory to only contain Plex servers, and the playbooks would have hosts: all.
That said, I do think it is entirely solvable with --limit, though it feels like the functionality should be in the inventory...
For reference, the functionality I want is essentially provided by the VMware Ansible inventory plugin (see here).
Specifically
# Filter VMs based upon condition
plugin: community.vmware.vmware_vm_inventory
strict: False
hostname: 10.65.223.31
username: administrator@vsphere.local
password: Esxi@123$%
filters:
  - runtime.powerState == "poweredOn"
See the filters field.
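A hypothetical equivalent for the libvirt plugin might look like this (note: no such filters option exists in community.libvirt today; this is only a sketch of the requested feature):

plugin: community.libvirt.libvirt
uri: qemu:///system
filters:
  - state == "running"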
I agree it would be useful; however, it will take some time to implement, I expect. I'll have to find some time to dig into the code and see what might be the best way to do this.
Cheers! -c
@csmart I can help you with some dev time - whatever hack I come up with may not be proper but I can improve it with your help
Can't make promises on the timeline, but I'll try to work on this :-) (I created this issue more as a consultation on whether the feature was wanted than as a feature request.)
Thanks a lot for your help and quick replies btw, appreciated!
@akdev1l just letting you know that there's a PR in progress which includes a filter by name. We could perhaps look at expanding that to include states.