Cannot create a coherent snapshot on KVM via quiescing when Zenarmor is installed

Open deajan opened this issue 1 year ago • 1 comments

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

  • [X] I have read the contributing guidelines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md
  • [X] I am convinced that my issue is new after having checked both open and closed issues at https://github.com/opnsense/core/issues?q=is%3Aissue

Describe the bug

Running OPNsense on KVM, I cannot create a quiesced snapshot via libvirt:

virsh snapshot-create opnsense.local --disk-only --atomic --quiesce

fails with the following error:

error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': failed to freeze /usr/local/zenarmor/output/active/temp: Resource deadlock avoided

The snapshot succeeds when --quiesce is removed from the virsh command.
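
As a stopgap, a wrapper could attempt the quiesced snapshot first and fall back to a plain disk-only snapshot when the guest agent refuses to freeze. The sketch below only illustrates that idea: the function name snapshot_with_fallback and the VIRSH override variable are hypothetical, and a fallback snapshot is only crash-consistent, not application-aware.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of OPNsense or libvirt.
# VIRSH may be overridden (e.g. for a dry run); it defaults to the real virsh.
VIRSH="${VIRSH:-virsh}"

# Try a quiesced snapshot first; if it fails, retry without --quiesce.
# Prints which kind of snapshot was taken; virsh's own output goes to stderr.
snapshot_with_fallback() {
    domain="$1"
    if "$VIRSH" snapshot-create "$domain" --disk-only --atomic --quiesce >&2; then
        echo "quiesced"
    elif "$VIRSH" snapshot-create "$domain" --disk-only --atomic >&2; then
        echo "crash-consistent"
    else
        echo "failed"
        return 1
    fi
}

# Example (domain name taken from the command above):
# snapshot_with_fallback opnsense.local
```

Because --atomic is used, a failed first attempt should leave no partial snapshot behind before the retry.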

To Reproduce

Steps to reproduce the behavior:

  1. Install OPNsense with Zenarmor
  2. Create a quiesced snapshot via libvirt/KVM

Expected behavior

Quiescing should run all necessary pre-snapshot freeze and post-snapshot thaw scripts.

Additional context

I looked for freeze/thaw scripts in OPNsense that would let me either exclude RAM disks (in our case /usr/local/zenarmor/output/active/temp) from quiescing, or suspend the Zenarmor service until the snapshot is done.

I couldn't find anything relevant in the OS.

Running qemu-ga on OPNsense suggests that it reads the freeze/thaw hook script at /usr/local/bin/../etc/qemu/fsfreeze-hook, if present.
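
A quick way to confirm whether a hook is actually in place is to test that path directly. The helper below is a minimal sketch; check_hook and the HOOK variable are names made up for this example, and the default path is simply the one qemu-ga reported above with the /bin/.. component resolved.

```shell
#!/bin/sh
# Default hook location as reported by qemu-ga in this report
# (/usr/local/bin/../etc/qemu/fsfreeze-hook resolves to this path).
HOOK="${HOOK:-/usr/local/etc/qemu/fsfreeze-hook}"

# Print "missing", "not-executable" or "ok" for the given hook path.
check_hook() {
    hook="$1"
    if [ ! -e "$hook" ]; then
        echo "missing"
    elif [ ! -x "$hook" ]; then
        echo "not-executable"
    else
        echo "ok"
    fi
}

# Example:
# check_hook "$HOOK"
```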

I've made the following changes in OPNsense:

In /etc/rc.conf.d/qemu_guest_agent:

- qemu_guest_agent_flags="-d -l /var/log/qemu-ga.log"
+ qemu_guest_agent_flags="-d -l /var/log/qemu-ga.log -F/usr/local/etc/qemu/fsfreeze-hook"

Then I created the following script in /usr/local/etc/qemu/fsfreeze-hook and made it executable:

#!/bin/sh

LOG_FILE=/var/log/qemu-ga.log

# Static device name found in /usr/local/etc/rc.d/eastpect
ZENARMOR_RAMDISK="/dev/md43"
ZENARMOR_RAMDISK_MOUNTPOINT="/usr/local/zenarmor/output/active/temp"

log () {
        echo "$1" >> "${LOG_FILE}";
}

case "$1" in
        "freeze")
                log "Launching freeze operations"
                if [ -d "${ZENARMOR_RAMDISK_MOUNTPOINT}" ]; then
                        log "Zenarmor installed, stopping engine"
                        zenarmorctl engine stop >> "${LOG_FILE}" 2>&1
                        umount "${ZENARMOR_RAMDISK_MOUNTPOINT}" >> "${LOG_FILE}" 2>&1
                        sleep 1
                fi
                # Return 0 regardless of the outcome, since stopping an already-stopped engine might return a non-zero exit code
                log "Freeze operation done"
                exit 0
                ;;
        "thaw")
                log "Launching thaw operations"
                if [ -d "${ZENARMOR_RAMDISK_MOUNTPOINT}" ]; then
                        log "Zenarmor installed, starting engine"
                        mount "${ZENARMOR_RAMDISK}" "${ZENARMOR_RAMDISK_MOUNTPOINT}" >> "${LOG_FILE}" 2>&1
                        zenarmorctl engine start >> "${LOG_FILE}" 2>&1
                        sleep 1
                fi
                log "Thaw operation done"
                exit 0
                ;;
        *)
                log "Unknown or missing argument. Nothing will happen. Valid arguments are 'freeze' and 'thaw'"
                exit 1
                ;;
esac
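
One possible refinement of the hook is to check whether the RAM disk is actually mounted before calling umount or mount, so that an already-stopped engine does not add spurious errors to the log. The helper below is a hypothetical sketch: is_mounted_in is not an existing command, it assumes mount(8) prints lines of the form "device on mountpoint ...", and it does not handle mount points containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the mount point given as $1 appears in
# mount-table text read from stdin (lines shaped like "<device> on <mountpoint> ...").
is_mounted_in() {
    awk -v mp="$1" '$2 == "on" && $3 == mp { found = 1 } END { exit !found }'
}

# Possible use inside the freeze branch of the hook:
# if mount | is_mounted_in "${ZENARMOR_RAMDISK_MOUNTPOINT}"; then
#     umount "${ZENARMOR_RAMDISK_MOUNTPOINT}" >> "${LOG_FILE}" 2>&1
# fi
```

The same check, negated, could guard the mount call in the thaw branch.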

So far so good: I can now use --quiesce to make my snapshots application-aware. I am more than willing to make a PR for this issue if @fichtner or @AdSchellevis could have a quick look to make sure I didn't make any mistakes, especially since I don't know whether /etc/rc.conf.d/qemu_guest_agent is regenerated on boot.

Also, should I open this PR against the qemu_guest_agent plugin instead of core?

Thanks ;)

Relevant forum entry: https://forum.opnsense.org/index.php?topic=38943.0

Environment

I've tried this with all OPNsense versions from 22.7 up to the recent 24.7_5, on multiple hosts, all running KVM.

deajan, Jul 29 '24 10:07

I discovered an issue with qemu-ga (see https://github.com/opnsense/plugins/issues/4148), so this is on hold for now.

deajan, Aug 03 '24 10:08

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.

If someone wants to step up and work on this issue, just let us know, so we can reopen the issue and assign an owner to it.

OPNsense-bot, Jan 25 '25 10:01