File level restoration not working on LVM partition

Open andrew64k opened this issue 2 years ago • 18 comments

Are you using XOA or XO from the sources? BOTH

If XOA, which release channel? LATEST 5.86.1

If XO from the sources: COMMIT [bcc62]

Describe the bug: As discussed in forum post 7722, attempting a file-level restore from a delta backup of a VM with an EXT4 filesystem on a single LVM partition does not work. XO does not activate the backed-up LVM partition (from the VHD on FUSE) before trying to mount the EXT4 filesystem on that LVM partition. XO generates an error: xo-server: mount: /tmp/e01vgsdwkdt: unknown filesystem type 'LVM2_member'.

To Reproduce (steps to reproduce the behavior):

  1. Install Ubuntu 20.04 LTS with a default install that includes a single main LVM partition holding the root (/) EXT4 filesystem.
  2. Back up with XO/XOA using Delta Backup (for example, to S3)
  3. Use File Restore
  4. Select backup (only one)
  5. Select disk (only one)
  6. Select partition (last/largest one that is the main data partition)
  7. XO displays a red triangle and does NOT display any files
  8. Error in log (from XO source):
Sep 11 07:55:26 xo1 systemd[1]: Starting LVM event activation on device 7:1...
Sep 11 07:55:26 xo1 lvm[24697]:   pvscan[24697] /dev/loop1 excluded by filters: device is too small (pv_min_size).
Sep 11 07:55:26 xo1 systemd[1]: Started /sbin/lvm pvscan --cache 7:1.
Sep 11 07:55:26 xo1 lvm[24699]:   pvscan[24699] /dev/loop1 excluded by filters: device is too small (pv_min_size).
Sep 11 07:55:26 xo1 systemd[1]: lvm2-pvscan@7:1.service: Main process exited, code=killed, status=15/TERM
Sep 11 07:55:26 xo1 systemd[1]: lvm2-pvscan@7:1.service: Failed with result 'signal'.
Sep 11 07:55:26 xo1 systemd[1]: Stopped LVM event activation on device 7:1.
Sep 11 07:55:26 xo1 systemd[1]: run-r0f8c91fc21104cbdb590831d168f84f4.service: Succeeded.
Sep 11 07:55:26 xo1 systemd[1]: Starting LVM event activation on device 7:1...
Sep 11 07:55:26 xo1 lvm[24703]:   pvscan[24703] /dev/loop1 excluded by filters: device is too small (pv_min_size).
Sep 11 07:55:26 xo1 xo-server[492]: 2023-09-11T11:55:26.995Z xo:api WARN admin | backupNg.listFiles(...) [233ms] =!> Error: Command failed: mount --options=loop,ro,norecovery,sizelimit=20397948928,offset=1075838976 --source=/tmp/oa2kpvm9h88/vhd0 --target=/tmp/066w38w84glh
Sep 11 07:55:26 xo1 xo-server[492]: mount: /tmp/066w38w84glh: unknown filesystem type 'LVM2_member'.
Sep 11 07:55:27 xo1 systemd[1]: Started /sbin/lvm pvscan --cache 7:1.
Sep 11 07:55:27 xo1 lvm[24705]:   pvscan[24705] /dev/loop1 excluded by filters: device is too small (pv_min_size).
Sep 11 07:55:27 xo1 systemd[1]: lvm2-pvscan@7:1.service: Main process exited, code=killed, status=15/TERM
Sep 11 07:55:27 xo1 systemd[1]: lvm2-pvscan@7:1.service: Failed with result 'signal'.
Sep 11 07:55:27 xo1 systemd[1]: Stopped LVM event activation on device 7:1.
Sep 11 07:55:27 xo1 systemd[1]: run-r998a76a4dba2435595a3f8059ae11523.service: Succeeded.

Expected behavior: The VM's root (/) filesystem should be available and its file list should be shown.

Environment: XO/XOA: Linux Debian 11; Node: v18.17.1; XCP-ng: 8.2.1 (updated)

Additional context: Data from the delta backup IS valid and can be manually accessed on the XO server from the FUSE-mounted VHD.

andrew64k avatar Sep 11 '23 12:09 andrew64k

Hello,

I encountered the same issue in a new lab environment that I created yesterday. Here are the steps I followed:

  1. Installed xcp-ng.
  2. Installed Xen Orchestra (xo) on an Ubuntu Server 22.04 using the instructions from https://github.com/ronivay/XenOrchestraInstallerUpdater.git.
  3. Updated all packages to their latest versions.
  4. Created a new virtual machine (VM) using Ubuntu Server 22.04, with a 10GB disk, using all default settings (one partition, no encryption, separate boot partition).
  5. Ran a delta backup of the VM twice.
  6. Attempted to restore a file, and I encountered this issue.

The problem is that I can successfully retrieve the file list for the /boot partition during the restore process. However, when attempting to restore files from the / partition (LVM), I encounter a red triangle error, similar to the issues reported in the known bugs.

Please let me know if you require any additional information or if there are any steps I can take to further diagnose this issue.

vhsantos avatar Sep 12 '23 13:09 vhsantos

XO file restore implementation should follow this documentation: https://github.com/vatesfr/xen-orchestra/blob/master/packages/xo-server/docs/file-restoration.md

If you can try to do it manually and report any problems you have, that could speed up the investigation/resolution :slightly_smiling_face:

julien-f avatar Sep 12 '23 13:09 julien-f

Hello @julien-f,

I ran some tests and got these errors in the logs:

Sep 12 21:44:00 xo-build kernel: [  425.353335] loop5: detected capacity change from 0 to 16777216
Sep 12 21:44:00 xo-build kernel: [  425.532315] loop5: detected capacity change from 0 to 16777216
Sep 12 21:44:00 xo-build xo-server[680]: 2023-09-12T21:44:00.783Z xo:api WARN [email protected] | backupNg.listFiles(...) [365ms] =!> Error: Command failed: mount --options=loop,ro,norecovery,sizelimit=8589934592,offset=1075838976 --source=/tmp/5iq3wypmdz3/vhd0 --target=/tmp/64sz2gh5hsr
Sep 12 21:44:00 xo-build xo-server[680]: mount: /tmp/64sz2gh5hsr: unknown filesystem type 'LVM2_member'.

I initially suspected that the problem might be related to having the same Volume Group (VG) name for both Xen Orchestra (XO) and the VMs. To test this hypothesis, I took the following steps:

  1. Renamed the VG name on the Xen Orchestra side.
  2. Tested the issue again, but unfortunately, the problem persisted.

Determined to investigate further, I performed the following additional steps:

  1. Created a new VM.
  2. Set the VG name for this VM to "diff-vg".
  3. Conducted a delta backup on this new VM.
  4. Tested the file restore process once more.

Regrettably, I encountered the same error during this test too. This outcome suggests that having the same VG name on XO and the VMs creates a problem, but that it is not the only cause of this issue.

To provide more context, I'm including some output from my tests, which followed the guide you recommended.

I hope this information can help to find the root cause of the problem. If you have any further ideas or suggestions, please let me know.

First round of tests (the VG name on XO and the VM is the same).

# mount --options=loop,ro,norecovery,sizelimit=19569573888,offset=1904214016 --source=/tmp/jfh1dam14nd/vhd0 --target=/tmp/vhs/
mount: /tmp/vhs: unknown filesystem type 'LVM2_member'.

# vgs
  VG        #PV #LV #SN Attr   VSize  VFree
  ubuntu-vg   1   2   0 wz--n- 18.22g    0 

# vhdimount /tmp/zl4v0s4qh3a/vhd0 /tmp/vhd-mount
vhdimount 20210425

Unable to open source image
libvhdi_file_footer_read_data: unsupported signature.
libvhdi_file_footer_read_file_io_handle: unable to read file footer.
libvhdi_internal_file_open_read: unable to read file footer.
libvhdi_file_open_file_io_handle: unable to read from file IO handle.
libvhdi_file_open: unable to open file: /tmp/zl4v0s4qh3a/vhd0.
mount_handle_open: unable to open file.

# ln -s /tmp/ys35z8zwci/vhd0 /tmp/vhd-mount

# partx --bytes --output=NR,START,SIZE,NAME,UUID,TYPE --pairs /tmp/vhd-mount/vhd0 
NR="1" START="2048" SIZE="1048576" NAME="" UUID="ace9ab73-07c7-4e2c-9e7d-5383c9f7a056" TYPE="21686148-6449-6e6f-744e-656564454649"
NR="2" START="4096" SIZE="1902116864" NAME="" UUID="dca8f531-b4d6-4480-8d9f-520c6449b836" TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4"
NR="3" START="3719168" SIZE="19569573888" NAME="" UUID="864dd86a-7952-4b26-8fad-dc8c6ca5b88e" TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4"

root@xo-build:/tmp# echo $?
0

# export  START="3719168"
# export SIZE="19569573888"

# losetup -o $(($START * 512)) --sizelimit $(($SIZE)) --show -f /tmp/vhd-mount/vhd0 
/dev/loop3

# pvscan --cache /dev/loop3
  pvscan[21957] PV /dev/loop3 is duplicate for PVID qO9UeEXE7vehZIhwEyY8ov5nlcjsFxr6 on 7:3 and 202:3.
  pvscan[21957] PV /dev/loop3 failed to create online file.

# pvdisplay 
  WARNING: Not using device /dev/loop3 for PV qO9UeE-XE7v-ehZI-hwEy-Y8ov-5nlc-jsFxr6.
  WARNING: PV qO9UeE-XE7v-ehZI-hwEy-Y8ov-5nlc-jsFxr6 prefers device /dev/xvda3 because device is used by LV.
  --- Physical volume ---
  PV Name               /dev/xvda3
  VG Name               ubuntu-vg-vhs
  PV Size               <18.23 GiB / not usable 3.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              4665
  Free PE               0
  Allocated PE          4665
  PV UUID               qO9UeE-XE7v-ehZI-hwEy-Y8ov-5nlc-jsFxr6

Second round of tests (the VG names on XO and the VM are different).

root@xo-build:~# mkdir /tmp/vhd-mount

root@xo-build:~# partx --bytes --output=NR,START,SIZE,NAME,UUID,TYPE --pairs /tmp/4g44rhfteym/vhd0
NR="1" START="2048" SIZE="1048576" NAME="" UUID="9258eda0-449f-4dd7-ac2d-416dc83451db" TYPE="21686148-6449-6e6f-744e-656564454649"
NR="2" START="4096" SIZE="1073741824" NAME="" UUID="c11e2931-b154-471b-9be6-835f416ad589" TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4"
NR="3" START="2101248" SIZE="8589934592" NAME="" UUID="d0621000-2362-4d55-818d-fdf602f3b604" TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4"

root@xo-build:~# export START="2101248"
root@xo-build:~# export SIZE="8589934592"
root@xo-build:~# echo $?
0

root@xo-build:~# losetup -o $(($START * 512)) --sizelimit $(($SIZE)) --show -f /tmp/4g44rhfteym/vhd0
/dev/loop3

root@xo-build:~# pvscan --cache /dev/loop3
  pvscan[1201] PV /dev/loop3 online.

root@xo-build:~# pvs
  PV         VG            Fmt  Attr PSize  PFree
  /dev/loop3 diff-vg       lvm2 a--  <8.00g    0 
  /dev/xvda3 ubuntu-vg-vhs lvm2 a--  18.22g    0 

root@xo-build:~# pvs --noheading --nosuffix --nameprefixes --unbuffered --units b -o vg_name /dev/loop3
  LVM2_VG_NAME='diff-vg'

root@xo-build:~# vgchange -an  diff-vg
  0 logical volume(s) in volume group "diff-vg" now active

root@xo-build:~# lvs
  LV        VG            Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv-0      diff-vg       -wi------- <8.00g                                                    
  lv-0      ubuntu-vg-vhs -wi-ao----  1.22g                                                    
  ubuntu-lv ubuntu-vg-vhs -wi-ao---- 17.00g                                                    

root@xo-build:~# vgchange -ay  diff-vg
  1 logical volume(s) in volume group "diff-vg" now active

root@xo-build:~# lvs
  LV        VG            Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv-0      diff-vg       -wi-a----- <8.00g                                                    
  lv-0      ubuntu-vg-vhs -wi-ao----  1.22g                                                    
  ubuntu-lv ubuntu-vg-vhs -wi-ao---- 17.00g                                                    
root@xo-build:~# lvs  --noheading --nosuffix --nameprefixes --unbuffered --units b -o lv_name,lv_path
  LVM2_LV_NAME='ubuntu-lv' LVM2_LV_PATH='/dev/ubuntu-vg-vhs/ubuntu-lv'
  LVM2_LV_NAME='lv-0' LVM2_LV_PATH='/dev/ubuntu-vg-vhs/lv-0'
  LVM2_LV_NAME='lv-0' LVM2_LV_PATH='/dev/diff-vg/lv-0'

root@xo-build:~# mount --options=loop,ro,norecovery /dev/diff-vg/lv-0 /mnt/

root@xo-build:~# ls /mnt/
bin   dev  home  lib32  libx32      media  opt   root  sbin  srv       sys  usr
boot  etc  lib   lib64  lost+found  mnt    proc  run   snap  swap.img  tmp  var

root@xo-build:~# 
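
For reference, the sequence that worked in this second round can be summarized as a small dry-run helper. This is only a sketch: `print_lvm_restore_steps` is a hypothetical name, it prints the commands rather than running them, and it assumes `START` is expressed in 512-byte sectors as in the `partx` output above.

```shell
# Hypothetical dry-run helper: prints the manual restore commands, executes nothing.
# Arguments: image file, START (512-byte sectors), SIZE (bytes), VG, LV, mount point.
print_lvm_restore_steps() {
  image=$1 start=$2 size=$3 vg=$4 lv=$5 target=$6
  offset=$((start * 512))   # convert the partx START value from sectors to bytes
  # Expose the partition as a loop device and remember which one was allocated.
  printf 'LOOP=$(losetup -o %s --sizelimit %s --show -f %s)\n' "$offset" "$size" "$image"
  # Make LVM aware of the physical volume on that loop device.
  printf 'pvscan --cache "$LOOP"\n'
  # Activate the volume group, then mount the logical volume read-only.
  printf 'vgchange -ay %s\n' "$vg"
  printf 'mount --options=loop,ro,norecovery /dev/%s/%s %s\n' "$vg" "$lv" "$target"
}

# Values taken from the second round of tests above.
print_lvm_restore_steps /tmp/4g44rhfteym/vhd0 2101248 8589934592 diff-vg lv-0 /mnt/
```

The computed offset (1075838976) and size limit (8589934592) are the same values that appear in the failing `mount` command in the log, so XO is pointing at the right partition; it simply tries to mount it as a filesystem instead of activating it as an LVM physical volume first.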

vhsantos avatar Sep 12 '23 22:09 vhsantos

I have basically the same results. I can access the data manually. As XO on Debian 11 is not using LVM there are no name conflicts.

Expose VHD disk as block device

I use XO File Restore, select the VM, select the drive.
This gives me access to the VHD backup block file from the S3 storage.

# ls -l /tmp/khil8igii8o/
-rw-r--r-- 1 root root 21474836480 Sep 12 19:44 vhd0

List available partitions

# partx --bytes --output=NR,START,SIZE,NAME,UUID,TYPE --pairs /tmp/khil8igii8o/vhd0
NR="1" START="2048" SIZE="1048576" NAME="" UUID="cdeca8d9-23e1-4fd0-83a7-7fe395690c36" TYPE="21686148-6449-6e6f-744e-656564454649"
NR="2" START="4096" SIZE="1073741824" NAME="" UUID="d2b5aca4-9ab0-4a25-8946-a65914abf6c4" TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4"
NR="3" START="2101248" SIZE="20397948928" NAME="" UUID="d3ebe1b8-7528-4077-9128-ade9bcf0f2c8" TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4"

Mount LVM physical volume

Using partition 3 with the START and SIZE values from above (START=2101248, SIZE=20397948928):

# losetup -o $(($START * 512)) --sizelimit $(($SIZE)) --show -f /tmp/khil8igii8o/vhd0
/dev/loop0
# pvscan --cache /dev/loop0
  pvscan[766] PV /dev/loop0 online.
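
As a minimal sketch of where `START` and `SIZE` come from, the snippet below pulls them out of the partition 3 line of the `partx --pairs` output and converts the start sector to a byte offset. `partx_field` is a hypothetical helper name, and the 512-byte sector size is an assumption that matches the `$START * 512` arithmetic used in the commands above.

```shell
# Hypothetical helper: extract one KEY="value" field from a line of `partx --pairs` output.
# $1 = field name (e.g. START), $2 = the line to parse.
partx_field() {
  printf '%s\n' "$2" | sed -n "s/.*$1=\"\([^\"]*\)\".*/\1/p"
}

# Partition 3 line copied from the partx output above.
line='NR="3" START="2101248" SIZE="20397948928" NAME="" UUID="d3ebe1b8-7528-4077-9128-ade9bcf0f2c8" TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4"'

START=$(partx_field START "$line")   # start of the partition, in sectors
SIZE=$(partx_field SIZE "$line")     # size in bytes (because of --bytes)
OFFSET=$((START * 512))              # byte offset, assuming 512-byte sectors

echo "offset=$OFFSET sizelimit=$SIZE"
# prints: offset=1075838976 sizelimit=20397948928
```

These are exactly the `offset` and `sizelimit` values in the failing `mount` command from the original log.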

List available LVM logical volumes

# pvs --noheading --nosuffix --nameprefixes --unbuffered --units b -o lv_name,lv_path,lv_size,vg_name /dev/loop0
  LVM2_LV_NAME='ubuntu-lv' LVM2_LV_PATH='/dev/ubuntu-vg/ubuntu-lv' LVM2_LV_SIZE='20396900352' LVM2_VG_NAME='ubuntu-vg'
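
The `--nameprefixes` output is easy to consume from a script, which is presumably why the documented procedure uses it. A small sketch of extracting the VG name and LV path from the line above (`lvm_field` is a hypothetical helper name, not something XO provides):

```shell
# Hypothetical helper: extract one LVM2_<FIELD>='value' pair from `--nameprefixes` output.
# $1 = field name without the LVM2_ prefix (e.g. VG_NAME), $2 = the line to parse.
lvm_field() {
  printf '%s\n' "$2" | sed -n "s/.*LVM2_$1='\([^']*\)'.*/\1/p"
}

# Line copied from the pvs output above.
out="  LVM2_LV_NAME='ubuntu-lv' LVM2_LV_PATH='/dev/ubuntu-vg/ubuntu-lv' LVM2_LV_SIZE='20396900352' LVM2_VG_NAME='ubuntu-vg'"

VG=$(lvm_field VG_NAME "$out")       # volume group to activate with vgchange -ay
LV_PATH=$(lvm_field LV_PATH "$out")  # device node to mount once the VG is active

echo "vg=$VG lv_path=$LV_PATH"
# prints: vg=ubuntu-vg lv_path=/dev/ubuntu-vg/ubuntu-lv
```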

Mount LVM logical volume

# vgchange -ay ubuntu-vg
  1 logical volume(s) in volume group "ubuntu-vg" now active
# lvs  --noheading --nosuffix --nameprefixes --unbuffered --units b -o lv_name,lv_path
  LVM2_LV_NAME='ubuntu-lv' LVM2_LV_PATH='/dev/ubuntu-vg/ubuntu-lv'

Mounting the partition directly FAILS because it holds an LVM physical volume, not a filesystem:

# mount --options=loop,ro,norecovery,offset=$(($START * 512)),sizelimit=$(($SIZE)) --source=/tmp/2p1qr2spzgo/vhd0 --target=/tmp/block-mount
mount: /tmp/block-mount: unknown filesystem type 'LVM2_member'.

Mount FS from LVM works:

# mount /dev/ubuntu-vg/ubuntu-lv --options=loop,ro,norecovery  --target=/tmp/block-mount
# ls /tmp/block-mount/
bin    dev   lib    libx32      mnt   root  snap      sys  var
boot   etc   lib32  lost+found  opt   run   srv       tmp
cdrom  home  lib64  media       proc  sbin  swap.img  usr

Done...
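
To make the expected behavior concrete, here is a rough sketch of the decision that seems to be missing: check the detected type of the partition and take the LVM path when it is `LVM2_member`. `restore_path_for` is a hypothetical name and this is not XO's actual code; the `blkid` invocation in the comment is an assumption about how the type could be probed at an offset inside the image file.

```shell
# Hypothetical decision helper: choose a restore path from a detected filesystem type.
restore_path_for() {
  case "$1" in
    LVM2_member) echo lvm ;;      # LVM physical volume: losetup + pvscan + vgchange first
    '')          echo unknown ;;  # nothing detected at that offset
    *)           echo direct ;;   # ordinary filesystem: loop-mount it directly
  esac
}

# Assumed way to detect the type without mounting (util-linux blkid low-level probe):
#   fstype=$(blkid -p -O "$OFFSET" -S "$SIZE" -o value -s TYPE /tmp/khil8igii8o/vhd0)

restore_path_for LVM2_member   # prints: lvm
restore_path_for ext4          # prints: direct
```

With the partitions from this backup, partition 2 (/boot, ext4) would take the direct path, which already works, and partition 3 would take the LVM path shown in the manual steps above.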

andrew64k avatar Sep 13 '23 00:09 andrew64k

Hello there,

Could you identify/reproduce the issue on your side? Is there anything else we can do to help with this?

vhsantos avatar Sep 20 '23 11:09 vhsantos

Investigation is in progress by @julien-f

olivierlambert avatar Sep 20 '23 11:09 olivierlambert

@andrew64k "As XO on Debian 11 is not using LVM there are no name conflicts."

Does this mean that a mitigation to the issue is to rebuild xo from source on ext4 without LVM?

kaywoz avatar Nov 29 '23 20:11 kaywoz

Not using LVM for XO avoids a possible name conflict, but it does not resolve the restore failure.

andrew64k avatar Nov 29 '23 20:11 andrew64k

That is sad, because it is a basic feature of the system and it is still not working! :-(

vhsantos avatar Nov 29 '23 21:11 vhsantos

So it does work with XOA or XO from the sources on a non-LVM system, can you confirm?

olivierlambert avatar Nov 30 '23 07:11 olivierlambert

I'll try to verify later tonight on the current XOA build and source commit, swamped with work atm.

kaywoz avatar Nov 30 '23 08:11 kaywoz

> I'll try to verify later tonight on the current XOA build and source commit, swamped with work atm.

Yes, the issue is present for me on xoa 5.89.0 and xo from source 2dcb5. The issue is not present when restoring a non-LVM system.

Ubuntu 23.04 with only an ext4 disk works; Ubuntu 23.04 with the default LVM + ext4 layout does not. The error message is always a variant of: "message": "Command failed: mount --options=loop,ro,norecovery,sizelimit=19569573888,offset=1904214016 --source=/tmp/gu8sy5ycepr/vhd0 --target=/tmp/q1wyyfrky5b mount: /tmp/q1wyyfrky5b: unknown filesystem type 'LVM2_member'."

I will change my templates to not use LVM, as I generally don't use it, but it would be nice if the root cause were fixed on the XO side.

PS: I can provide the VM images used if you need them, but they are really default Ubuntu machines except for the LVM part.

kaywoz avatar Nov 30 '23 20:11 kaywoz

Hey guys,

Do you have any update on this bug, which affects an important module?

Thanks

vhsantos avatar Mar 18 '24 12:03 vhsantos

Can you try again with a freshly deployed XOA?

olivierlambert avatar Mar 20 '24 07:03 olivierlambert

> Can you try again with a freshly deployed XOA?

I just tried on the latest build using Rocky Linux 9 and it appears to still be failing with the following error.

Mar 20 11:59:11 xo xo-server[40470]: 2024-03-20T15:59:11.167Z xo:api WARN username| backupNg.listFiles(...) [224ms] =!> Error: Command failed: mount --options=loop,ro,norecovery,sizelimit=36949721088,offset=1703936000 --source=/tmp/6866t7miyam/vhd0 --target=/tmp/3xpl4dzr0h9
Mar 20 11:59:11 xo xo-server[40470]: mount: /tmp/3xpl4dzr0h9: unknown filesystem type 'LVM2_member'.

delgado23 avatar Mar 20 '24 16:03 delgado23

That's not what I asked 🙂

olivierlambert avatar Mar 21 '24 06:03 olivierlambert

Sorry about that! I misunderstood the question.

delgado23 avatar Mar 21 '24 11:03 delgado23

(currently investigated by @fbeauchamp on the forum)

olivierlambert avatar Mar 30 '24 22:03 olivierlambert