Failed to extract files from a running fc27 VM using libguestfs
Note: To reproduce this issue in CI, the Jenkins slave must run fc27 with libvirt.x86_64 3.7.0-4.fc27.
It seems that the core of the issue is:
12:24:12 # libguestfs: command: run: \ /home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect/.lago/default/images/vm01_root.qcow2
12:24:12 # qemu-img: Could not open '/home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect/.lago/default/images/vm01_root.qcow2': Failed to get shared "write" lock
12:24:12 # Is another process using the image?
Full context:
23:54 ok 30 collect: setup
12:24:11 not ok 31 collect: does not fail if files to collect don't exist
12:24:11 # (from function `helpers.equals' in file helpers.bash, line 76,
12:24:11 # from function `helpers.run_ok' in file helpers.bash, line 62,
12:24:11 # in test file collect.bats, line 40)
12:24:11 # `helpers.run_ok "$LAGOCLI" --loglevel=debug collect --output "$outdir"' failed
12:24:11 # ~/tests/functional/fixtures/collect ~/tests/functional
12:24:11 # RUNNING:lago --loglevel=debug collect --output /home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect/output
12:24:11 # --output--
12:24:11 # Looking for a workdir
12:24:11 # Checking if /home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect is a workdir
12:24:11 # @ Collect artifacts:
12:24:11 # # numa
12:24:11 # : cpus_per_cell: 1, total_cells: 2
12:24:11 # # numa:
12:24:11 # <numa>
12:24:11 # <cell cpus="0" id="0" memory="32" unit="MiB"/>
12:24:11 # <cell cpus="1" id="1" memory="32" unit="MiB"/>
12:24:11 # </numa>
12:24:11 #
12:24:11 # # [Thread-1] vm01:
12:24:11 # * [Thread-1] :6ab5951b-e811-4882-bad6-64a2529badaa:Get ssh client for vm01::
12:24:11 # - Socket error connecting to vm01: [Errno None] Unable to connect to port 22 on 192.168.200.2
12:24:11 # - Still got 4 tries for vm01
12:24:11 # - Socket error connecting to vm01: [Errno None] Unable to connect to port 22 on 192.168.200.2
12:24:11 # - Still got 3 tries for vm01
12:24:11 # - Socket error connecting to vm01: [Errno None] Unable to connect to port 22 on 192.168.200.2
12:24:11 # - Still got 2 tries for vm01
12:24:11 # - Socket error connecting to vm01: [Errno None] Unable to connect to port 22 on 192.168.200.2
12:24:12 # - Still got 1 tries for vm01
12:24:12 # - Socket error connecting to vm01: [Errno None] Unable to connect to port 22 on 192.168.200.2
12:24:12 # - Still got 0 tries for vm01
12:24:12 # * [Thread-1] :6ab5951b-e811-4882-bad6-64a2529badaa:Get ssh client for vm01:: Success (in 0:00:16)
12:24:12 # * vm01: failed extracting files: Unable to extract paths from vm01: unreachable with SSH
12:24:12 # * vm01: attempting to extract files with libguestfs
12:24:12 # libguestfs: trace: set_verbose true
12:24:12 # libguestfs: trace: set_verbose = 0
12:24:12 # libguestfs: trace: set_tmpdir "/dev/shm"
12:24:12 # libguestfs: trace: set_tmpdir = 0
12:24:12 # libguestfs: trace: set_cachedir "/dev/shm"
12:24:12 # libguestfs: trace: set_cachedir = 0
12:24:12 # libguestfs: trace: set_append "edd=off"
12:24:12 # libguestfs: trace: set_append = 0
12:24:12 # libguestfs: create: flags = 0, handle = 0x7f2614006df0, program = python2
12:24:12 # libguestfs: trace: set_program "lago"
12:24:12 # libguestfs: trace: set_program = 0
12:24:12 # libguestfs: trace: add_drive_ro "/home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect/.lago/default/images/vm01_root.qcow2"
12:24:12 # libguestfs: trace: add_drive "/home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect/.lago/default/images/vm01_root.qcow2" "readonly:true"
12:24:12 # libguestfs: creating COW overlay to protect original drive content
12:24:12 # libguestfs: trace: disk_format "/home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect/.lago/default/images/vm01_root.qcow2"
12:24:12 # libguestfs: command: run: qemu-img
12:24:12 # libguestfs: command: run: \ info
12:24:12 # libguestfs: command: run: \ --output json
12:24:12 # libguestfs: command: run: \ /home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect/.lago/default/images/vm01_root.qcow2
12:24:12 # qemu-img: Could not open '/home/jenkins/workspace/lago_master_github_check-patch-fc27-x86_64/lago/tests/functional/fixtures/collect/.lago/default/images/vm01_root.qcow2': Failed to get shared "write" lock
12:24:12 # Is another process using the image?
12:24:12 # libguestfs: trace: disk_format = NULL (error)
12:24:12 # libguestfs: trace: add_drive = -1 (error)
12:24:12 # libguestfs: trace: add_drive_ro = -1 (error)
12:24:12 # # [Thread-1] vm01: ERROR (in 0:00:16)
12:24:12 # File "/usr/lib/python2.7/site-packages/lago/prefix.py", line 1518, in _collect_artifacts
12:24:12 # vm.collect_artifacts(path, ignore_nopath)
12:24:12 # File "/usr/lib/python2.7/site-packages/lago/plugins/vm.py", line 635, in collect_artifacts
12:24:12 # ignore_nopath=ignore_nopath
12:24:12 # File "/usr/lib/python2.7/site-packages/lago/plugins/vm.py", line 381, in extract_paths
12:24:12 # return self.provider.extract_paths(paths, *args, **kwargs)
12:24:12 # File "/usr/lib/python2.7/site-packages/lago/providers/libvirt/vm.py", line 347, in extract_paths
12:24:12 # self.extract_paths_dead(paths, ignore_nopath)
12:24:12 # File "/usr/lib/python2.7/site-packages/lago/providers/libvirt/vm.py", line 387, in extract_paths_dead
12:24:12 # ignore_nopath=ignore_nopath
12:24:12 # File "/usr/lib/python2.7/site-packages/lago/guestfs_tools.py", line 198, in extract_paths
12:24:12 # with guestfs_conn_mount_ro(disk_path, disk_root) as conn:
12:24:12 # File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
12:24:12 # return self.gen.next()
12:24:12 # File "/usr/lib/python2.7/site-packages/lago/guestfs_tools.py", line 103, in guestfs_conn_mount_ro
12:24:12 # with guestfs_conn_ro(disk_path) as conn:
12:24:12 # File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
12:24:12 # return self.gen.next()
12:24:12 # File "/usr/lib/python2.7/site-packages/lago/guestfs_tools.py", line 53, in guestfs_conn_ro
12:24:12 # conn.add_drive_ro(disk_path)
12:24:12 # File "/usr/lib64/python2.7/site-packages/guestfs.py", line 625, in add_drive_ro
12:24:12 # r = libguestfsmod.add_drive_ro(self._o, filename)
Your problem is 'Failed to get shared "write" lock' - a newly introduced qemu-kvm feature. rjones and the virt-v2v team know about it.
See https://www.redhat.com/archives/libvir-list/2017-November/msg00617.html
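To make the failure mode concrete, here is a minimal Python sketch of how a caller could recognise this specific error and retry `qemu-img info` with `-U` (`--force-share`), which asks qemu-img to open the image without taking the lock. This is only an illustration under assumptions: the helper names (`is_image_lock_error`, `build_qemu_img_info_cmd`, `qemu_img_info`) are hypothetical and are not part of lago or libguestfs, and this is not necessarily how the upstream fix works. Note also that reading an image while a running VM writes to it can return inconsistent data.

```python
import subprocess

# Error text taken verbatim from the qemu-img output in the log above.
LOCK_ERROR_MARKER = 'Failed to get shared "write" lock'


def is_image_lock_error(stderr_text):
    """Return True if qemu-img stderr shows the image-locking failure."""
    return LOCK_ERROR_MARKER in stderr_text


def build_qemu_img_info_cmd(disk_path, force_share=False):
    """Build the `qemu-img info` command line (hypothetical helper).

    With force_share=True, `-U` (--force-share) is added so qemu-img
    opens the image without acquiring the lock.
    """
    cmd = ['qemu-img', 'info', '--output', 'json']
    if force_share:
        cmd.append('-U')
    cmd.append(disk_path)
    return cmd


def qemu_img_info(disk_path):
    """Run `qemu-img info`; retry once with -U if the image is locked."""
    err = ''
    for force_share in (False, True):
        proc = subprocess.Popen(
            build_qemu_img_info_cmd(disk_path, force_share),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        out, err = proc.communicate()
        if proc.returncode == 0:
            return out
        if not is_image_lock_error(err):
            # Some other failure; retrying with -U would not help.
            break
    raise RuntimeError('qemu-img info failed: %s' % err.strip())
```

Calling `qemu_img_info(disk_path)` on the `vm01_root.qcow2` image from the log would first hit the lock error and then retry with `-U`, assuming the installed qemu-img is new enough to support that option.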
@gbenhaim so how badly is this breaking things? Do we need to make sure we run an older libvirt on all slaves? We merged a patch this morning that auto-updates libvirt on slaves to always be the latest one on the mirror snapshot attached to them...
A fix to libguestfs is currently under review: https://www.redhat.com/archives/libguestfs/2018-September/msg00070.html