lxd Allow instance import from QCoW2 and VMDK format

This is first part of the Improve VM import from external sources.

It introduces the new migration source type conversion, which ensures the received image is converted using qemu-img into the raw image format.

All virt-v2v bits are stripped away and will be introduced in the second part where we will also introduce conversion option virtio which will run virt-v2v-in-place on converted image to ensure virtio drivers are present.

TODO:

[x] qemu-img info is not provided with image format explicitly (Comment on spec)
[x] conversionOpts: --conversion=format (list of conversion options)
[x] Add tests for conversion (https://github.com/canonical/lxd-ci/pull/179)

Apr 25 '24 11:04 MusicDin

Heads up @ru-fu - the "Documentation" label was applied to this issue.

Apr 25 '24 11:04 github-actions[bot]

For tests we could use VMDK images provided on ubuntu cloud (https://cloud-images.ubuntu.com/releases/22.04/release-20240514). They are a bit bulky (~600MiB), but offer a reliable way to get images without the need to prepare them during the tests or store pre-build image within a repository.

May 22 '24 09:05 MusicDin

For tests we could use VMDK images provided on ubuntu cloud (https://cloud-images.ubuntu.com/releases/22.04/release-20240514). They are a bit bulky (~600MiB), but offer a reliable way to get images without the need to prepare them during the tests or store pre-build image within a repository.

Sounds good, thanks.

May 22 '24 09:05 tomponline

Are there ubuntu-minimal variants we can use that are smaller?

May 22 '24 09:05 tomponline

I couldn't find one, but it would very practical to use minimal images.

May 22 '24 11:05 MusicDin

Its somewhat concerning me that this implementation for image conversion has been implemented inside the migration functions.

Per suggestion, I've separated the logic out into the CreateInstanceFromConversion which works both straight "raw" import (--conversion=) and when formatting is needed (--conversion=format).

I think this will reduce the complexity of migration logic, especially when considering that conversion logic will be further expanded soon (virt-v2v).

Jun 05 '24 19:06 MusicDin

Converted back to draft based on our chat today. Thanks

Jun 10 '24 15:06 tomponline

Needs a rebase please

Jun 12 '24 19:06 tomponline

To bring the context, there were 3 issues:

ImageUploadPath was passed to the storage subsystem, which determined whether the image should be uploaded into backups directory or instance's volume. Now, the image is downloaded before the instance volume is created and conversion logic is passed as a filler to CreateVolume function.
ConversionOptions were passed around as a pointer instead of a nil slice.
Why are migration sink arguments public if structure is private? -> https://github.com/canonical/lxd/pull/13598

Jun 13 '24 08:06 MusicDin

@tomponline This is ready for another review.

Jun 13 '24 09:06 MusicDin

The idea was that if format is not enabled, the logic remains unchanged compared to how previous migration worked.

However, we can say that if server supports conversion and instance type is virtual-machine skip the filesystem. As through the lxd-migrate the VM's filesystem is never sent, but we need to keep the old migration in case the server does not support conversion.

Jul 04 '24 13:07 MusicDin

However, we can say that if server supports conversion and instance type is virtual-machine skip the filesystem. As through the lxd-migrate the VM's filesystem is never sent, but we need to keep the old migration in case the server does not support conversion.

I think my expectation is that if the server has the conversion then its used for both VMs and containers, as its simpler and avoids needing the migration API to evolve to support conversions.

Jul 04 '24 14:07 tomponline

I think my expectation is that if the server has the conversion then its used for both VMs and containers, as its simpler and avoids needing the migration API to evolve to support conversions.

Exactly, this is how it is currently implemented.

However, we can say that if server supports conversion and instance type is virtual-machine skip the filesystem. As through the lxd-migrate the VM's filesystem is never sent, but we need to keep the old migration in case the server does not support conversion.

Sorry, I may have confused you, but this was referring to skipping the unnecessary extra connection when we are importing virtual machines, because there is no filesystem to be transfered. However, if server does not support conversion we need to do that unnecessary connection to ensure backward compatibility.

Jul 04 '24 14:07 MusicDin

@tomponline Regarding the VolumeSize that is offered in a header when transferring a VM image. The VolumeSize that is offered represents the image file size. The offered volume size is used to ensure the (raw) image fits within the volume if the default pool size is not big enough.  

The header is offered during the filesystem transfer, which we currently skip if the conversion option format is enabled. This seems sensible, because the file size of image formats like QCow2 or VMDK does not correspond with the actual size of decompressed image, so we cannot ensure this is the "final" size. However, if we detect that the provided image is raw (in lxd-migrate tool) the conversion option format is automatically disabled - meaning the header is still offered. 

With that being said, do we still want to proceed with skipping the filesystem transfer in case the server supports conversion mode and instance type is VM?

Or, do we want to also offer the volume size as header in case format options is enabled? - In such case I think it does not make sense, because when the image is uploaded to the backups directory, we extract the actual size of decompressed file which we can use for the volume size.

Any thoughts?

Jul 09 '24 08:07 MusicDin

Checking the file outputs to ensure we properly catch the raw disk format and still have to check that partition/disk size is properly extracted.

Jul 15 '24 15:07 MusicDin

@tomponline

As far as I can tell, the file check for raw should be sufficient, but we can tune that later if there will be any edge case, as it is primarily used to skip unnecessary formatting (it is also used to prevent incorrect disk path input when mode is migration).

This is ready for review, except for linters in shared/util_linux.go which I'm currently looking into.

Jul 15 '24 16:07 MusicDin

And CodeQL is also complaining..

Jul 15 '24 16:07 MusicDin

There are two linter issues that I'm not sure about:

Error is not the last return argument. While this is true, this one feels intuitive as is. Similar to assertions (str, ok := x.(string))

shared/util_linux.go:175:1: error-return: error should be the last type when returning multiple items (revive)
func GetErrno(err error) (errno error, iserrno bool) {
        sysErr, ok := err.(*os.SyscallError)
        if ok {
                return sysErr.Err, true
        }

        pathErr, ok := err.(*os.PathError)
        if ok {
                return pathErr.Err, true
        }

        tmpErrno, ok := err.(unix.Errno)
        if ok {
                return tmpErrno, true
        }

        return nil, false
}

Another one complains about defers in loops.

lxd/storage/drivers/driver_btrfs_volumes.go:1296:6: defer: prefer not to defer inside loops (revive)
                                        defer func() { _ = d.setSubvolumeReadonlyProperty(parentPath, false) }()
                                        ^
lxd/storage/drivers/driver_btrfs_volumes.go:1308:5: defer: prefer not to defer inside loops (revive)
                                defer func() { _ = d.setSubvolumeReadonlyProperty(sourcePath, false) }()
                                ^
lxd/storage/drivers/driver_btrfs_volumes.go:1531:6: defer: prefer not to defer inside loops (revive)
                                        defer func() { _ = d.setSubvolumeReadonlyProperty(parentPath, false) }()

Jul 15 '24 18:07 MusicDin

Error is not the last return argument. While this is true, this one feels intuitive as is. Similar to assertions (str, ok := x.(string))

Yeah you can add an ignore for that one

Jul 16 '24 07:07 tomponline

Another one complains about defers in loops.

This seems fine to ignore as the snapshots will have their ro property set back once the migration has finished (or failed).

Jul 16 '24 07:07 tomponline

And CodeQL is also complaining..

We can ignore these for this PR as they are not new issues and are to do with validation improvements needed in other API endpoints which will be tracked via the Security tab.

Jul 16 '24 07:07 tomponline

As far as I can tell, the file check for raw should be sufficient, but we can tune that later if there will be any edge case, as it is primarily used to skip unnecessary formatting (it is also used to prevent incorrect disk path input when mode is migration).

Thanks, makes sense

Jul 16 '24 07:07 tomponline