containerized-data-importer icon indicating copy to clipboard operation
containerized-data-importer copied to clipboard

CDI importer converting image which is already RAW

Open waldner opened this issue 1 year ago • 8 comments

What happened:

When creating a DV using an HTTP URL as source to import an image in RAW format, the importer runs a conversion pass on it. The relevant log lines:

...
I1011 13:00:14.111593       1 prometheus.go:78] 99.40
I1011 13:00:15.111696       1 prometheus.go:78] 99.80
I1011 13:00:16.111891       1 prometheus.go:78] 100.00
I1011 13:00:16.638904       1 data-processor.go:247] New phase: Convert
I1011 13:00:16.638960       1 data-processor.go:253] Validating image
E1011 13:00:16.655763       1 prlimit.go:156] failed to kill the process; os: process already finished
I1011 13:00:16.655975       1 qemu.go:354] Adding preallocation method: [-o preallocation=falloc]
I1011 13:00:16.656036       1 qemu.go:358] Attempting preallocation method, qemu-img args: [convert -o preallocation=falloc -t writeback -p -O raw /scratch/tmpimage /data/disk.img]
I1011 13:00:16.668451       1 qemu.go:273] 0.00
I1011 13:00:18.530061       1 qemu.go:273] 1.00
...

As said, the image is already RAW; the import over http is already extremely slow despite the http server being located on the same lan; it'd be better not to add yet even more extra time for a no-op conversion to the VM start time.

What you expected to happen:

The mage should be used as-is, or there should be a way to tell the importer to skip the (no-op) conversion.

Additional context:

Here's the DV template from the VM definition:

    dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: myvm-cdisk
      spec:
        preallocation: true
        pvc:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi
        source:
          http:
            url: http://10.11.12.10:9999/assets/rawimg.img

Environment:

  • CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.60.3
  • Kubernetes version (use kubectl version): 1.31.1
  • DV specification: see above

waldner avatar Oct 11 '24 13:10 waldner

If the source is already RAW, then the conversion is essentially a no-op, and it is just copying the data.

awels avatar Oct 11 '24 13:10 awels

Sure, but it still wastes time before the VM can start. Since the initial image is rather large, the (already extremely slow) process of importing it becomes even slower. Also I might be misremembering, but I think this was not happening in older versions.

waldner avatar Oct 11 '24 13:10 waldner

In fact, why is a conversion needed at all? Qemu and libvirt can work with different image types, so even if the imported image were (say) QCOW2 it could still be used as-is. What is the rationale behind the conversion process? I think the user should be given more control over it.

waldner avatar Oct 14 '24 15:10 waldner

So we do the conversion because we want all the images to be raw. This allows us to directly use built in kubernetes APIs and worry about different formats. This also allows us to use the built in capabilities of the kubernets platform (snapshots, etc). We could use qcow2 format most likely, but now we need a different approach when we want to use that with block devices. It just makes everything more complex. This approach has served us really well and we are unlikely to diverge.

awels avatar Oct 14 '24 19:10 awels

thanks for the explanation; still, I maintain that a raw-to-raw conversion is unneeded and can delay the VM startup by a few minutes if the image is big. I don't know the storage details, but it might also waste space if both images need to be stored at the same time, at least for the duration of the conversion (I might be wrong on this last one though).

waldner avatar Oct 14 '24 19:10 waldner

I agree with you, I just double checked the code as I was like, there is no reason to not write directly to the target once we determine this is a raw file. But in the code we write to scratch space first and then write to the target. I have to go look up the history of why it is like this. There might be some validation on sizes and stuff that we can only get from writing to scratch space first and then copying the data.

awels avatar Oct 14 '24 20:10 awels

One reason we do the conversion unconditionally is so that we can respect the sparseness of the image on the target PVC. In the past we did not have a sparse writer and the so the only way to do this was with explicit conversion. Now @mhenriks has provided a sparse writer so it could be possible to write directly to the target while respecting sparseness. Alvaro is going to take a look and see if this can be done...

/assign @alromeros

aglitke avatar Oct 21 '24 12:10 aglitke

Thanks. Either way, I would give the user more power to decide, if they want to trade optimizations for VM startup speed, they should be able to do it.

waldner avatar Oct 21 '24 22:10 waldner

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot avatar Jan 19 '25 23:01 kubevirt-bot

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

kubevirt-bot avatar Feb 18 '25 23:02 kubevirt-bot

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

kubevirt-bot avatar Mar 21 '25 00:03 kubevirt-bot

@kubevirt-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kubevirt-bot avatar Mar 21 '25 00:03 kubevirt-bot

/reopen

QcFe avatar Apr 03 '25 17:04 QcFe

@QcFe: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kubevirt-bot avatar Apr 03 '25 17:04 kubevirt-bot

/reopen

mhenriks avatar Apr 03 '25 18:04 mhenriks

@mhenriks: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kubevirt-bot avatar Apr 03 '25 18:04 kubevirt-bot

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

kubevirt-bot avatar May 03 '25 19:05 kubevirt-bot

@kubevirt-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kubevirt-bot avatar May 03 '25 19:05 kubevirt-bot