CDI importer converting image which is already RAW
What happened:
When creating a DataVolume (DV) using an HTTP URL as the source to import an image that is already in RAW format, the importer still runs a conversion pass on it. The relevant log lines:
...
I1011 13:00:14.111593 1 prometheus.go:78] 99.40
I1011 13:00:15.111696 1 prometheus.go:78] 99.80
I1011 13:00:16.111891 1 prometheus.go:78] 100.00
I1011 13:00:16.638904 1 data-processor.go:247] New phase: Convert
I1011 13:00:16.638960 1 data-processor.go:253] Validating image
E1011 13:00:16.655763 1 prlimit.go:156] failed to kill the process; os: process already finished
I1011 13:00:16.655975 1 qemu.go:354] Adding preallocation method: [-o preallocation=falloc]
I1011 13:00:16.656036 1 qemu.go:358] Attempting preallocation method, qemu-img args: [convert -o preallocation=falloc -t writeback -p -O raw /scratch/tmpimage /data/disk.img]
I1011 13:00:16.668451 1 qemu.go:273] 0.00
I1011 13:00:18.530061 1 qemu.go:273] 1.00
...
As said, the image is already RAW; the import over HTTP is already extremely slow despite the HTTP server being on the same LAN, so it would be better not to add even more time to the VM start for a no-op conversion.
What you expected to happen:
The image should be used as-is, or there should be a way to tell the importer to skip the (no-op) conversion.
Additional context:
Here's the DV template from the VM definition:
dataVolumeTemplates:
- metadata:
    creationTimestamp: null
    name: myvm-cdisk
  spec:
    preallocation: true
    pvc:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi
    source:
      http:
        url: http://10.11.12.10:9999/assets/rawimg.img
Environment:
- CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.60.3
- Kubernetes version (use kubectl version): 1.31.1
- DV specification: see above
If the source is already RAW, then the conversion is essentially a no-op, and it is just copying the data.
Sure, but it still wastes time before the VM can start. Since the initial image is rather large, the (already extremely slow) process of importing it becomes even slower. Also, I might be misremembering, but I think this was not happening in older versions.
In fact, why is a conversion needed at all? QEMU and libvirt can work with different image types, so even if the imported image were, say, QCOW2, it could still be used as-is. What is the rationale behind the conversion process? I think the user should be given more control over it.
So we do the conversion because we want all the images to be raw. This allows us to directly use built-in Kubernetes APIs and not worry about different formats. It also lets us use the built-in capabilities of the Kubernetes platform (snapshots, etc.). We could most likely use the QCOW2 format, but then we would need a different approach when using it with block devices. It just makes everything more complex. This approach has served us really well and we are unlikely to diverge from it.
Thanks for the explanation; still, I maintain that a raw-to-raw conversion is unneeded and can delay VM startup by a few minutes if the image is big. I don't know the storage details, but it might also waste space if both images need to be stored at the same time, at least for the duration of the conversion (though I might be wrong on this last point).
I agree with you. I just double-checked the code because, like you, I saw no reason not to write directly to the target once we determine this is a raw file. But in the code we write to scratch space first and then write to the target. I have to go look up the history of why it is like this. There might be some validation on sizes and such that we can only do by writing to scratch space first and then copying the data.
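For illustration, determining whether the source is already raw could look roughly like the following minimal Go sketch. It shells out to qemu-img info --output=json (a real qemu-img option that reports the detected format); the helper name, error handling, and the hard-coded path are illustrative, not CDI's actual code:

package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

// imgInfo captures the subset of `qemu-img info --output=json` output we need.
type imgInfo struct {
	Format string `json:"format"`
}

// detectFormat asks qemu-img to identify the on-disk format of an image
// (e.g. "raw" or "qcow2").
func detectFormat(path string) (string, error) {
	out, err := exec.Command("qemu-img", "info", "--output=json", path).Output()
	if err != nil {
		return "", fmt.Errorf("qemu-img info failed: %w", err)
	}
	var info imgInfo
	if err := json.Unmarshal(out, &info); err != nil {
		return "", err
	}
	return info.Format, nil
}

func main() {
	// "/scratch/tmpimage" matches the scratch path seen in the importer log above.
	format, err := detectFormat("/scratch/tmpimage")
	if err != nil {
		panic(err)
	}
	if format == "raw" {
		fmt.Println("already raw: a convert pass would just copy the data")
	} else {
		fmt.Printf("format %s: conversion to raw is needed\n", format)
	}
}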
One reason we do the conversion unconditionally is so that we can respect the sparseness of the image on the target PVC. In the past we did not have a sparse writer, and so the only way to do this was with an explicit conversion. Now @mhenriks has provided a sparse writer, so it could be possible to write directly to the target while respecting sparseness. Alvaro is going to take a look and see if this can be done...
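To make the idea concrete, the core of such a sparse writer can be sketched as below: skip blocks that are all zeroes by seeking past them, leaving holes in the target file. This is a minimal illustration of the technique under those assumptions, not @mhenriks's actual implementation:

package main

import (
	"bytes"
	"io"
	"os"
)

// sparseCopy copies src to dst, seeking over all-zero blocks instead of
// writing them, so the target stays sparse on filesystems that support holes.
func sparseCopy(dst *os.File, src io.Reader, blockSize int) error {
	buf := make([]byte, blockSize)
	zeroes := make([]byte, blockSize)
	var written int64
	for {
		n, err := io.ReadFull(src, buf)
		if n > 0 {
			if bytes.Equal(buf[:n], zeroes[:n]) {
				// Zero block: leave a hole instead of writing data.
				if _, serr := dst.Seek(int64(n), io.SeekCurrent); serr != nil {
					return serr
				}
			} else {
				if _, werr := dst.Write(buf[:n]); werr != nil {
					return werr
				}
			}
			written += int64(n)
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			break
		}
		if err != nil {
			return err
		}
	}
	// Truncate so trailing holes still count toward the file size.
	return dst.Truncate(written)
}

func main() {
	f, err := os.Create("disk.img")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if err := sparseCopy(f, os.Stdin, 64*1024); err != nil {
		panic(err)
	}
}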
/assign @alromeros
Thanks. Either way, I would give the user more power to decide: if they want to trade optimizations for VM startup speed, they should be able to do so.
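For illustration only, such an opt-out could be expressed as an annotation on the DV template shown above. The annotation name below is purely hypothetical; CDI exposes no such knob today, which is exactly what this issue is asking for:

dataVolumeTemplates:
- metadata:
    name: myvm-cdisk
    annotations:
      # HYPOTHETICAL annotation, not an existing CDI API:
      # skip the convert pass when the source is already raw
      cdi.kubevirt.io/storage.import.skipRawConversion: "true"
  spec:
    ...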
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
/close
@kubevirt-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/reopen
@QcFe: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/reopen
@mhenriks: Reopened this issue.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
/close
@kubevirt-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.