
Add expected_hash to DataVolume

dankenigsberg opened this issue 4 years ago • 19 comments

/kind enhancement

QEMU images may be garbled en route from an HTTP server to CDI's target. To detect this condition, many servers (e.g. Fedora's) provide the expected checksum of the image.

I would like to state the expected hash in the DataVolume spec, and have the import fail if any of the provided hashes does not match the imported data.

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: "example-import-dv"
spec:
  source:
    http:
      url: "http://mirror.isoc.org.il/pub/fedora/releases/33/Cloud/x86_64/images/Fedora-Cloud-Base-33-1.2.x86_64.raw.xz"
    hashes:
      sha256: 35fa778f5d4830b58f7baf121fff6bd2b52500411c9abc46761b29a690415c3f
      length: 203308980

dankenigsberg avatar Dec 08 '20 15:12 dankenigsberg

This stackoverflow answer may provide an elegant way to integrate hashing into our stream readers.
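
Roughly, the pattern looks like this minimal Go sketch (the URL and destination are placeholders, not CDI code): wrapping the response body in an io.TeeReader feeds every byte read into a hash.Hash, so the digest is computed as a side effect of the normal copy.

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "net/http"
)

func main() {
    resp, err := http.Get("http://example.com/image.raw.xz") // placeholder URL
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    hasher := sha256.New()
    // Every byte read from the body is also written to the hasher.
    body := io.TeeReader(resp.Body, hasher)

    // Stand-in for the real destination (disk image, conversion pipeline, ...).
    if _, err := io.Copy(io.Discard, body); err != nil {
        panic(err)
    }
    fmt.Println("sha256:", hex.EncodeToString(hasher.Sum(nil)))
}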

aglitke avatar Dec 14 '20 21:12 aglitke

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot avatar Mar 14 '21 21:03 kubevirt-bot

/remove-lifecycle stale

dankenigsberg avatar Mar 15 '21 04:03 dankenigsberg

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot avatar Jun 13 '21 04:06 kubevirt-bot

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot avatar Sep 11 '21 05:09 kubevirt-bot

/remove-lifecycle stale

@mhenriks would you explain here why this is nontrivial to implement?

dankenigsberg avatar Sep 12 '21 09:09 dankenigsberg

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot avatar Dec 11 '21 10:12 kubevirt-bot

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

kubevirt-bot avatar Jan 10 '22 10:01 kubevirt-bot

/remove-lifecycle rotten

dankenigsberg avatar Jan 10 '22 10:01 dankenigsberg

/lifecycle frozen

I want this too :)

I think this is essential. For containerDisks this is kind of built-in (at least I hope that skopeo does these checks).

rmohr avatar Jan 13 '22 16:01 rmohr

This stackoverflow answer may provide an elegant way to integrate hashing into our stream readers.

Can confirm that this pattern works great. We use it for instance here: https://github.com/kubevirt/containerdisks/blob/main/pkg/http/http.go#L38

rmohr avatar Jan 13 '22 16:01 rmohr

@rmohr @dankenigsberg

Can confirm that this pattern works great. We use it for instance here: https://github.com/kubevirt/containerdisks/blob/main/pkg/http/http.go#L38

IMO it is not secure to compute the checksum at t0 and assume it is the same at t1, especially with an http (not https) URL.

The only truly secure way to do this is to download the file to scratch space, compute the checksum (which can be done while downloading), then use the downloaded file. This is definitely something we can do, but it will make certain operations slower (http(s) qcow2 or raw).
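
As a sketch of that flow (the function name and argument handling are illustrative, not CDI's importer code): stream to the scratch file and a hasher in a single pass with io.MultiWriter, and hand the file to the next stage only if the digest matches.

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "net/http"
    "os"
)

// downloadAndVerify streams url into dest while hashing in the same pass,
// and fails if the digest does not match expectedSHA256. Only when it
// returns nil should dest be handed to the next stage.
func downloadAndVerify(url, dest, expectedSHA256 string) error {
    resp, err := http.Get(url)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    f, err := os.Create(dest)
    if err != nil {
        return err
    }
    defer f.Close()

    hasher := sha256.New()
    // One pass: each chunk goes to both the scratch file and the hasher.
    if _, err := io.Copy(io.MultiWriter(f, hasher), resp.Body); err != nil {
        return err
    }

    if got := hex.EncodeToString(hasher.Sum(nil)); got != expectedSHA256 {
        return fmt.Errorf("checksum mismatch: got %s, want %s", got, expectedSHA256)
    }
    return nil
}

func main() {
    // Usage: prog <url> <scratch-path> <expected-sha256>
    if err := downloadAndVerify(os.Args[1], os.Args[2], os.Args[3]); err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
}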

mhenriks avatar Jan 24 '22 14:01 mhenriks

IMO it is not secure to compute the checksum at t0 and assume it is the same at t1, especially with an http (not https) URL.

Could you explain what you mean? What is the difference between computing the checksum during the download (streaming) and first downloading and then computing it? I am not suggesting computing the checksum from the remote location first and then downloading the file.

Edit: Oh, you are probably talking about directly pushing and releasing while it streams through. Yes, right. If the target has no toggle which can be triggered after the import to make it usable, then yes, definitely.

rmohr avatar Jan 24 '22 15:01 rmohr

Edit: Oh, you are probably talking about directly pushing and releasing while it streams through. Yes, right. If the target has no toggle which can be triggered after the import to make it usable, then yes, definitely.

Yeah, for example, we can currently stream a qcow2 file directly to raw via http. This requires no scratch space. Validating the qcow2 checksum would require downloading the qcow2 file to scratch space.

mhenriks avatar Jan 24 '22 16:01 mhenriks

Yeah, for example, we can currently stream a qcow2 file directly to raw via http. This requires no scratch space. Validating the qcow2 checksum would require downloading the qcow2 file to scratch space.

I am not sure I understand this example. Are you talking about cases where you internally use a tool which requires an http source and does the download itself? Otherwise it seems to me that you could always calculate the checksum while you convert, with a tee reader, or not?

You could still fail the import after the full download where you finally know that the checksum is bad.

rmohr avatar Jan 24 '22 16:01 rmohr

I am not sure I understand this example. Are you talking about cases where you internally use a tool which requires an http source and does the download itself? Otherwise it seems to me that you could always calculate the checksum while you convert, with a tee reader, or not?

nbdkit is used to give CDI one interface to a bunch of files/formats on the other end of a URL. My assumption is that when a checksum is provided, CDI will first download/validate the file, then point nbdkit to the local file rather than http.

mhenriks avatar Jan 24 '22 16:01 mhenriks

nbdkit is used to give CDI one interface to a bunch of files/formats on the other end of a URL. My assumption is that when a checksum is provided, CDI will first download/validate the file, then point nbdkit to the local file rather than http.

In this case I would probably aim for providing sockets or file descriptors (e.g. pipes) to nbdkit (unless nbdkit supports checksums directly).

rmohr avatar Jan 24 '22 16:01 rmohr

In this case I would probably aim for providing sockets or file descriptors to nbdkit (unless nbdkit supports checksums directly).

If nbdkit does not access the file sequentially, I don't think we will be able to efficiently compute checksums. And I'm pretty sure it does not access qcow sequentially.

mhenriks avatar Jan 24 '22 16:01 mhenriks

If nbdkit does not access the file sequentially, I don't think we will be able to efficiently compute checksums. And I'm pretty sure it does not access qcow sequentially.

If this is needed, definitely :)

rmohr avatar Jan 24 '22 16:01 rmohr

Would a Merkle hash be better than a linear hash? See, for example, https://listman.redhat.com/archives/libguestfs/2023-February/030908.html, which describes how blkhash achieves faster hash results by using a Merkle tree.
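
For illustration only, here is a simplified two-level block-hash scheme in Go (not blkhash's actual construction): hash fixed-size blocks independently, then hash the concatenated block digests. Because the block hashes are independent, they can be computed in parallel or out of order, which a linear hash cannot.

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

const blockSize = 64 * 1024 // illustrative only; blkhash defines its own block size

// blockDigest hashes fixed-size blocks independently, then hashes the
// concatenation of the block digests. Unlike a linear hash, which must
// consume the bytes sequentially, each block can be hashed in any order.
func blockDigest(data []byte) string {
    top := sha256.New()
    for off := 0; off < len(data); off += blockSize {
        end := off + blockSize
        if end > len(data) {
            end = len(data)
        }
        h := sha256.Sum256(data[off:end])
        top.Write(h[:])
    }
    return hex.EncodeToString(top.Sum(nil))
}

func main() {
    fmt.Println(blockDigest(make([]byte, 3*blockSize+123)))
}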

ebblake avatar Mar 01 '23 17:03 ebblake

Hey @mhenriks, do you think this would make sense in the context of populators?

alromeros avatar Aug 02 '23 13:08 alromeros

Created a Jira card to track this issue: https://issues.redhat.com/browse/CNV-31631. Since a single VolumeImportSource can be used for several PVCs, I think this might be more useful for populators.

alromeros avatar Aug 02 '23 14:08 alromeros

Hey @dankenigsberg, following some discussions with the team, we've concluded that this feature would require a significant divergence from our import flow (https://github.com/kubevirt/containerized-data-importer/issues/1520#issuecomment-1020189814), which seems detrimental for most use cases and overkill for a low-priority feature. I'm closing this issue and moving the story to the backlog, but feel free to reopen if necessary. Thanks!

alromeros avatar Oct 23 '23 09:10 alromeros