Host assisted clone should adjust requested storage when cloning from block to filesystem volume mode (#3900)
When cloning from a source, the host cloner should check the requested size for the target tmp pvc if the tmp source pvc is of block volume mode and the target pvc is of filesystem volume mode because it should take filesystem overhead into consideration and enlarge the original request size accordingly before creating the tmp pvc.
Because currently the check is missing the cloning will fail on validation.
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
https://issues.redhat.com/browse/CNV-44140
Special notes for your reviewer:
Release note:
Fix: Host assisted clone should adjust requested storage when cloning from block to filesystem volume mode
Hi @gaohoward. Thanks for your PR.
PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please assign aglitke for approval. For more information see the Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/test all
coverage: 58.65% (+0.003%) from 58.647% when pulling 86651108a33762b1234a0a9230b1ee647e49a32f on gaohoward:cnv-44140 into 094020bca66a12babc8cfac0d303963f541a40c8 on kubevirt:main.
I am probably missing something, but didn't https://github.com/kubevirt/containerized-data-importer/pull/3384 fix this particular issue? Or did I miss a particular case with that PR?
I am probably missing something, but didn't #3384 fix this particular issue? Or did I miss a particular case with that PR?
Yeah I think it goes a different path. In this case it falls back to host-clone from smart-clone (supported by provisioner), and in that case it didn't take the fs overhead into account.
Gotcha, okay, let me know if you need a review on the PR.
Gotcha, okay, let me know if you need a review on the PR.
Yes pleas review. Thanks @awels !
/retest
/retest
/retest
/retest
/retest
@awels Hi Alexander, I finally get the tests pass, can you please take a look again?
Thanks
/retest
/retest
/retest
/retest
/retest
Thank you for tackling this! I would appreciate a link to the bug so I can do some testing and see the origin of the issue.
I am a little confused that we have to inflate in the populator layer; usually we do this in the mutating pvc webhook (on by default)/dv controllers
Here you are: https://issues.redhat.com/browse/CNV-44140
Yes the issue happens at the very late stage where the snapshot clone falls back to host assisted clone.
Thank you for tackling this! I would appreciate a link to the bug so I can do some testing and see the origin of the issue. I am a little confused that we have to inflate in the populator layer; usually we do this in the mutating pvc webhook (on by default)/dv controllers
Here you are: https://issues.redhat.com/browse/CNV-44140
Yes the issue happens at the very late stage where the snapshot clone falls back to host assisted clone.
I see, though I am still confused by inflating this far into the process. Doesn't the target PVC get inflated anyhow by the webhook/DV controller (and thus so does the temp)?
Also note that bug describes a separate issue that may still exist
$ oc create -f dv-hpp-cloned-from-datasource.yaml
Error from server: error when creating "dv-hpp-cloned-from-datasource.yaml": admission webhook "datavolume-validate.cdi.kubevirt.io" denied the request: Storage size is missing
/cc @jpeimer
@akalenyu: GitHub didn't allow me to request PR reviews from the following users: jpeimer.
Note that only kubevirt members and repo collaborators can review this PR, and authors cannot review their own PRs.
In response to this:
/cc @jpeimer
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Thank you for tackling this! I would appreciate a link to the bug so I can do some testing and see the origin of the issue. I am a little confused that we have to inflate in the populator layer; usually we do this in the mutating pvc webhook (on by default)/dv controllers
Here you are: https://issues.redhat.com/browse/CNV-44140 Yes the issue happens at the very late stage where the snapshot clone falls back to host assisted clone.
I see, though I am still confused by inflating this far into the process. Doesn't the target PVC get inflated anyhow by the webhook/DV controller (and thus so does the temp)?
No the target PVC doesn't get inflated when doing snapshot clone. When the controller plans the clone it finds out the storage class is different from the volume snapshot's one and then it fall back to host assisted clone. When doing host clone phase it create the tmp target PVC, if we don't check here then it will report the failure because the target size is not enough (for this to happen the tmp target PVC volume mode must be filesystem).
dv-hpp-cloned-from-datasource.yaml
It doesn't happen for me on upstream. Can you share your yaml file?
dv-hpp-cloned-from-datasource.yaml
It doesn't happen for me on upstream. Can you share your yaml file?
Yeah I don't see it either, may have to do with an older version
Thank you for tackling this! I would appreciate a link to the bug so I can do some testing and see the origin of the issue. I am a little confused that we have to inflate in the populator layer; usually we do this in the mutating pvc webhook (on by default)/dv controllers
Here you are: https://issues.redhat.com/browse/CNV-44140 Yes the issue happens at the very late stage where the snapshot clone falls back to host assisted clone.
I see, though I am still confused by inflating this far into the process. Doesn't the target PVC get inflated anyhow by the webhook/DV controller (and thus so does the temp)?
No the target PVC doesn't get inflated when doing snapshot clone. When the controller plans the clone it finds out the storage class is different from the volume snapshot's one and then it fall back to host assisted clone. When doing host clone phase it create the tmp target PVC, if we don't check here then it will report the failure because the target size is not enough (for this to happen the tmp target PVC volume mode must be filesystem).
I was pretty sure we inflate filesystem target PVCs here - https://github.com/kubevirt/containerized-data-importer/blob/e147bbdf17252d9bd3b29738cf12c0734c9e6884/pkg/controller/datavolume/util.go#L314 Sorry for poking I just think we have a problem elsewhere and that inflating the temp PVC in the populator will just cover that up
dv-hpp-cloned-from-datasource.yaml
It doesn't happen for me on upstream. Can you share your yaml file?
Yeah I don't see it either, may have to do with an older version
Thank you for tackling this! I would appreciate a link to the bug so I can do some testing and see the origin of the issue. I am a little confused that we have to inflate in the populator layer; usually we do this in the mutating pvc webhook (on by default)/dv controllers
Here you are: https://issues.redhat.com/browse/CNV-44140 Yes the issue happens at the very late stage where the snapshot clone falls back to host assisted clone.
I see, though I am still confused by inflating this far into the process. Doesn't the target PVC get inflated anyhow by the webhook/DV controller (and thus so does the temp)?
No the target PVC doesn't get inflated when doing snapshot clone. When the controller plans the clone it finds out the storage class is different from the volume snapshot's one and then it fall back to host assisted clone. When doing host clone phase it create the tmp target PVC, if we don't check here then it will report the failure because the target size is not enough (for this to happen the tmp target PVC volume mode must be filesystem).
I was pretty sure we inflate filesystem target PVCs here -
https://github.com/kubevirt/containerized-data-importer/blob/e147bbdf17252d9bd3b29738cf12c0734c9e6884/pkg/controller/datavolume/util.go#L314 Sorry for poking I just think we have a problem elsewhere and that inflating the target in the populator will just cover that up
This method is never called in this case. Actually the webhook rendering is not enabled. For snapshot cloning it basically fallback to host-assisted during the planning phase, and when the host-clone phase is handling the temp target pvc creation, it has a chance to make sure the target size is right (inflate if necessary).
dv-hpp-cloned-from-datasource.yaml
It doesn't happen for me on upstream. Can you share your yaml file?
Yeah I don't see it either, may have to do with an older version
Thank you for tackling this! I would appreciate a link to the bug so I can do some testing and see the origin of the issue. I am a little confused that we have to inflate in the populator layer; usually we do this in the mutating pvc webhook (on by default)/dv controllers
Here you are: https://issues.redhat.com/browse/CNV-44140 Yes the issue happens at the very late stage where the snapshot clone falls back to host assisted clone.
I see, though I am still confused by inflating this far into the process. Doesn't the target PVC get inflated anyhow by the webhook/DV controller (and thus so does the temp)?
No the target PVC doesn't get inflated when doing snapshot clone. When the controller plans the clone it finds out the storage class is different from the volume snapshot's one and then it fall back to host assisted clone. When doing host clone phase it create the tmp target PVC, if we don't check here then it will report the failure because the target size is not enough (for this to happen the tmp target PVC volume mode must be filesystem).
I was pretty sure we inflate filesystem target PVCs here - https://github.com/kubevirt/containerized-data-importer/blob/e147bbdf17252d9bd3b29738cf12c0734c9e6884/pkg/controller/datavolume/util.go#L314
Sorry for poking I just think we have a problem elsewhere and that inflating the target in the populator will just cover that up
This method is never called in this case. Actually the webhook rendering is not enabled. For snapshot cloning it basically fallback to host-assisted during the planning phase, and when the host-clone phase is handling the temp target pvc creation, it has a chance to make sure the target size is right (inflate if necessary).
Webhook rendering is enabled since a few releases now, but anyway renderPvcSpecVolumeSize existed prior to it for the DV storage API. How come the target doesn't get inflated?
I think the real issue here is that we don't inflate the PVC during mutation if it has a data source: https://github.com/kubevirt/containerized-data-importer/blob/e1bfde033dc8c83ecdbfea0e555f35f42704c03a/pkg/controller/datavolume/util.go#L65-L88
https://github.com/kubevirt/containerized-data-importer/blob/e1bfde033dc8c83ecdbfea0e555f35f42704c03a/pkg/controller/datavolume/util.go#L246-L255
We simply set the restore size if it's available and leave it as is otherwise.