
backend-storage: add RWO FS support

Open · jean-edouard opened this pull request 1 year ago • 15 comments

What this PR does

See https://github.com/kubevirt/community/pull/314

Before this PR:

After this PR:

Fixes #

Why we need it and why it was done in this way

The following tradeoffs were made:

The following alternatives were considered:

Links to places where the discussion took place:

Special notes for your reviewer

Checklist

This checklist is not enforced, but it is a reminder of items that could be relevant to every PR. Approvers are expected to review this list.

Release note

backend-storage now supports RWO FS

jean-edouard avatar Aug 19 '24 15:08 jean-edouard
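For context on the release note above, here is a minimal sketch of a VirtualMachineInstance that relies on backend-storage for persistent TPM and EFI state, which is the kind of VM this change targets. It is not code from this PR: the field names follow my reading of the kubevirt.io/api/core/v1 Go types and should be treated as assumptions to verify, and the note about vmStateStorageClass refers to existing KubeVirt configuration rather than anything introduced here.

```go
// A sketch only, not code from this PR: a VMI whose TPM and EFI state is
// persisted by backend-storage. With RWO FS support, the PVC that
// backend-storage creates for this state can come from a storage class that
// only offers ReadWriteOnce Filesystem volumes, which is what this PR adds.
package main

import (
	"fmt"

	kvv1 "kubevirt.io/api/core/v1"
)

func boolPtr(b bool) *bool { return &b }

func main() {
	vmi := &kvv1.VirtualMachineInstance{}
	vmi.Name = "vm-with-persistent-state"

	// Persistent TPM state lives in the backend-storage PVC.
	vmi.Spec.Domain.Devices.TPM = &kvv1.TPMDevice{Persistent: boolPtr(true)}

	// Persistent EFI NVRAM is kept there as well.
	vmi.Spec.Domain.Firmware = &kvv1.Firmware{
		Bootloader: &kvv1.Bootloader{
			EFI: &kvv1.EFI{SecureBoot: boolPtr(false), Persistent: boolPtr(true)},
		},
	}

	// Which storage class backs that PVC is a cluster-level setting
	// (vmStateStorageClass in the KubeVirt CR configuration), not a VMI field.
	fmt.Printf("TPM: %+v, EFI: %+v\n",
		vmi.Spec.Domain.Devices.TPM, vmi.Spec.Domain.Firmware.Bootloader.EFI)
}
```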

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

kubevirt-bot avatar Aug 19 '24 15:08 kubevirt-bot

/test all

jean-edouard avatar Aug 19 '24 15:08 jean-edouard

/test all

jean-edouard avatar Aug 19 '24 17:08 jean-edouard

/test all

jean-edouard avatar Aug 19 '24 20:08 jean-edouard

/test all

jean-edouard avatar Aug 26 '24 18:08 jean-edouard

/test all

jean-edouard avatar Aug 27 '24 13:08 jean-edouard

/cc @acardace

jean-edouard avatar Sep 12 '24 13:09 jean-edouard

/retest

jean-edouard avatar Sep 17 '24 12:09 jean-edouard

/retest

jean-edouard avatar Oct 02 '24 21:10 jean-edouard

@jean-edouard Left another review; still some things to adjust, but I think we're almost there!

acardace avatar Oct 04 '24 10:10 acardace

@dhiller is there a way to mark tests as incompatible with the flake-checker? I duplicated 4 tests into sig-compute to test the behavior when NFS is absent. They fail on the flake checker since its environment does have NFS... Thanks!

jean-edouard avatar Oct 08 '24 00:10 jean-edouard
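For readers following this thread, here is a minimal Ginkgo sketch of the kind of test labeling being discussed. The label name no-flake-check and the --label-filter expression are hypothetical; the actual label and how automation/repeated_test.sh consumes it are whatever this PR and its follow-ups settle on.

```go
// A sketch of the labeling idea discussed above; the label name and filter
// expression are hypothetical, not necessarily what this PR uses.
package tests_test

import (
	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

// Hypothetical label; the flake checker would exclude specs carrying it, e.g.
// by running ginkgo with --label-filter='!no-flake-check'.
const labelNoFlakeCheck = "no-flake-check"

var _ = Describe("[sig-compute] backend-storage without NFS", func() {
	It("uses an RWO FS PVC when no RWX storage class is available",
		Label(labelNoFlakeCheck), func() {
			// The flake-check environment provides NFS, so this scenario cannot
			// be exercised there; on a cluster without NFS it is expected to pass.
			Expect(true).To(BeTrue()) // placeholder for the real assertions
		})
})
```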

/approve Thank you @jean-edouard !

acardace avatar Oct 16 '24 07:10 acardace

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: acardace

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

kubevirt-bot avatar Oct 16 '24 07:10 kubevirt-bot

/test all

acardace avatar Oct 18 '24 07:10 acardace

@jean-edouard kudos for the great idea to add a separate label for flake-check compatibility.

I figure we need to add the label here: https://github.com/kubevirt/kubevirt/blob/main/automation/repeated_test.sh#L199

Would you be up to creating a follow-up PR marking all the other tests that have been filtered out in other ways? (Since I just came back, I might have that in my emails - apologies for pinging you in that case)

dhiller avatar Oct 21 '24 11:10 dhiller

> @jean-edouard kudos for the great idea to add a separate label for flake-check compatibility.
>
> I figure we need to add the label here: https://github.com/kubevirt/kubevirt/blob/main/automation/repeated_test.sh#L199

Isn't that what I already do? :)

> Would you be up to creating a follow-up PR marking all the other tests that have been filtered out in other ways? (Since I just came back, I might have that in my emails - apologies for pinging you in that case)

Sure, do you have any pointers?

jean-edouard avatar Oct 21 '24 12:10 jean-edouard

/test pull-kubevirt-check-tests-for-flakes

fossedihelm avatar Oct 22 '24 07:10 fossedihelm

>> @jean-edouard kudos for the great idea to add a separate label for flake-check compatibility. I figure we need to add the label here: https://github.com/kubevirt/kubevirt/blob/main/automation/repeated_test.sh#L199
>
> Isn't that what I already do? :)

Yeah - glancing at this PR I just missed it, you are right :blush:

>> Would you be up to creating a follow-up PR marking all the other tests that have been filtered out in other ways? (Since I just came back, I might have that in my emails - apologies for pinging you in that case)
>
> Sure, do you have any pointers?

I was thinking about what is currently achieved with the exclude-native-ssh label, but maybe @0xFelix has a better POV here about whether this makes sense. It wouldn't hurt to broadcast this new label anyway, so that everyone knows, e.g. by adding a paragraph here: https://github.com/kubevirt/kubevirt/blob/91e5d7899442f97b2635c82bd0d446c92969dc99/tests/README.md?plain=1#L3

dhiller avatar Oct 22 '24 10:10 dhiller

Thank you for the patience @jean-edouard /lgtm

fossedihelm avatar Oct 22 '24 12:10 fossedihelm

Required labels detected, running phase 2 presubmits:

/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-1.30-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.30-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.29-sig-network
/test pull-kubevirt-e2e-k8s-1.29-sig-storage
/test pull-kubevirt-e2e-k8s-1.29-sig-compute
/test pull-kubevirt-e2e-k8s-1.29-sig-operator
/test pull-kubevirt-e2e-k8s-1.30-sig-network
/test pull-kubevirt-e2e-k8s-1.30-sig-storage
/test pull-kubevirt-e2e-k8s-1.30-sig-compute
/test pull-kubevirt-e2e-k8s-1.30-sig-operator

kubevirt-commenter-bot avatar Oct 22 '24 12:10 kubevirt-commenter-bot

/retest-required

stu-gott avatar Oct 22 '24 17:10 stu-gott

@jean-edouard: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubevirt-e2e-k8s-1.30-sig-storage | 60b5db4bd102f27059a24a25ee7f8423bd3f98ab | link | unknown | /test pull-kubevirt-e2e-k8s-1.30-sig-storage |
| pull-kubevirt-e2e-k8s-1.31-sig-compute-migrations | 60b5db4bd102f27059a24a25ee7f8423bd3f98ab | link | true | /test pull-kubevirt-e2e-k8s-1.31-sig-compute-migrations |
| pull-kubevirt-e2e-k8s-1.31-sig-network | 60b5db4bd102f27059a24a25ee7f8423bd3f98ab | link | true | /test pull-kubevirt-e2e-k8s-1.31-sig-network |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

kubevirt-bot avatar Oct 22 '24 19:10 kubevirt-bot

/test pull-kubevirt-e2e-k8s-1.31-sig-network
/test pull-kubevirt-e2e-k8s-1.31-sig-compute-migrations

alromeros avatar Oct 22 '24 19:10 alromeros

@jean-edouard The cache fixes are required in previous versions

xpivarc avatar Nov 01 '24 14:11 xpivarc

@jean-edouard @acardace I only noticed this PR now; I didn't expect it would be merged while there are so many disagreements on the proposal! This PR is also affected by the race mentioned in the proposal (https://issues.redhat.com/browse/CNV-50720) and by https://github.com/kubevirt/kubevirt/issues/13112

vladikr avatar Nov 05 '24 14:11 vladikr

@jean-edouard could you please open an upstream issue for https://issues.redhat.com/browse/CNV-50720

vladikr avatar Nov 05 '24 14:11 vladikr

> @jean-edouard could you please open an upstream issue for https://issues.redhat.com/browse/CNV-50720

Wouldn't it be basically the same as https://github.com/kubevirt/kubevirt/issues/13112 ?

jean-edouard avatar Nov 05 '24 16:11 jean-edouard

>> @jean-edouard could you please open an upstream issue for https://issues.redhat.com/browse/CNV-50720
>
> Wouldn't it be basically the same as #13112 ?

It's similar: even if we fix #13112, we will still have a problem when both the source and the target crash.

In #13112, the controller can observe that the target pod is happily running on the target already, so it can finalize the migration (one possible solution).

In the case when both the source and the target are gone, we don't know how to finalize the migration, because we don't know which volume was updated last.

As it is today, if the source pod is gone without communicating the results to the handler, we cannot tell which volume to choose when the VM starts again.

vladikr avatar Nov 05 '24 18:11 vladikr
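To make the failure mode vladikr describes concrete, here is a rough sketch of the decision a controller would face when finalizing the migration of backend-storage state. The types and names are hypothetical, not KubeVirt code; only the branching mirrors the scenarios described above.

```go
// Hypothetical types and logic, not KubeVirt code: the choice a controller
// would have to make when finalizing a migration of backend-storage state.
package main

import (
	"errors"
	"fmt"
)

type migrationState struct {
	sourcePodAlive bool
	targetPodAlive bool
	// handlerResult is non-nil only if virt-handler managed to record whether
	// the migration succeeded before the pods went away.
	handlerResult *bool
}

// pickAuthoritativeVolume illustrates the race: when both pods are gone and no
// result was recorded, there is no way to tell which volume was updated last.
func pickAuthoritativeVolume(m migrationState) (string, error) {
	switch {
	case m.handlerResult != nil && *m.handlerResult:
		return "target-volume", nil
	case m.handlerResult != nil:
		return "source-volume", nil
	case m.targetPodAlive:
		// The #13112 case: the controller can still observe the running target.
		return "target-volume", nil
	case m.sourcePodAlive:
		return "source-volume", nil
	default:
		return "", errors.New("source and target gone, no recorded result: cannot pick a volume")
	}
}

func main() {
	_, err := pickAuthoritativeVolume(migrationState{})
	fmt.Println(err)
}
```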

>>> @jean-edouard could you please open an upstream issue for https://issues.redhat.com/browse/CNV-50720
>>
>> Wouldn't it be basically the same as #13112 ?
>
> It's similar: even if we fix #13112, we will still have a problem when both the source and the target crash.
>
> In #13112, the controller can observe that the target pod is happily running on the target already, so it can finalize the migration (one possible solution).
>
> In the case when both the source and the target are gone, we don't know how to finalize the migration, because we don't know which volume was updated last.
>
> As it is today, if the source pod is gone without communicating the results to the handler, we cannot tell which volume to choose when the VM starts again.

@vladikr do you think we could solve this if it was also possible to query the migration status from the target libvirt? That way we would have 2 sources for this information, so if one is gone, we can always use the other.

alicefr avatar Nov 19 '24 12:11 alicefr
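A small sketch of the fallback alicefr is suggesting: if migration status can be obtained from either the source or the target libvirt, losing one of them is no longer fatal. The statusSource interface below is a hypothetical stand-in, not an actual KubeVirt or libvirt API.

```go
// A hypothetical stand-in, not a KubeVirt or libvirt API: query migration
// status from the source first and fall back to the target, so the answer
// survives as long as one of the two is still reachable.
package main

import (
	"errors"
	"fmt"
)

type statusSource interface {
	MigrationCompleted() (bool, error)
}

func migrationCompleted(source, target statusSource) (bool, error) {
	if done, err := source.MigrationCompleted(); err == nil {
		return done, nil
	}
	if done, err := target.MigrationCompleted(); err == nil {
		return done, nil
	}
	return false, errors.New("neither source nor target is reachable")
}

// Two toy implementations for the usage example below.
type gone struct{}

func (gone) MigrationCompleted() (bool, error) { return false, errors.New("pod gone") }

type finished struct{}

func (finished) MigrationCompleted() (bool, error) { return true, nil }

func main() {
	done, err := migrationCompleted(gone{}, finished{})
	fmt.Println(done, err) // true <nil>: the target still knows the outcome
}
```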

Just a reminder: we should update the user-guide regarding this.

iholder101 avatar Dec 24 '24 14:12 iholder101