kubevirt icon indicating copy to clipboard operation
kubevirt copied to clipboard

network: Discontinue (and remove) the macvtap network core binding

Open EdDev opened this issue 9 months ago • 1 comments

What this PR does

macvtap [1] network core binding has been marked for deprecation in v1.2 [2] and per the original phase plan it is now discontinued towards v1.3.

Users who would like to continue using the binding are encouraged to use the macvtap network binding plugin [3].

The macvtap network core binding did not reach graduation (GA), therefore its removal does not require any special exception nor commitment to preserve the functionality of existing running VMs. However, in order to ease the upgrade transition for users who are using this network binding, the discontinuation implementation is preserving existing workloads (VMs) running without disruption.

Once the cluster us upgraded, new VMs or migrated VMs will not be able to run with the macvtap network core binding.

Future releases may drop this soft transition and do not commit to keep such old VMs running.

[1] https://kubevirt.io/user-guide/virtual_machines/interfaces_and_networks/#macvtap [2] https://github.com/kubevirt/kubevirt/pull/10924 [3] https://kubevirt.io/user-guide/virtual_machines/net_binding_plugins/macvtap

Fixes #

Why we need it and why it was done in this way

The following tradeoffs were made:

The following alternatives were considered:

Links to places where the discussion took place:

Special notes for your reviewer

Notification for the deprecation and removal has been communicated through the mailing list and v1.2 release, where warnings have been posted to users when macvtap configuration was detected. This final removal is following with the original plan.

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR. Approvers are expected to review this list.

Release note

The 'macvtap' core network binding is discontinued and removed.

EdDev avatar May 13 '24 12:05 EdDev

Commit message "feature-gate, deprecation: Add FG deprecation validation for macvlan" says macvlan, it should be macvtap right?

ormergi avatar May 15 '24 06:05 ormergi

I do not think we can assume the virt-handler reconcile will not run. We have cases where the virt-handler is restarted, when we process a new VMI with a new virt-handler before virt-controller is upgraded. I do not think we should depend on any assumptions about not passing through the network setup more than once, it is unsafe.

Since network-setup run at least once, the macvtap interface is set in the pod and guest. Thus I think there is no need to process macvtap at all, and maybe we can remove mavtap network setup code altogether. Would that work?

The point is that the virt-handler can reconcile old VMIs. The only thing it does with macvtap is to ignore them.

So I think it is safer to make sure virt-handler continues to ignore them (same as it did so far). But in general, whatever it did before, it should continue doing and it should not assume such interfaces cannot appear in the reconciliation code.

The validation is called also for updates, not only for creation.

I though it runs on creation only, could you please refer me to the code that run this validation on VMI update?

You can start following the network validation here, which eventually reaches the FG check here and eventually ends up at the place changed.

The network validation is triggered at both updates and creation for most of the objects.

EdDev avatar May 15 '24 11:05 EdDev

The point is that the virt-handler can reconcile old VMIs. The only thing it does with macvtap is to ignore them.

So I think it is safer to make sure virt-handler continues to ignore them (same as it did so far). But in general, whatever it did before, it should continue doing and it should not assume such interfaces cannot appear in the reconciliation code.

Correct me if I am wrong, but this is why we had the grace period between declaring macvtap deprecated and now discontinue, where in that time it could be used and Kubevirt reconcile these VMs?

You can start following the network validation here, which eventually reaches the FG check here and eventually ends up at the place changed.

The network validation is triggered at both updates and creation for most of the objects.

Thanks for pointing out.

ormergi avatar May 16 '24 07:05 ormergi

Correct me if I am wrong, but this is why we had the grace period between declaring macvtap deprecated and now discontinue, where in that time it could be used and Kubevirt reconcile these VMs?

We warned users that it will be gone, now we do no accept new VMs with it (including the ability to start a stopped one). However, we are trying to keep existing workloads working and not fail them on upgrade.

But it is a good point to consider a new phase in the deprecation, where we can even remove that part. I.e. existing workloads will not continue working. I think we can have that with feature that have not graduated, but it will require some discussion.

For now, I think it is better to be on the safe side and continue as we did with Slirp.

EdDev avatar May 16 '24 10:05 EdDev

We warned users that it will be gone, now we do no accept new VMs with it (including the ability to start a stopped one). However, we are trying to keep existing workloads working and not fail them on upgrade.

But it is a good point to consider a new phase in the deprecation, where we can even remove that part. I.e. existing workloads will not continue working. I think we can have that with feature that have not graduated, but it will require some discussion.

For now, I think it is better to be on the safe side and continue as we did with Slirp.

Existing workload will continue work while Kubevirt will not reconcile the deprecated bindings. From what I understand the network-setup code will not reconcile broken interface binding, because it configure the interface once, is it right?

We announced the deprecation plan [1] including that existing VMs will continue running but wont be able restart [2]. And also provided enough time to prepare for it and got no objection from users, thus I think it fine.

[1] https://groups.google.com/g/kubevirt-dev/c/DoTAGBT8UH4/m/WjArfofqAAAJ?utm_medium=email&utm_source=footer [2] https://github.com/kubevirt/community/pull/249/files

ormergi avatar May 16 '24 11:05 ormergi

I have no objection to postpone removing the binding code, hopefully by next release.

/lgtm

ormergi avatar May 16 '24 14:05 ormergi

/approve

EdDev avatar May 16 '24 14:05 EdDev

@enp0s3 , we have here code which is not under sig-network and I need a root approver to support the changes. Can you please assist?

EdDev avatar May 16 '24 14:05 EdDev

/approve

enp0s3 avatar May 16 '24 17:05 enp0s3

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: EdDev, enp0s3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

kubevirt-bot avatar May 16 '24 17:05 kubevirt-bot

Required labels detected, running phase 2 presubmits: /test pull-kubevirt-e2e-windows2016 /test pull-kubevirt-e2e-kind-sriov /test pull-kubevirt-e2e-k8s-1.30-ipv6-sig-network /test pull-kubevirt-e2e-k8s-1.28-sig-network /test pull-kubevirt-e2e-k8s-1.28-sig-storage /test pull-kubevirt-e2e-k8s-1.28-sig-compute /test pull-kubevirt-e2e-k8s-1.28-sig-operator /test pull-kubevirt-e2e-k8s-1.29-sig-network /test pull-kubevirt-e2e-k8s-1.29-sig-storage /test pull-kubevirt-e2e-k8s-1.29-sig-compute /test pull-kubevirt-e2e-k8s-1.29-sig-operator

kubevirt-commenter-bot avatar May 16 '24 17:05 kubevirt-commenter-bot

/retest-required This bot automatically retries required jobs that failed/flaked on approved PRs. Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

kubevirt-commenter-bot avatar May 17 '24 02:05 kubevirt-commenter-bot