Allow live-migration for pod network in bridge mode
This PR enables the live-migration feature for pod networking in any mode.
The feature is protected by the NetworkAwareLiveMigration feature gate.
The following functionality has been added:
- If the MAC address changed: detach / attach the interface after live migration so the correct MAC address is set
- If the MAC address is unchanged: link down / link up the interface to force the VM to request a new IP address and routes from DHCP (sketched below)
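To illustrate the two branches, here is a minimal Go sketch of the decision logic; all helper names (`detachInterface`, `attachInterface`, `flapLink`) are hypothetical stand-ins, not the identifiers used in this PR.

```go
package main

import (
	"bytes"
	"net"
)

// detachInterface stands in for hot-unplugging the vNIC (e.g. via libvirt);
// illustrative only.
func detachInterface() error { return nil }

// attachInterface stands in for hot-plugging the vNIC back with the new MAC.
func attachInterface() error { return nil }

// flapLink stands in for taking the link down and up so the guest's DHCP
// client re-acquires its lease.
func flapLink() error { return nil }

// reconcileAfterMigration picks the recovery action based on whether the
// target pod's interface kept the MAC address the guest already knows.
func reconcileAfterMigration(sourceMAC, targetMAC net.HardwareAddr) error {
	if !bytes.Equal(sourceMAC, targetMAC) {
		// MAC changed: detach and re-attach the interface so the guest
		// picks up the new address.
		if err := detachInterface(); err != nil {
			return err
		}
		return attachInterface()
	}
	// MAC unchanged: flap the link to force a DHCP renewal of the IP
	// address and routes.
	return flapLink()
}

func main() {
	src, _ := net.ParseMAC("52:54:00:aa:bb:cc")
	dst, _ := net.ParseMAC("52:54:00:dd:ee:ff")
	_ = reconcileAfterMigration(src, dst)
}
```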
The full proposal is described here: https://github.com/kubevirt/community/pull/182
What this PR does / why we need it:
This feature allows live-migrating VMs that use the pod network in bridge and macvtap modes.
At Deckhouse we are developing a virtualization solution based on KubeVirt and Cilium, and we'd like to make live-migration work for the lower-latency modes (macvtap and bridge).
Which issue(s) this PR fixes:
Special notes for your reviewer:
Slack thread: https://kubernetes.slack.com/archives/C8ED7RKFE/p1652296916563589?thread_ts=1650371688.593359&cid=C8ED7RKFE
Release note:
New feature gate NetworkAwareLiveMigration, which allows live-migration for a pod network connected in bridge mode.
Hi @kvaps. Thanks for your PR.
I'm waiting for a kubevirt member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Some more questions:
- for which binding type would this apply to? bridge?...
- did you consider IPAM (when bridge binding is used) in this scenario? Like, inside the VMI, we will have the same IP address, but what about the pod interface? Unless the CNI features some sort of sticky IP, once the lease expires, it will get a new IP - which might break the applications inside the VM. And this means that if the CNI features a sticky IP, you'll have 2 live pods with the same IP. This, iiuc, breaks the kubernetes networking model, where each pod can reach any other pod - i.e. which one would it reach?...
> for which binding type would this apply to? bridge?...

I tested this with bridge and macvtap.
> did you consider IPAM (when bridge binding is used) in this scenario? Like, inside the VMI, we will have the same IP address, but what about the pod interface? Unless the CNI features some sort of sticky IP, once the lease expires, it will get a new IP - which might break the applications inside the VM. And this means that if the CNI features a sticky IP, you'll have 2 live pods with the same IP. This, iiuc, breaks the kubernetes networking model, where each pod can reach any other pod - i.e. which one would it reach?...
Yeah, both scenarios are supported:
- If the target pod has the same IP, then only the routes will be updated.
- If the target pod has a different IP, then a new IP will be assigned.
Both are done by setting the link down / up to force the DHCP client to renew its lease (see the sketch below).
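The link down / up step itself can be sketched with github.com/vishvananda/netlink (a library KubeVirt already vendors); the interface name here is an assumption, and the actual code path in this PR may signal the guest differently.

```go
package main

import (
	"log"

	"github.com/vishvananda/netlink"
)

func main() {
	// Look up the pod-side interface; "eth0" is an assumed name.
	link, err := netlink.LinkByName("eth0")
	if err != nil {
		log.Fatal(err)
	}
	// Dropping the carrier and bringing it back makes the guest's DHCP
	// client re-enter lease acquisition, refreshing both the IP address
	// and the routes.
	if err := netlink.LinkSetDown(link); err != nil {
		log.Fatal(err)
	}
	if err := netlink.LinkSetUp(link); err != nil {
		log.Fatal(err)
	}
}
```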
I tested this with https://github.com/cilium/cilium/pull/19789
@kvaps, this is a very interesting proposal. The topic itself was raised multiple times and got stuck due to the corner cases which k8s brings in.
I think it will be very useful to push this idea forward through a formal proposal.
I am interested to see the use cases where such an option fits better than the existing masquerade, a discussion on when and how to enable migrations in such setups, and the technical details of how such a migration will work from the pod, VM and guest perspective.
On the last point regarding the tech details, I remember the challenge of having the MAC address active at both the source and target pods for the migration period.
Generally speaking, if this can fit a specific configuration (e.g. a specific CNI, guests with/without DHCP, acceptable link-flickering on the vNIC, etc.), I see no reason not to have it in as an option. The main risk and limitation is that future features/changes may break it (i.e. a future enhancement that breaks this migration option may still be accepted in, and you will then not be able to use it).
I hope this can be discussed through a proposal.
/sig network
/cc @AlonaKaplan
/cc @phoracek
Some code-related nits, and requests for unit / e2e tests.
I agree with @EdDev - this would move forward faster if you presented a design proposal of this work.
@kvaps thanks for the PR. I agree with Miguel and Eddy that it would be much easier to continue the discussion over a proposal. Previously @maiqueb suggested two different solutions to the same issue:
- Make the DHCP server advertise very short leases, thus causing the client to re-acquire the lease.
- Using the FORCERENEW DHCP server option to force the client to move to the RENEW state.
It would be nice to explain why unlinking/linking the interface is better than one of those options.
Another question I would love to discuss over the proposal:
- Why is the migration not allowed if the MAC address is not persistent, but allowed when the IP address is not persistent? What is the difference?
Sorry for the long answer.

> Make the DHCP server advertise very short leases, thus causing the client to re-acquire the lease.

This does not work in every distribution. IIRC, some old Ubuntu and Debian releases set the IP lifetime to infinity despite the DHCP server providing a lease for just a short amount of time.
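For context, the short-lease alternative boils down to advertising a small value in DHCP option 51 (IP Address Lease Time, RFC 2132), a 4-byte big-endian seconds count, which the old clients mentioned above simply ignore. A stdlib-only sketch of encoding the option:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// leaseTimeOption encodes DHCP option 51 (IP Address Lease Time) for the
// given duration: option code, value length, then seconds as big-endian uint32.
func leaseTimeOption(d time.Duration) []byte {
	opt := make([]byte, 6)
	opt[0] = 51 // option code: IP Address Lease Time
	opt[1] = 4  // value length in bytes
	binary.BigEndian.PutUint32(opt[2:], uint32(d.Seconds()))
	return opt
}

func main() {
	fmt.Printf("% x\n", leaseTimeOption(30*time.Second)) // prints: 33 04 00 00 00 1e
}
```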
> Using the FORCERENEW DHCP server option to force the client to move to the RENEW state.

This is still just an RFC and is not supported by some distributions, nor by MS Windows, for example.
The link down/up method works perfectly in both cases.

> Why is the migration not allowed if the MAC address is not persistent, but allowed when the IP address is not persistent? What is the difference?

Well, we can force the VM to renew its IP configuration (this is not only about the IP address, but also the routes). But there is no easy way to force the VM to renew its MAC address.
However, I was also thinking about this, and I think we have two options:
- We could detach/attach the interface.
- We could generate a cloud-init config with the MAC address and push it into the VM somehow. But this way it would also lead to a link down/up inside the VM.
I think detaching the link of a running virtual machine can be a destructive action if an application is bound to a specific interface rather than to 0.0.0.0, so we should provide this capability only behind a specific feature flag.
Another option is adding extra logic to compare the old and new MAC addresses, and:
- if they are different: detach and attach a new NIC to the VM
- if they are the same: just set the link down and then up to renew the DHCP lease
This way we will not depend on any CNI-specific annotations.
PR rebased to include the CNI-agnostic solution.
/retest
Would you consider making this a draft PR to prevent CI from being overloaded? I am being spammed with failing test lanes from this PR.
@kvaps ^
Sure, sorry
Is there any option to run integration tests locally?
make functests? You need a running cluster. If it is provided by kubevirtci, you don't need to do anything else.
You can also pass it options to indicate which of the tests to run, or manually edit the code to focus on a particular test / suite / etc.
Sorry for spamming; I fixed my tests locally and am now trying to fix the other ones. Most of them do not work in my environment 😅
/retest-required
/retest
Nice!
While we cannot provide guarantees, this is at least a good step to reflect the fact of the node change to the guest.

> Nice!
> While we cannot provide guarantees, this is at least a good step to reflect the fact of the node change to the guest.

+1, very nice idea!
/retest
/retest
/retest
/test pull-kubevirt-e2e-k8s-1.21-sig-compute-migrations /test pull-kubevirt-e2e-k8s-1.21-operator
@kvaps: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:
- /test pull-kubevirt-apidocs
- /test pull-kubevirt-build
- /test pull-kubevirt-build-arm64
- /test pull-kubevirt-check-unassigned-tests
- /test pull-kubevirt-client-python
- /test pull-kubevirt-e2e-k8s-1.22-operator
- /test pull-kubevirt-e2e-k8s-1.22-sig-compute
- /test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations
- /test pull-kubevirt-e2e-k8s-1.22-sig-monitoring
- /test pull-kubevirt-e2e-k8s-1.22-sig-network
- /test pull-kubevirt-e2e-k8s-1.22-sig-storage
- /test pull-kubevirt-e2e-k8s-1.23-operator
- /test pull-kubevirt-e2e-k8s-1.23-operator-nonroot
- /test pull-kubevirt-e2e-k8s-1.23-sig-compute
- /test pull-kubevirt-e2e-k8s-1.23-sig-compute-cgroupsv2
- /test pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations
- /test pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations-nonroot
- /test pull-kubevirt-e2e-k8s-1.23-sig-compute-nonroot
- /test pull-kubevirt-e2e-k8s-1.23-sig-network
- /test pull-kubevirt-e2e-k8s-1.23-sig-network-nonroot
- /test pull-kubevirt-e2e-k8s-1.23-sig-storage
- /test pull-kubevirt-e2e-k8s-1.23-sig-storage-cgroupsv2
- /test pull-kubevirt-e2e-k8s-1.23-sig-storage-nonroot
- /test pull-kubevirt-e2e-k8s-1.24-ipv6-sig-network
- /test pull-kubevirt-e2e-k8s-1.24-operator
- /test pull-kubevirt-e2e-k8s-1.24-sig-compute
- /test pull-kubevirt-e2e-k8s-1.24-sig-network
- /test pull-kubevirt-e2e-k8s-1.24-sig-storage
- /test pull-kubevirt-e2e-kind-1.22-sriov
- /test pull-kubevirt-e2e-kind-1.23-vgpu
- /test pull-kubevirt-e2e-kind-1.23-vgpu-nonroot
- /test pull-kubevirt-e2e-windows2016
- /test pull-kubevirt-generate
- /test pull-kubevirt-manifests
- /test pull-kubevirt-prom-rules-verify
- /test pull-kubevirt-unit-test
- /test pull-kubevirt-verify-go-mod
- /test pull-kubevirtci-bump-kubevirt
The following commands are available to trigger optional jobs:
- /test build-kubevirt-builder
- /test pull-kubevirt-check-tests-for-flakes
- /test pull-kubevirt-code-lint
- /test pull-kubevirt-e2e-arm64
- /test pull-kubevirt-e2e-k8s-1.22-sig-compute-realtime
- /test pull-kubevirt-e2e-k8s-1.22-sig-performance
- /test pull-kubevirt-e2e-k8s-1.23-single-node
- /test pull-kubevirt-e2e-k8s-1.23-swap-enabled
- /test pull-kubevirt-e2e-k8s-1.24-sig-storage-dv-gc
- /test pull-kubevirt-e2e-kind-1.22-sriov-nonroot
- /test pull-kubevirt-fossa
- /test pull-kubevirt-gosec
- /test pull-kubevirt-goveralls
- /test pull-kubevirt-verify-rpms
Use /test all to run the following jobs that were automatically triggered:
- pull-kubevirt-apidocs
- pull-kubevirt-build
- pull-kubevirt-build-arm64
- pull-kubevirt-check-tests-for-flakes
- pull-kubevirt-check-unassigned-tests
- pull-kubevirt-client-python
- pull-kubevirt-code-lint
- pull-kubevirt-e2e-k8s-1.22-operator
- pull-kubevirt-e2e-k8s-1.22-sig-compute
- pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations
- pull-kubevirt-e2e-k8s-1.22-sig-network
- pull-kubevirt-e2e-k8s-1.22-sig-performance
- pull-kubevirt-e2e-k8s-1.22-sig-storage
- pull-kubevirt-e2e-k8s-1.23-operator
- pull-kubevirt-e2e-k8s-1.23-operator-nonroot
- pull-kubevirt-e2e-k8s-1.23-sig-compute
- pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations
- pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations-nonroot
- pull-kubevirt-e2e-k8s-1.23-sig-compute-nonroot
- pull-kubevirt-e2e-k8s-1.23-sig-network
- pull-kubevirt-e2e-k8s-1.23-sig-network-nonroot
- pull-kubevirt-e2e-k8s-1.23-sig-storage
- pull-kubevirt-e2e-k8s-1.23-sig-storage-nonroot
- pull-kubevirt-e2e-k8s-1.24-ipv6-sig-network
- pull-kubevirt-e2e-k8s-1.24-operator
- pull-kubevirt-e2e-k8s-1.24-sig-compute
- pull-kubevirt-e2e-k8s-1.24-sig-network
- pull-kubevirt-e2e-k8s-1.24-sig-storage
- pull-kubevirt-e2e-kind-1.22-sriov
- pull-kubevirt-e2e-kind-1.23-vgpu
- pull-kubevirt-e2e-kind-1.23-vgpu-nonroot
- pull-kubevirt-e2e-windows2016
- pull-kubevirt-fossa
- pull-kubevirt-generate
- pull-kubevirt-goveralls
- pull-kubevirt-manifests
- pull-kubevirt-prom-rules-verify
- pull-kubevirt-unit-test
In response to this:
> /test pull-kubevirt-e2e-k8s-1.21-sig-compute-migrations /test pull-kubevirt-e2e-k8s-1.21-operator
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
All tests passed 🎉 🎉 🎉
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign enp0s3 for approval by writing /assign @enp0s3 in a comment. For more information see: The Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.
Thanks for your pull request. Before we can look at it, you'll need to add a 'DCO signoff' to your commits.
:memo: Please follow the instructions in the contributing guide to update your commits with the DCO.
Full details of the Developer Certificate of Origin can be found at developercertificate.org.
The list of commits missing DCO signoff:
- 027dc4b Merge pull request #8286 from kubevirt-bot/cherry-pick-8245-to-release-0.56
- 57429a4 Merge pull request #8285 from kubevirt-bot/cherry-pick-8268-to-release-0.56
- b1dbd1b Merge pull request #8303 from kubevirt-bot/cherry-pick-8139-to-release-0.56
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@kvaps: The following tests failed; say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
pull-kubevirt-e2e-k8s-1.21-operator | fb8bb989171e7caaedbd9a06d3e4dcf6fd276207 | link | true | /test pull-kubevirt-e2e-k8s-1.21-operator |
pull-kubevirt-e2e-k8s-1.21-sig-compute-migrations | b8e1f3669fc98c73545a738f324cee2d086c0cea | link | true | /test pull-kubevirt-e2e-k8s-1.21-sig-compute-migrations |
pull-kubevirt-e2e-k8s-1.23-sig-compute-cgroupsv2 | f545ae7c6be40f281ccb00e5a90acc3411625f40 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-compute-cgroupsv2 |
pull-kubevirt-e2e-k8s-1.23-sig-storage-cgroupsv2 | f545ae7c6be40f281ccb00e5a90acc3411625f40 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-storage-cgroupsv2 |
pull-kubevirt-fossa | b7ceec14e12643351a4d00e65f8565d9cc93871e | link | false | /test pull-kubevirt-fossa |
pull-kubevirt-verify-go-mod | 58839bccde5edc39af22035eb868e3ac3bc39e6d | link | true | /test pull-kubevirt-verify-go-mod |
pull-kubevirt-verify-rpms | d9ec4101d84c974941295def91e645f8921f1423 | link | false | /test pull-kubevirt-verify-rpms |
pull-kubevirt-e2e-kind-1.23-vgpu-nonroot | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-kind-1.23-vgpu-nonroot |
pull-kubevirt-e2e-k8s-1.23-sig-compute-nonroot | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-compute-nonroot-0.56 |
pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.22-sig-compute-migrations-0.56 |
pull-kubevirt-check-unassigned-tests | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-check-unassigned-tests-0.56 |
pull-kubevirt-e2e-k8s-1.22-sig-storage | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.22-sig-storage |
pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations-nonroot | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations-nonroot-0.56 |
pull-kubevirt-e2e-windows2016 | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-windows2016 |
pull-kubevirt-e2e-k8s-1.23-sig-network | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-network |
pull-kubevirt-e2e-kind-1.23-vgpu | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-kind-1.23-vgpu-0.56 |
pull-kubevirt-e2e-kind-1.22-sriov | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-kind-1.22-sriov-0.56 |
pull-kubevirt-goveralls | d9ec4101d84c974941295def91e645f8921f1423 | link | false | /test pull-kubevirt-goveralls-0.56 |
pull-kubevirt-e2e-kind-1.22-sriov-nonroot | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-kind-1.22-sriov-nonroot |
pull-kubevirt-e2e-k8s-1.23-sig-network-nonroot | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-network-nonroot |
pull-kubevirt-e2e-k8s-1.23-sig-storage-nonroot | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-storage-nonroot |
pull-kubevirt-e2e-k8s-1.24-sig-storage-dv-gc | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.24-sig-storage-dv-gc |
pull-kubevirt-check-tests-for-flakes | d9ec4101d84c974941295def91e645f8921f1423 | link | false | /test pull-kubevirt-check-tests-for-flakes |
pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-compute-migrations |
pull-kubevirt-e2e-k8s-1.24-sig-compute-migrations-nonroot | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.24-sig-compute-migrations-nonroot |
pull-kubevirt-e2e-k8s-1.24-sig-network | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.24-sig-network |
pull-kubevirt-e2e-k8s-1.23-sig-compute | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-compute |
pull-kubevirt-e2e-k8s-1.22-sig-network | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.22-sig-network |
pull-kubevirt-e2e-k8s-1.24-sig-storage | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.24-sig-storage |
pull-kubevirt-e2e-k8s-1.24-sig-compute | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.24-sig-compute |
pull-kubevirt-e2e-k8s-1.24-ipv6-sig-network | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.24-ipv6-sig-network |
pull-kubevirt-e2e-k8s-1.22-sig-compute | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.22-sig-compute |
pull-kubevirt-e2e-k8s-1.24-sig-compute-migrations | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.24-sig-compute-migrations |
pull-kubevirt-e2e-k8s-1.23-operator | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-operator |
pull-kubevirt-e2e-k8s-1.23-sig-storage | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-sig-storage |
pull-kubevirt-e2e-k8s-1.24-operator | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.24-operator |
pull-kubevirt-e2e-k8s-1.23-operator-nonroot | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.23-operator-nonroot |
pull-kubevirt-e2e-k8s-1.22-operator | d9ec4101d84c974941295def91e645f8921f1423 | link | true | /test pull-kubevirt-e2e-k8s-1.22-operator |
pull-kubevirt-e2e-k8s-1.22-sig-performance | d9ec4101d84c974941295def91e645f8921f1423 | link | false | /test pull-kubevirt-e2e-k8s-1.22-sig-performance |
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign enp0s3 for approval by writing /assign @enp0s3 in a comment. For more information see: The Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
remove-lifecycle rotten
/remove-lifecycle rotten
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign dhiller for approval by writing /assign @dhiller in a comment. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.
PR rebased on v0.59
old versions:
- v0.57 - 562c8bccc88725f5a015baa06e92750665440cca
- v0.58 - 1a161d3430064e25e6e1d31dc2aca2ae24fb9463