
Live migration of a VM with a static IP reports an IP conflict

Open Liamlu28 opened this issue 4 years ago • 9 comments

What happened: kube-ovn supports static IPs for VMs, but there is a problem during VM live migration: if the same IP is bound to the migration target pod, an IP conflict is reported because the two pods have the same IP.

Liamlu28 avatar Jan 28 '21 02:01 Liamlu28

xref https://github.com/kubevirt/kubevirt/issues/4910

halfcrazy avatar Jan 28 '21 02:01 halfcrazy

I propose introducing a new annotation, such as ovn.kubernetes.io/ip-taken: true, for fixed-IP scenarios.

With this annotation, if a pod requesting the same IP is started, the old pod's port is removed from OVN immediately, even if the old pod has not begun terminating, and the new pod can start with the same IP. Note: the old pod loses network connectivity immediately. VM live migration and pod rolling upgrades can benefit from this (no extra IP or maxUnavailable 1 is needed), provided the user is aware of the risks.
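A minimal sketch of the takeover rule described above, using a toy in-memory table rather than a real OVN northbound database. Only the annotation name comes from the proposal; `Pod`, `PortTable`, and `allocate` are hypothetical names for illustration and are not part of kube-ovn.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Annotation name from the proposal; everything else here is hypothetical.
IP_TAKEN_ANNOTATION = "ovn.kubernetes.io/ip-taken"


@dataclass
class Pod:
    name: str
    ip: str
    annotations: Dict[str, str] = field(default_factory=dict)


class PortTable:
    """Toy stand-in for the logical switch ports held in OVN."""

    def __init__(self) -> None:
        # IP address -> name of the pod whose port currently holds it
        self.owner_by_ip: Dict[str, str] = {}

    def allocate(self, pod: Pod) -> Optional[str]:
        """Bind pod.ip to pod and return the name of the evicted pod, if any."""
        holder = self.owner_by_ip.get(pod.ip)
        if holder is None or holder == pod.name:
            self.owner_by_ip[pod.ip] = pod.name
            return None
        if pod.annotations.get(IP_TAKEN_ANNOTATION) != "true":
            # Behaviour without the annotation: the request is rejected.
            raise ValueError(f"IP {pod.ip} conflicts with pod {holder}")
        # Takeover: the old port is dropped at once, so the old pod
        # loses network connectivity from this point on.
        self.owner_by_ip[pod.ip] = pod.name
        return holder
```

In this model a migration target pod carrying the annotation takes the IP from the source pod, while a pod without it still gets the conflict error.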

What do you think? @oilbeater

halfcrazy avatar Jan 29 '21 17:01 halfcrazy

There are a few things to take care of:

  1. The pod IP in Kubernetes is detected by kubelet, so even if we remove the old pod's port from OVN, kubelet will still report the old IP for the old pod. Running kubectl get pod will show two pods with the same IP address, although the old one no longer actually holds the IP.
  2. Removing a port from ovn-nb is easy, but removing the port from the OVS bridge is harder, because we need to call ovs-vsctl on the pod's host.
  3. I'm not sure whether an LSP can be renamed. If it can, we could keep the LSP, rename it for the new pod, and then re-register it with the new pod's OVS port.

halfcrazy avatar Jan 29 '21 17:01 halfcrazy

  • Removing a port from ovn-nb is easy, but removing the port from the OVS bridge is harder, because we need to call ovs-vsctl on the pod's host.

When the port is removed from ovn-nb, the flows are updated and the old OVS port stops working. We can leave it to the CNI DEL event to clean up the port automatically.

  • I'm not sure whether an LSP can be renamed. If it can, we could keep the LSP, rename it for the new pod, and then re-register it with the new pod's OVS port.

For consistency, I prefer to use a new LSP instead of renaming the existing one.

I'm not sure that the moment the migration target pod starts is the best time to stop the old VM's traffic. However, it does allow the migration to continue. What do you think, @ironxyz?

oilbeater avatar Feb 01 '21 02:02 oilbeater

I think the LSP of the source pod should be detached first; when the target pod starts, the LSP is attached to the target pod. There will be brief packet loss, but that is acceptable. @oilbeater Some reference links: https://docs.openstack.org/neutron/victoria/contributor/internals/live_migration.html https://blog.csdn.net/jmilk/article/details/88561801 (Chinese)

Liamlu28 avatar Feb 01 '21 10:02 Liamlu28

https://bugzilla.redhat.com/show_bug.cgi?id=1369362

Some information about VM migration with OVN: it removes the iface-id from the old OVS port and sets it on the new port, which reduces network downtime. However, OpenStack uses the VM ID as the LSP name, which is identical for the old and new VM instances, whereas kube-ovn uses the pod name as the LSP name, which differs between the old and new pods. So we cannot easily transfer the iface-id from one port to another.
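As a rough illustration of the iface-id transfer, the sketch below only builds the two ovs-vsctl invocations and does not run them. The function name, the interface names, and the LSP name are hypothetical; in a live migration the two commands would typically run on different nodes (source and target), which this sketch does not model.

```python
from typing import List


def iface_id_transfer_commands(old_iface: str, new_iface: str, lsp_name: str) -> List[List[str]]:
    """Return the ovs-vsctl commands that move an LSP binding between interfaces.

    The first command clears external_ids:iface-id on the old interface
    (to be run on the source node); the second sets it on the new interface
    (to be run on the target node). Nothing is executed here.
    """
    return [
        ["ovs-vsctl", "remove", "Interface", old_iface, "external_ids", "iface-id"],
        ["ovs-vsctl", "set", "Interface", new_iface, f"external_ids:iface-id={lsp_name}"],
    ]
```

This only works when both interfaces can be bound to the same LSP name, which is the point made above: it holds when the LSP is named after a VM ID shared by source and target, but not when it is named after the pod.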

oilbeater avatar Feb 01 '21 10:02 oilbeater

@ironxyz Does KubeVirt emit events at each stage of a live migration? I would still like to find a point at which to reduce network downtime, for example by doing the port transfer in the post-live-migration stage.

oilbeater avatar Feb 01 '21 10:02 oilbeater

KubeVirt implementation: https://github.com/kubevirt/kubevirt/pull/1550

halfcrazy avatar Feb 04 '21 02:02 halfcrazy

I think it has been implemented in https://github.com/kubeovn/kube-ovn/pull/1001 and https://github.com/kubeovn/kube-ovn/pull/1163

chestack avatar Mar 01 '22 13:03 chestack