Ease Butterfly upgrade
Currently, the way to upgrade a Butterfly is to:
- stop old Butterfly
- start new Butterfly
- push all configuration to the new Butterfly
- connect to the QEMU monitors and recreate each vnic
Here are some possible scenarios to ease upgrade:
1. Easy scenario
- query a configuration dump from the old Butterfly
- stop old Butterfly
- start new Butterfly
- load configuration dump
- connect to the QEMU monitors and recreate each vnic
Pros:
- Simple procedure => fewer bugs
- If the new Butterfly crashes, the old one can be re-launched using the configuration dump
Cons:
- Network downtime during migration
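The dump and reload steps of this scenario can be sketched as a plain serialization round trip. This is only a minimal illustration: the `Nic` and `ButterflyState` classes, the field names, and the JSON layout are all hypothetical and do not describe Butterfly's actual state or dump format.

```python
import json
from dataclasses import dataclass, field, asdict


# Hypothetical in-memory model; Butterfly's real state and dump format may differ.
@dataclass
class Nic:
    id: str
    ip: str
    security_groups: list = field(default_factory=list)


@dataclass
class ButterflyState:
    nics: dict = field(default_factory=dict)             # nic id -> Nic
    security_groups: dict = field(default_factory=dict)  # group name -> list of rules


def dump_config(state: ButterflyState) -> str:
    """Serialize the whole state to JSON (the dump queried from the old instance)."""
    return json.dumps(asdict(state), sort_keys=True)


def load_config(dump: str) -> ButterflyState:
    """Rebuild the state from a dump (the load step on the new instance)."""
    raw = json.loads(dump)
    nics = {nic_id: Nic(**nic) for nic_id, nic in raw["nics"].items()}
    return ButterflyState(nics=nics, security_groups=raw["security_groups"])
```

Because the dump is a self-contained string, the same value can be kept around and fed back to the old version if the new one crashes, which is the fallback listed in the pros above.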
2. Easy scenario with qemu as a server
- query a configuration dump from the old Butterfly
- stop old Butterfly
- start new Butterfly
- load configuration dump
- Butterfly reconnects to the existing vnics managed by the QEMU processes
Pros:
- simple procedure
- no need to connect to the QEMU console and recreate vnics
- if the new Butterfly crashes, the old one can be re-launched using the configuration dump
Cons:
- network downtime during migration
- requires inverting the vhost-user client and server roles (QEMU becomes the vhost-user server)
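With the roles inverted, the sockets belong to QEMU and survive a Butterfly restart, so the new Butterfly only has to connect again. The sketch below illustrates that client-side reconnection with ordinary Unix stream sockets. It does not implement the vhost-user protocol, and `connect_with_retry`, `reconnect_all`, and the idea of taking socket paths from the configuration dump are assumptions made for illustration.

```python
import socket
import time


def connect_with_retry(path, attempts=5, delay=0.1):
    """Connect to a Unix socket owned by the server side (QEMU in this scenario).

    Returns a connected socket, or raises ConnectionError after `attempts` tries.
    """
    last_error = None
    for _ in range(attempts):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock
        except OSError as exc:  # socket file missing or nobody listening yet
            sock.close()
            last_error = exc
            time.sleep(delay)
    raise ConnectionError(f"could not connect to {path}") from last_error


def reconnect_all(socket_paths):
    """Reconnect to every vnic socket; `socket_paths` maps nic id -> socket path."""
    return {nic_id: connect_with_retry(path) for nic_id, path in socket_paths.items()}
```

The retry loop is only there to tolerate a server that is briefly unavailable; the attempt count and delay are arbitrary values.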
3. Stream configuration
- launch the new Butterfly and give it the location of the old Butterfly
- the new Butterfly uses another available DPDK port
- the new Butterfly asks the old one to switch to Read Only mode for 1 minute (no configuration change can be made during that time)
- the new Butterfly queries the old one, gets all its security groups and applies them
- for each vnic, the new Butterfly will:
  - get the vnic configuration from the old one
  - ask the old Butterfly to disconnect from this particular vhost-user socket
  - create the nic and connect
  - send a Read Only mode query to the old Butterfly (resets the counter)
- If all vnics succeed:
  - the old Butterfly is shut down
  - the new Butterfly starts listening on the API port
- If one of the vnics fails (or the new Butterfly crashes):
  - the migration fails and the new Butterfly shuts itself down
  - on the old Butterfly, the Read Only query times out
  - all disabled nics of the old Butterfly reconnect to vhost-user, and the DPDK port is re-enabled
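The fallback in this scenario depends on the Read Only mode behaving like a lease: it expires by itself unless the new Butterfly keeps renewing it. The sketch below models that idea under stated assumptions. `ReadOnlyLease`, `OldButterfly`, `migrate`, and the `attach` callback are hypothetical names, the lease is assumed to be renewed once per vnic, and the clock is injectable so the timeout can be exercised without waiting.

```python
import time


class ReadOnlyLease:
    """Read Only mode that expires unless renewed (hypothetical model)."""

    def __init__(self, duration=60.0, clock=time.monotonic):
        self.duration = duration
        self.clock = clock
        self.expires_at = None

    def renew(self):
        # Each Read Only query restarts the countdown.
        self.expires_at = self.clock() + self.duration

    def active(self):
        return self.expires_at is not None and self.clock() < self.expires_at


class OldButterfly:
    def __init__(self, nics, clock=time.monotonic):
        self.nics = dict(nics)   # nic id -> configuration
        self.disabled = set()    # nics currently disconnected from vhost-user
        self.lease = ReadOnlyLease(clock=clock)

    def release_nic(self, nic_id):
        """Disconnect one nic and hand its configuration to the caller."""
        self.disabled.add(nic_id)
        return self.nics[nic_id]

    def tick(self):
        """Periodic check: once the lease has timed out, take every nic back."""
        if self.disabled and not self.lease.active():
            self.disabled.clear()


def migrate(old, attach):
    """Move nics one by one; `attach(nic_id, config)` raises on failure."""
    old.lease.renew()
    for nic_id in list(old.nics):
        config = old.release_nic(nic_id)
        attach(nic_id, config)  # on error the exception propagates and migration stops
        old.lease.renew()       # reset the Read Only counter
    return True
```

If `attach` raises or the new process dies, nothing renews the lease any more, so the next `tick()` after the timeout restores the old Butterfly's nics without any explicit rollback message.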
Pros:
- vnic network downtime is minimal IF
- no operation to make on qemu nics
- fallback in case of error/crash of the new Butterfly
Cons:
- complex and riskier: every possible error case must be handled
- heavy modifications
- requires inverting the vhost-user client and server roles (QEMU becomes the vhost-user server)
- two Butterfly instances must run at the same time: we may have to fix some problems related to hugepage usage and the DPDK lock
We will go for solution 2. Check #275 and #276