multipass
multipass copied to clipboard
Multipass clone
Please read the spec first to get a sense of the context.
The code can be divided into several parts:
- client module, the clone command line part.
-
daemon::clone
method indaemon.cpp
, 2.1. generate destination name, this step processes the name options of the input, use the given name or generate a new one if no name is provided. This step also needs the name list of instances to check if the name exist, it also requires the source specVMSpecs::clone_count
field to generate a new name. 2.2. Once the new destination name is produced, we can clone the vmspec object, this step requires copying the vmspec instance and updating theVMSpecs::default_mac_address
andVMSpecs::extra_interfaces
fields. Ideally, this method can be a data member method (calledclone
maybe) inVMSpecs
class. However, given the fact that the dependency code (generate_unused_mac_address
method andallocated_mac_addrs
data) still resides indaemon.cpp
, soclone_spec
is a local lambda method indaemon.cpp
for now.
2.3. clone image record through theVMImageVault::clone
method. 2.4. callscreate_vm_and_clone_instance_dir_data
fromVirtualMachineFactory
. -
remove_all_snapshots_from_the_image
function is added toVirtualMachine
class, it gets called in the factory class and it removes the snapshots from the image file. -
qemu_virtual_machine_factory::create_vm_and_clone_instance_dir_data
, it copies the whole instance folder with the exclusion of snapshot files and updates the unique identifiers incloud-init-config.iso
file. Besides that, it creates the newVirtualMachine
instance and also removes the snapshots from the image file. - auto-completion script of
multipass clone
command line.
Functional testing should cover at least the following scenarios:
- multipass clone command line part, --help, and options, explanation of these options. Error message in different cases.
- Animated spinner works properly in a successful clone case.
- In the case of cloning without a designated name, check if the name generator works right.
- In the case of cloning a general vm instance that has mounts, and snapshots, check if the new instance can run properly, check whether the new instance disk data have the updated unique identifiers.
Some further thoughts on the code architecture (open for discussion):
- Using
VMSpecs::clone_count
to count cloned instances and theBaseVirtualMachine::snapshot_count
to count snapshots is cumbersome. We need a variable and a file or json item to track the value. Besides that, the current name-generation scheme also does not persist on trying. What I mean by that is if the next about-to-be-generated name already exists (if the user already created the snapshot/cloned instance by specifying that name), then the auto name-generation will fail. To fix these things, we can try a different name-generation scheme, that is always looping over the existing instances/snapshots to seek the next available name, this scheme will have to do a linear search but it no longer depends on the counter variable/file and will persist on name generation. The corresponding name-generation behavior will change when there is a gap in the integer sequence (when user deletes the snapshots/instances in between) , the new scheme will reuse that deleted name and the counter based scheme will not. I think this approach is better overall, but it requires some changes in already released behavior. - In the
multipassd-vm-instances.json
file, themetadata::arguments
contains the image and clou-init file absolute path, themultipassd-instance-image-records.json
file also has the path item points to the image file path. These paths in a way added more unnecessary instance name occurrences (the instance folder name in the path) which need to be updated during the clone and renaming process. Then maybe a better disk data layout and C++ code structure can remove these paths. For example, if we split these json files into one item per file and move that into the instance directory, and meanwhile we can moveVMSpecs
intoVirtualMachine
class. With this new data layout and code structure, theVMSpecs
can access the instance directory and further access the multipassd-vm-instances.json file (with his own item), besides that, the data sync between theVMSpecs
andVirtualMachine
class becomes much easier. On top of these, there are more benefits like the rollback mechanism and thread-safety can be much easier. - Make a global
Mac_address_generator
class that can be accessed from everywhere and refactor the corresponding code. If this can be done, then theclone_spec
does not have to be a local lambda indaemon.cpp
.
Codecov Report
Attention: Patch coverage is 92.30769%
with 21 lines
in your changes missing coverage. Please review.
Project coverage is 88.93%. Comparing base (
b698daa
) to head (53c0fd0
). Report is 2 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #3264 +/- ##
==========================================
+ Coverage 88.88% 88.93% +0.04%
==========================================
Files 254 256 +2
Lines 14225 14457 +232
==========================================
+ Hits 12644 12857 +213
- Misses 1581 1600 +19
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@sharder996 Regarding the repeated ip addresses issue on virtualbox extra interface. There are some new commits about that which is an overhaul of the cloud-init network file data.
In order to review that, there are the context and the solution of the problem you have to know about. The culprit is that dhcp packet payload has the Client MAC address
field and option 61: Client identifier
field. The situation before was that all dhcp packets from all backends have this option 61: Client identifier
field as IAID and DUID pair. Thus, after cloning, we have the same option 61 and different mac addresses in the dhcp packet. Different dhcp servers have different logic on which field it looks into. So as a result, the qemu cloned instance gets a different ip address on linux and macos despite the same IAID and DUID pair, because the dhcp server looks into the Client MAC address
field. Hyperv worked because the import and export commands tweak the IAID and the dhcp server looks into the option 61. virtualbox shared the same dhcp server as the hyperv in my environment and it has the same IAID and DUID pair in option 61. It eventually leads to the same ip addresses.
That was about the context of the problem, the solution we came up with is to enforce the option 61 always to be a mac address as opposed to being an IAID and DUID pair. In this case, no matter which field the dhcp server looks into, he will always get a different unique identifier after cloning and, furthermore, a different ip address.
In order to achieve that, we need to modify the cloud-init network file. Every network interface should have the dhcp-identifier
field explicitly set to mac
so option 61 can be changed to mac address. Besides this, the original cloud-init format has the network-config file empty when there is only the default interface. So the default interface should be always present in order to have the dhcp-identifier
field settable by us.
Therefore, we have the new cloud-init format, that is the network file always has the default interface, and always has the dhcp-identifier field with mac value on all network interfaces. The new format applies to any newly launched or cloned instances. The recent code changes are about generating the new format and keeping the backward compatibility.
Ok, still a few outstanding things, but I'm pretty much ready to give my approval. I think @ricab wanted to also give this a once over as well.
I have one other design related comment which I'm not sure it should go, so I'm going to put it here. I'm not expecting it to be implemented here as it would probably require quite a bit of work/redesign, but it would be nice if the code in the daemon could be simplified. IMO, the daemon should be more agnostic to the underlying components of creating a vm (ie, factories, vm vaults, vms, etc) and should be able to simply clone an instance (eg. factory->clone(vm)
). I believe some effort has been made to conform to this design idea, but fully adopting it would be part of a entire daemon rewrite. Again, I'm not expecting it to implemented, just wanted to record my thoughts somewhere where they hopefully won't be lost. :)
Looking better! I think it's almost there.
I'm not sure if I misunderstood, but I though we were going to do away with
string_view
. I saw some of them got removed, but I'm still seeing some others lingering around.
The impression I got was that we want to discuss the usage of std::string_view
as a team and decide what to do. However, we never had the full team to discuss about it (either Andrei or Ricardo is away), so I kind of waited. But now if I think about it again, the usage in tests/test_cloud_init_iso.cpp
is the usage we can definitely ditch. Thus, I can do that without waiting for the team discussion.
I have one other design related comment which I'm not sure it should go, so I'm going to put it here. I'm not expecting it to be implemented here as it would probably require quite a bit of work/redesign, but it would be nice if the code in the daemon could be simplified. IMO, the daemon should be more agnostic to the underlying components of creating a vm (ie, factories, vm vaults, vms, etc) and should be able to simply clone an instance (eg.
factory->clone(vm)
). I believe some effort has been made to conform to this design idea, but fully adopting it would be part of a entire daemon rewrite. Again, I'm not expecting it to implemented, just wanted to record my thoughts somewhere where they hopefully won't be lost. :)
factory->clone(vm)
is better than what it is now. However, the idiomatic solution for this problem normally in the industry is prototype pattern (virtual copy constructor). That would make the function call dest_vm = src_vm->clone();
without attachment to the factory, the VM data type here is VirtualMachine::ShPtr
. The precondition here is that all VM related dada are already moved into VirtualMachine
class, including vmspecs, mount, image record and so on.
By the way, I still have not received the feedback on the 3 items design discussion. It would be nice to know what you guys think.
In response to your architecture discussion points:
For No.1, I think the current name generation pattern is fine. No need to over complicate at this point as it isn't crucial towards the functionality of the feature.
For No.2, this looks like an extension of the already discussed refactor of the VMSpec
, data structures in the Daemon
, etc. I'm thinking that refactor is going to touch a lot of files and might need its own planning/technical spec. Regardless, I think it's outside the scope of this feature.
For No.3, cloning a VMSpec
sounds like something that the VMSpec
should be able to do. Again, this quickly spirals into having to change a bunch of other things as well; make VMSpec
a class, maybe combine other data members of the VirtualMachine
with the VMSpec
, create a singleton MAC address generator, etc. Again, not something that I think is high priority at this point and could be made to be a part of refactoring VMSpec
and the daemon as mentioned in the previous point.