cloudstack
Support for new Feature: Clone a Virtual Machine (#4818)
Description/Report
This PR adds a Clone VM feature at the API level (as part of GSoC project #4818). It creates a full clone of a virtual machine, including its ROOT and DATA disks, with the same system configuration as the original VM. (Currently supported for the KVM hypervisor only.)
Steps involved:
1. Creation of temporary snapshots (during the clone VM operation) for both ROOT
2. Creation of a template from the ROOT disk snapshot
3. Creation of a new VM from the template created in step 2
4. Automatic assignment of new network resources to the cloned VM
5. Creation of temporary snapshots of the data disks
6. Creation of data disk volumes from the snapshots created in step 5
7. Attachment of the DATA disk volumes created in step 6 to the new VM
8. Cleanup of temporary resources (snapshots) and error handling for the clone VM process
Note: The template created in step 2 cannot be cleaned up as the newly created clone VM uses this template
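The ordering of the steps above can be sketched as follows. This is a minimal in-memory illustration in Python, not the Java implementation in this PR: `FakeCloud`, `clone_vm`, and every method name are hypothetical stand-ins. It only aims to show that temporary snapshots are removed whether or not the clone succeeds, that the template is kept, and that a partially created VM is rolled back on failure.

```python
import itertools


class FakeCloud:
    """Hypothetical in-memory stand-in for the management server."""

    def __init__(self, fail_on=None):
        self._ids = itertools.count(1)
        self.snapshots = set()
        self.templates = set()
        self.volumes = set()
        self.vms = {}
        self.fail_on = fail_on  # name of a step that should raise (for demos)

    def _step(self, name):
        if name == self.fail_on:
            raise RuntimeError(f"step failed: {name}")
        return f"{name}-{next(self._ids)}"

    def snapshot(self, volume):
        snap = self._step("snapshot")
        self.snapshots.add(snap)
        return snap

    def template_from(self, snap):
        tmpl = self._step("template")
        self.templates.add(tmpl)
        return tmpl

    def deploy(self, tmpl):
        vm = self._step("vm")
        self.vms[vm] = {"template": tmpl, "data_disks": []}
        return vm

    def volume_from(self, snap):
        vol = self._step("volume")
        self.volumes.add(vol)
        return vol

    def attach(self, vm, vol):
        self.vms[vm]["data_disks"].append(vol)

    def destroy(self, vm):
        for vol in self.vms.pop(vm)["data_disks"]:
            self.volumes.discard(vol)

    def delete_snapshot(self, snap):
        self.snapshots.discard(snap)


def clone_vm(cloud, root_volume, data_volumes):
    temp_snaps = []
    vm = None
    try:
        root_snap = cloud.snapshot(root_volume)   # step 1
        temp_snaps.append(root_snap)
        tmpl = cloud.template_from(root_snap)     # step 2
        vm = cloud.deploy(tmpl)                   # steps 3 and 4
        for vol in data_volumes:
            snap = cloud.snapshot(vol)            # step 5
            temp_snaps.append(snap)
            new_vol = cloud.volume_from(snap)     # step 6
            cloud.attach(vm, new_vol)             # step 7
        return vm
    except Exception:
        if vm is not None:
            cloud.destroy(vm)                     # roll back the partial clone
        raise
    finally:
        for snap in temp_snaps:                   # step 8: always clean up
            cloud.delete_snapshot(snap)
```

Calling `clone_vm(FakeCloud(), "root", ["data-1"])` returns the new VM id and leaves no snapshots behind; constructing `FakeCloud(fail_on="volume")` simulates a failure in step 6.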
Features included:
- A new CloneVmCmd API interface to use this feature
- A clone button in the compute instance page supporting the related functionalities
Documentation
- Documentation PR: https://github.com/apache/cloudstack-documentation/pull/236
Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
- [x] Major
- [ ] Minor
Bug Severity
- [ ] BLOCKER
- [ ] Critical
- [ ] Major
- [ ] Minor
- [ ] Trivial
Screenshots (if appropriate):
How Has This Been Tested?
This has been manually tested with an mbx KVM setup on a local machine and an mbx KVM setup on GCP.
- This feature has been tested on a local Linux system with KVM support (5.3.0-64-generic Ubuntu) and Google Cloud instance (4.9.0-15-amd64 Debian)
- With the management server running with its default configuration, try the feature from cloudmonkey:
cloneVirtualMachine virtualmachineid=<target_vm_id>
This starts the cloning process: it creates a new cloned VM and starts it, with all copied data available. The network IP is recorded in the DB immediately, and the VM itself picks up this IP a little later.
- When secondary system VM agents are not available, the cloning process fails and cleans up the resources it has already cloned.
- It copies all data disk content from the target VM, regardless of whether the data disks are mounted inside the VM or whether the VM is running.
- Temporary resources (snapshots) created during the process are removed whether the clone succeeds or not.
- A new smoke test, class TestCloneVM, has been added to test_vm_life_cycle.py; it tests cloning a VM with a data disk attached.
- A new unit test, validateCloneCondition, has been added in the UserVmManagerImpl class.
- Manual test of the clone: the VM page has a clone VM button; clicking it clones a new VM, provided the hypervisor setup is correct.
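For callers that do not use cloudmonkey, the same command can in principle be sent as a signed HTTP request. The sketch below builds such a URL using CloudStack's usual request signing (sorted parameters, lower-cased query string, HMAC-SHA1, Base64). The function name `signed_clone_url`, the endpoint, and the key values are illustrative assumptions; only `cloneVirtualMachine` and `virtualmachineid` come from this PR.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_clone_url(endpoint, api_key, secret_key, vm_id):
    """Build a signed cloneVirtualMachine URL (illustrative helper)."""
    params = {
        "command": "cloneVirtualMachine",
        "virtualmachineid": vm_id,
        "response": "json",
        "apikey": api_key,
    }
    # Sort by parameter name and URL-encode each value.
    query = "&".join(
        f"{key}={quote(str(value), safe='')}"
        for key, value in sorted(params.items())
    )
    # The signature is computed over the lower-cased query string.
    digest = hmac.new(
        secret_key.encode("utf-8"),
        query.lower().encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return f"{endpoint}?{query}&signature={signature}"


if __name__ == "__main__":
    # Assumed endpoint and placeholder credentials; replace with real values.
    print(signed_clone_url(
        "http://localhost:8080/client/api", "API_KEY", "SECRET_KEY", "<target_vm_id>"
    ))
```

The URL is only constructed here, not sent, so the sketch can be inspected without a running management server.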
GCP environment setup
- Create a GCP instance with nested virtualization enabled, for detailed requirements see: https://cloud.google.com/compute/docs/instances/nested-virtualization/overview
- After creating the instance, deploy a KVM host using mbx: https://github.com/shapeblue/mbx (centos7)
- Before deploying and configuring the CloudStack agent, enable nested KVM on the system:
cat /sys/module/kvm_intel/parameters/nested
# if this prints N, reload the module with nested virtualization enabled
sudo modprobe -r kvm_intel
sudo modprobe kvm_intel nested=1
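The check in the commands above can also be scripted before the agent is installed. This is a small sketch, assuming an Intel host (the `kvm_intel` module, as in the commands above); the helper name `nested_kvm_enabled` is hypothetical, and accepting `1` as well as `Y` is an assumption to cover kernels that report the flag numerically.

```python
from pathlib import Path


def nested_kvm_enabled(param_file="/sys/module/kvm_intel/parameters/nested"):
    """Return True if the module parameter file reports nested KVM as on."""
    path = Path(param_file)
    if not path.exists():
        # Module not loaded (or not an Intel host): treat as not enabled.
        return False
    return path.read_text().strip() in {"Y", "y", "1"}


if __name__ == "__main__":
    if nested_kvm_enabled():
        print("nested KVM is enabled")
    else:
        print("nested KVM is off; reload kvm_intel with nested=1")
```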
@blueorangutan package
@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian. SL-JID 913
@blueorangutan test
@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
Trillian test result (tid-1703)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 34430 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5216-t1703-kvm-centos7.zip
Smoke tests completed. 89 look OK, 0 have error(s)
Only failed tests results shown below:
Test | Result | Time (s) | Test File
--- | --- | --- | ---
@atrocitytheme I like the premise and most of your implementation. Two worries:
- Your methods in SnapshotManagerImpl and in UserVmManagerImpl are rather long. Can you please extract sensible parts from them (and maybe re-use them at times)? A good indicator: if there is a comment, it can probably be used as a camelCase method name, with the code below it moved into the new method.
- Will the usage server pick up cloned machines for billing? (I don't think it will, but I might be missing something.)
Thanks for this PR, good addition.
@DaanHoogland Hi, thanks for the feedback. 1. Sure, I can extract and refactor these implementations. 2. It doesn't pick up for billing. May add this in the future commits
ad 1. Thanks. ad 2. OK, then we need to hide it behind a global setting that will enable the operator to disable the feature.
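One way to picture the requested gate is a check that runs before any clone work starts. The sketch below is illustrative Python, not CloudStack code: the setting name `vm.clone.enabled`, the `FeatureDisabledError` class, and the choice to default to disabled are all assumptions, since the thread does not specify them.

```python
class FeatureDisabledError(Exception):
    """Raised when the operator has switched the clone feature off."""


# Hypothetical setting name; the thread does not define one.
CLONE_SETTING = "vm.clone.enabled"


def ensure_clone_enabled(global_settings, default=False):
    """Raise unless the (assumed) global setting enables VM cloning."""
    raw = global_settings.get(CLONE_SETTING)
    enabled = default if raw is None else str(raw).strip().lower() == "true"
    if not enabled:
        raise FeatureDisabledError(f"{CLONE_SETTING} is not enabled")


if __name__ == "__main__":
    ensure_clone_enabled({"vm.clone.enabled": "true"})  # passes silently
    try:
        ensure_clone_enabled({})  # setting absent, default is disabled
    except FeatureDisabledError as err:
        print(err)
```

Defaulting to disabled is a conservative choice here because cloned VMs are not yet picked up for billing; an operator would opt in explicitly.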
@blueorangutan package
@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_check_mark: el7 :heavy_multiplication_x: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 975
@blueorangutan package
@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 977
@blueorangutan test
@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
Trillian test result (tid-1753)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 35551 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5216-t1753-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_vm_life_cycle.py
Smoke tests completed. 88 look OK, 1 have error(s)
Only failed tests results shown below:
Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_clone_vm_and_volumes | Error | 100.64 | test_vm_life_cycle.py
@atrocitytheme you have conflicts. Can you have a look?
Sure
@blueorangutan package
@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_multiplication_x: el7 :heavy_multiplication_x: el8 :heavy_multiplication_x: debian :heavy_multiplication_x: suse15. SL-JID 1687
@blueorangutan package
@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 1688
@blueorangutan ui
@sureshanaparti a Jenkins job has been kicked to build UI QA env. I'll keep you posted as I make progress.
UI build: :heavy_check_mark: Live QA URL: http://qa.cloudstack.cloud:8080/client/pr/5216 (SL-JID-813)
Trillian test result (tid-2549)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33154 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5216-t2549-kvm-centos7.zip
Smoke tests completed. 90 look OK, 1 have errors
Only failed tests results shown below:
Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_clone_vm_and_volumes | Error | 89.55 | test_vm_life_cycle.py