Support for new Feature: Clone a Virtual Machine (#4818)

Open atrocitytheme opened this issue 3 years ago • 50 comments

Description/Report

This PR adds a Clone VM feature at the API level (as part of GSoC project #4818). It enables the creation of a fully cloned virtual machine with ROOT and DATA disks and the same system configuration as the original VM. (Currently supported for the KVM hypervisor only.)

Steps involved:

  1. Create a temporary snapshot of the ROOT disk (during the clone VM operation)
  2. Create a template from the ROOT disk snapshot
  3. Create a new VM from the template created in step 2
  4. Automatically assign new network resources to the cloned VM
  5. Create temporary snapshots of the DATA disks
  6. Create DATA disk volumes from the snapshots created in step 5
  7. Attach the DATA disks created in step 6 to the new VM
  8. Clean up temporary resources (snapshots) and handle errors from the clone VM process

Note: The template created in step 2 cannot be cleaned up as the newly created clone VM uses this template
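The ordering of the steps above, and the guarantee that temporary snapshots are removed whether or not the clone succeeds, can be sketched as follows. This is only an illustrative in-memory model, not the code in this PR: `FakeCloud`, `clone_vm` and every method name are hypothetical stand-ins, and rolling back a partially created VM on failure is left out.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """In-memory stand-in for the management server (all methods hypothetical)."""
    snapshots: set = field(default_factory=set)
    counter: int = 0

    def new_id(self, kind):
        self.counter += 1
        return f"{kind}-{self.counter}"

    def create_snapshot(self, volume):
        snap = self.new_id("snap")
        self.snapshots.add(snap)
        return snap

    def delete_snapshot(self, snap):
        self.snapshots.discard(snap)


def clone_vm(cloud, root_volume, data_volumes):
    """Walk through steps 1-8; snapshots are deleted on success and on failure."""
    temp_snapshots = []
    try:
        temp_snapshots.append(cloud.create_snapshot(root_volume))  # step 1
        template = cloud.new_id("template")   # step 2 (kept: the clone depends on it)
        vm = {
            "id": cloud.new_id("vm"),         # step 3
            "template": template,
            "nic": cloud.new_id("nic"),       # step 4
            "data_disks": [],
        }
        for volume in data_volumes:
            temp_snapshots.append(cloud.create_snapshot(volume))  # step 5
            disk = cloud.new_id("volume")     # step 6
            vm["data_disks"].append(disk)     # step 7
        return vm
    finally:
        for snap in temp_snapshots:           # step 8
            cloud.delete_snapshot(snap)
```

Putting the cleanup in a `finally` block is one simple way to make sure no temporary snapshot outlives the operation. The template is deliberately not tracked for cleanup, matching the note above.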

Features included:

  • A new CloneVmCmd API interface to use this feature
  • A clone button in the compute instance page supporting the related functionalities

Documentation

  • Documentation PR: https://github.com/apache/cloudstack-documentation/pull/236

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • [x] Major
  • [ ] Minor

Bug Severity

  • [ ] BLOCKER
  • [ ] Critical
  • [ ] Major
  • [ ] Minor
  • [ ] Trivial

Screenshots (if appropriate):

[Screenshot: Clone Scene]

How Has This Been Tested?

This has been manually tested with an mbx KVM setup on a local machine and an mbx KVM setup on GCP.

  • This feature has been tested on a local Linux system with KVM support (5.3.0-64-generic Ubuntu) and on a Google Cloud instance (4.9.0-15-amd64 Debian).
  • If the management server is running with the default configuration, try it with cloudmonkey:
cloneVirtualMachine virtualmachineid=<target_vm_id>

This starts the cloning process: it creates a new cloned VM and starts it (with all copied data available). The network IP is assigned in the DB immediately, and the actual VM gets this IP after a while.

  • When secondary system VM agents are not available, the cloning process fails and cleans up the previously cloned resources.
  • It copies all data disk content from the target VM, regardless of whether the data disks are mounted inside the VM or whether the VM is running.
  • Temporary resources (snapshots) created during the process do not remain, whether the clone succeeds or not.
  • A new smoke test has been added in test_vm_life_cycle.py with class TestCloneVM, which tests cloning a VM with a data disk attached.
  • A new unit test has been added as validateCloneCondition in the UserVmManagerImpl class.
  • Manual test of Clone: the VM page has a clone VM button; clicking it clones a new VM, provided the hypervisor setup is correct.
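For testing without cloudmonkey, the `cloneVirtualMachine` call mentioned above could also be sent as a signed HTTP request. The sketch below builds the query string using CloudStack's documented HMAC-SHA1 request signing. The command and its `virtualmachineid` parameter come from this PR, so they only exist on a build that includes it. The function name `signed_query`, the keys and the endpoint in the usage note are placeholders, not values from this thread.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_query(params, api_key, secret_key):
    """Return a signed query string for the CloudStack API (HMAC-SHA1 scheme)."""
    full = dict(params, apikey=api_key, response="json")
    # Sort by field name and URL-encode each value.
    canonical = "&".join(
        f"{key}={quote(str(value), safe='')}"
        for key, value in sorted(full.items(), key=lambda kv: kv[0].lower())
    )
    # The signature is computed over the lower-cased string, then base64-encoded.
    digest = hmac.new(
        secret_key.encode(), canonical.lower().encode(), hashlib.sha1
    ).digest()
    signature = base64.b64encode(digest).decode()
    return canonical + "&signature=" + quote(signature, safe="")


# Placeholder credentials and VM id, for illustration only.
query = signed_query(
    {"command": "cloneVirtualMachine", "virtualmachineid": "target-vm-id"},
    api_key="API_KEY",
    secret_key="SECRET_KEY",
)
```

Assuming a management server on its default port, the request URL would then be `http://localhost:8080/client/api?` followed by `query`.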

GCP environment setup

  • Create a GCP instance with nested virtualization enabled; for detailed requirements see: https://cloud.google.com/compute/docs/instances/nested-virtualization/overview
  • After creating the instance, deploy a KVM host using mbx: https://github.com/shapeblue/mbx (centos7)
  • Before deploying and configuring the CloudStack agent, enable nested KVM on the system:
cat /sys/module/kvm_intel/parameters/nested
# if this prints N, reload the module with nesting enabled
sudo modprobe -r kvm_intel
sudo modprobe kvm_intel nested=1
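
The same check can be scripted before installing the agent. This is a small sketch that reads the parameter file used in the commands above. The helper names are made up for illustration, and it assumes the value may be reported as either `Y`/`N` or `1`/`0`, depending on the kernel.

```python
from pathlib import Path

# Same path as in the `cat` command above (Intel hosts only).
NESTED_PARAM = Path("/sys/module/kvm_intel/parameters/nested")


def nested_enabled(raw):
    """Interpret the parameter value, accepting Y/N as well as 1/0."""
    return raw.strip() in {"Y", "y", "1"}


def check_nested(path=NESTED_PARAM):
    """Return True or False for the current state, or None if kvm_intel is not loaded."""
    try:
        return nested_enabled(path.read_text())
    except FileNotFoundError:
        return None
```

A result of `False` from `check_nested()` means the two `modprobe` commands above still need to be run. `None` means the `kvm_intel` module is not loaded at all.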

atrocitytheme avatar Jul 16 '21 17:07 atrocitytheme

@blueorangutan package

sureshanaparti avatar Aug 18 '21 17:08 sureshanaparti

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

blueorangutan avatar Aug 18 '21 17:08 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian. SL-JID 913

blueorangutan avatar Aug 18 '21 17:08 blueorangutan

@blueorangutan test

sureshanaparti avatar Aug 18 '21 17:08 sureshanaparti

@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan avatar Aug 18 '21 17:08 blueorangutan

Trillian test result (tid-1703)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 34430 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5216-t1703-kvm-centos7.zip
Smoke tests completed. 89 look OK, 0 have error(s)
Only failed tests results shown below:

Test | Result | Time (s) | Test File

blueorangutan avatar Aug 19 '21 03:08 blueorangutan

@atrocitytheme I like the premise and most of your implementation. Two worries:

  1. Your methods in SnapshotManagerImpl and in UserVmManagerImpl are rather long. Can you please extract sensible parts from them (and maybe re-use them in places)? A good indicator: wherever there is a comment, it can probably be turned into a camelCase method name, with the code below it moved into the new method.
  2. Will the usage server pick up cloned machines for billing? (I don't think it will, but I might be missing something.)
  2. will the usage server pick up cloned machines for billing? (I don't think it will but I might be missing something)

thanks for this PR, good addition.

DaanHoogland avatar Aug 20 '21 09:08 DaanHoogland

@DaanHoogland Hi, thanks for the feedback. 1. Sure, I can extract and refactor these implementations. 2. Cloned machines are not picked up for billing. I may add this in future commits.

atrocitytheme avatar Aug 21 '21 19:08 atrocitytheme

ad 1. Thanks. ad 2. OK, then we need to hide it behind a global setting that lets the operator disable the feature.

DaanHoogland avatar Aug 22 '21 17:08 DaanHoogland

@blueorangutan package

sureshanaparti avatar Aug 23 '21 11:08 sureshanaparti

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

blueorangutan avatar Aug 23 '21 11:08 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_multiplication_x: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 975

blueorangutan avatar Aug 23 '21 12:08 blueorangutan

@blueorangutan package

sureshanaparti avatar Aug 23 '21 13:08 sureshanaparti

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

blueorangutan avatar Aug 23 '21 13:08 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 977

blueorangutan avatar Aug 23 '21 13:08 blueorangutan

@blueorangutan test

sureshanaparti avatar Aug 23 '21 17:08 sureshanaparti

@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan avatar Aug 23 '21 17:08 blueorangutan

Trillian test result (tid-1753)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 35551 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5216-t1753-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_vm_life_cycle.py
Smoke tests completed. 88 look OK, 1 have error(s)
Only failed tests results shown below:

Test | Result | Time (s) | Test File
test_clone_vm_and_volumes | Error | 100.64 | test_vm_life_cycle.py

blueorangutan avatar Aug 24 '21 04:08 blueorangutan

@atrocitytheme you have conflicts. Can you have a look?

DaanHoogland avatar Sep 14 '21 14:09 DaanHoogland

Sure

atrocitytheme avatar Sep 14 '21 14:09 atrocitytheme

@blueorangutan package

sureshanaparti avatar Nov 08 '21 14:11 sureshanaparti

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

blueorangutan avatar Nov 08 '21 14:11 blueorangutan

Packaging result: :heavy_multiplication_x: el7 :heavy_multiplication_x: el8 :heavy_multiplication_x: debian :heavy_multiplication_x: suse15. SL-JID 1687

blueorangutan avatar Nov 08 '21 14:11 blueorangutan

@blueorangutan package

sureshanaparti avatar Nov 08 '21 16:11 sureshanaparti

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

blueorangutan avatar Nov 08 '21 16:11 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 1688

blueorangutan avatar Nov 08 '21 17:11 blueorangutan

@blueorangutan ui

sureshanaparti avatar Nov 08 '21 18:11 sureshanaparti

@sureshanaparti a Jenkins job has been kicked to build UI QA env. I'll keep you posted as I make progress.

blueorangutan avatar Nov 08 '21 18:11 blueorangutan

UI build: :heavy_check_mark: Live QA URL: http://qa.cloudstack.cloud:8080/client/pr/5216 (SL-JID-813)

blueorangutan avatar Nov 08 '21 18:11 blueorangutan

Trillian test result (tid-2549)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33154 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5216-t2549-kvm-centos7.zip
Smoke tests completed. 90 look OK, 1 have errors
Only failed tests results shown below:

Test | Result | Time (s) | Test File
test_clone_vm_and_volumes | Error | 89.55 | test_vm_life_cycle.py

blueorangutan avatar Nov 09 '21 23:11 blueorangutan