cloudstack
Block use of internal and external snapshots on KVM
Description
On KVM, there are two types of snapshots: internal and external. On ACS, most snapshot/backup solutions use external snapshots; the exception is disk-and-memory VM snapshots, which use internal snapshots (as far as I know, this is a KVM limitation).
However, since internal snapshots are stored inside the VM's volume (hence the name), if an internal snapshot is taken after an external snapshot and the external snapshot is restored, the internal snapshot is lost.
Thus, this PR blocks the use of disk-and-memory VM snapshots alongside volume snapshots, NAS backups, and disk-only VM snapshots (at least the ones created using the default volume snapshot implementation).
I encourage maintainers of third-party storage providers to test whether their implementations are compatible with disk-and-memory VM snapshots; if they are not, their simultaneous usage should be blocked.
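The rule described above can be sketched as a small standalone check. This is only an illustration of the idea, not the code in this PR: the class `SnapshotCompatibilitySketch`, the `Operation` enum, and the `validate` method are hypothetical names, and NAS backups are grouped with the external operations simply because the description lists them as incompatible with disk-and-memory VM snapshots.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, self-contained sketch; none of these names come from the CloudStack code base.
public class SnapshotCompatibilitySketch {

    /** The four operations named in the PR description. */
    public enum Operation {
        DISK_AND_MEMORY_VM_SNAPSHOT, // internal snapshot, stored inside the VM's volume
        DISK_ONLY_VM_SNAPSHOT,       // external snapshot
        VOLUME_SNAPSHOT,             // external snapshot
        NAS_BACKUP                   // grouped with the external operations (assumption)
    }

    /** Only disk-and-memory VM snapshots are treated as internal in this sketch. */
    public static boolean isInternal(Operation op) {
        return op == Operation.DISK_AND_MEMORY_VM_SNAPSHOT;
    }

    /**
     * Rejects a requested operation if the VM already has an operation of the
     * other kind (internal vs. external). Operations of the same kind may coexist.
     */
    public static void validate(Operation requested, Set<Operation> existing) {
        for (Operation present : existing) {
            if (isInternal(present) != isInternal(requested)) {
                throw new IllegalStateException(
                        "Cannot create " + requested + " while the VM has " + present);
            }
        }
    }

    public static void main(String[] args) {
        Set<Operation> existing = EnumSet.of(Operation.VOLUME_SNAPSHOT);

        // External alongside external is allowed.
        validate(Operation.NAS_BACKUP, existing);
        System.out.println("NAS backup allowed alongside a volume snapshot");

        // Internal alongside external is rejected.
        try {
            validate(Operation.DISK_AND_MEMORY_VM_SNAPSHOT, existing);
        } catch (IllegalStateException e) {
            System.out.println("Blocked: " + e.getMessage());
        }
    }
}
```

The check is symmetric on purpose: it rejects an external operation when an internal snapshot exists and vice versa, while several snapshots of the same kind are still accepted.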
Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [X] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
- [ ] build/CI
- [ ] test (unit or integration test code)
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
- [ ] Major
- [ ] Minor
Bug Severity
- [ ] BLOCKER
- [ ] Critical
- [X] Major
- [ ] Minor
- [ ] Trivial
Screenshots (if appropriate):
How Has This Been Tested?
I created a VM and took a few disk-and-memory VM snapshots of it; then I tried to create NAS backups, volume snapshots, and disk-only VM snapshots. All of them failed with an error, as expected.
I validated that the reverse also holds for the aforementioned cases, e.g., creating a volume snapshot and then trying to create a disk-and-memory VM snapshot.
I also validated that it was possible to create multiple NAS backups, disk-only VM snapshots and volume snapshots with no issues.
@slavkap @rp- I think it would be interesting to validate whether the implementations done for StorPool and Linstor are compatible with disk-and-memory VM snapshots.
Codecov Report
:x: Patch coverage is 0% with 27 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 17.56%. Comparing base (6dc259c) to head (3095fb1).
:warning: Report is 1 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #11039 +/- ##
=========================================
Coverage 17.55% 17.56%
- Complexity 15535 15537 +2
=========================================
Files 5911 5912 +1
Lines 529359 529383 +24
Branches 64655 64660 +5
=========================================
+ Hits 92949 92980 +31
+ Misses 425952 425942 -10
- Partials 10458 10461 +3
| Flag | Coverage Δ | |
|---|---|---|
| uitests | 3.58% <ø> (ø) | |
| unittests | 18.63% <0.00%> (+<0.01%) | :arrow_up: |
@blueorangutan package
@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 13798
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 13809
@JoaoJandre :
08:35:48 [ERROR] /jenkins/workspace/acs-centos8-pkg-builder/dist/rpmbuild/BUILD/cloudstack-4.20.2.0-SNAPSHOT/engine/storage/snapshot/src/test/java/org/apache/cloudstack/storage/vmsnapshot/VMSnapshotStrategyKVMTest.java:32:8: Unused import - org.apache.cloudstack.backup.dao.BackupDao. [UnusedImports]
@blueorangutan package
@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 13818
@blueorangutan package
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13826
Linstor currently does not support memory snapshots (we check and throw an error if selected). So I guess we are currently not affected by any of this?
@blueorangutan test
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests
[SF] Trillian test result (tid-13723) Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8 Total time taken: 53428 seconds Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11039-t13723-kvm-ol8.zip Smoke tests completed. 141 look OK, 0 have errors, 0 did not run Only failed and skipped tests results shown below:
| Test | Result | Time (s) | Test File |
|---|---|---|---|
@JoaoJandre this seems to be included in PR #10632 do you still want it in 4.20.2 ?
@weizhouapache PR #10632 blocks the usage of the feature introduced in #10632 together with other incompatible features. This PR purposefully ignores #10632 and adds restrictions to avoid other interactions between internal and external snapshots, such as between volume snapshots and disk-and-memory VM snapshots.
They are complementary. When merging this PR forward, care should be taken so that the validations of both PRs do not erase one another (I can make the merge forward if needed).
ok @JoaoJandre I think the best option might be to re-target this PR to 4.22, which includes #10632, to avoid rework.
aren't we talking about 4.20.2, @weizhouapache?
sorry, I meant 4.22, not 4.21
if we merge into 4.20.2, the forward merge to 4.22 will be troublesome, as @JoaoJandre mentioned, unless we skip this PR in the forward merge and @JoaoJandre creates another PR against 4.22 (which needs re-review and re-testing)
@DaanHoogland @weizhouapache I rebased the changes so now I'm targeting main.
@JoaoJandre thanks for the update, overall LGTM. left a small comment
@JoaoJandre is this ready to ship? (asking because I see scattered test reports and am not sure of completeness)
@DaanHoogland a new round of tests would be good since I rebased from 4.20 to main.
@blueorangutan package
@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 14898
@blueorangutan package