Fix snapshot deletion on template creation failure
Description
This PR addresses an issue found as part of #8946.
The issue I observed occurs not while taking the snapshot but while creating the template from the snapshot (the stack trace also points to the same).
I could not reproduce the original issue of a failed snapshot showing as BackedUp rather than Error (it might have already been fixed after 4.17.2), but I saw another serious issue.
When a snapshot is used to create a template or volume and backing up the snapshot to the secondary store fails, the management server (MS) deletes the snapshot on primary storage itself as part of handling that failure.
The changes responsible were introduced as part of the PR https://github.com/apache/cloudstack/pull/5297
Steps to reproduce:
- Create a snapshot of a volume (set snapshot.backup.to.secondary = False)
- Create a template from that snapshot
- As part of the creation, MS first tries to back up the snapshot to the secondary storage
- Make the backup fail (I forced the failure)
- MS recognizes the failure and, while handling it, deletes the snapshot on the primary storage (and also marks the snapshot_store_ref entry for the primary store role as "Destroyed")
This PR fixes the failure handling so that the snapshot on primary storage is not deleted in this case.
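The intended behavior can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the actual CloudStack code: the class `SnapshotBackupFailureSketch`, its `StoreRef` rows, and the `createTemplateFromSnapshot` method are hypothetical stand-ins for the snapshot_store_ref table and the template creation flow described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model; none of these names come from the CloudStack code base.
public class SnapshotBackupFailureSketch {
    enum Role { PRIMARY, IMAGE }
    enum State { CREATING, READY, DESTROYED }

    // One row of the (modelled) snapshot_store_ref table.
    static final class StoreRef {
        final long snapshotId;
        final Role role;
        State state;

        StoreRef(long snapshotId, Role role, State state) {
            this.snapshotId = snapshotId;
            this.role = role;
            this.state = state;
        }
    }

    // Stand-in for the backup-to-secondary step, which may fail.
    interface BackupOperation {
        void run() throws Exception;
    }

    final List<StoreRef> refs = new ArrayList<>();

    // Returns the state of the matching row, or null if no such row exists.
    State stateOf(long snapshotId, Role role) {
        for (StoreRef ref : refs) {
            if (ref.snapshotId == snapshotId && ref.role == role) {
                return ref.state;
            }
        }
        return null;
    }

    // Models template creation from a snapshot that lives only on primary storage.
    boolean createTemplateFromSnapshot(long snapshotId, BackupOperation backup) {
        // A temporary Image-role row is added in Creating state for the backup.
        StoreRef imageRef = new StoreRef(snapshotId, Role.IMAGE, State.CREATING);
        refs.add(imageRef);
        try {
            backup.run();
            return true; // template creation would proceed from the backed-up copy
        } catch (Exception e) {
            // Failure handling must not touch the Primary-role row.
            return false;
        } finally {
            // The temporary Image-role row is removed on success and on failure.
            refs.remove(imageRef);
        }
    }

    public static void main(String[] args) {
        SnapshotBackupFailureSketch model = new SnapshotBackupFailureSketch();
        model.refs.add(new StoreRef(1L, Role.PRIMARY, State.READY));

        boolean created = model.createTemplateFromSnapshot(1L, () -> {
            throw new Exception("simulated backup failure");
        });

        System.out.println("template created: " + created);
        System.out.println("primary row state: " + model.stateOf(1L, Role.PRIMARY));
        System.out.println("image row state: " + model.stateOf(1L, Role.IMAGE));
    }
}
```

In this model the Primary-role row is never modified by the failure path, which is the property the fix aims for; the behavior before the fix would correspond to marking that row as "Destroyed" in the `catch` branch.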
Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
- [ ] build/CI
Bug Severity
- [ ] BLOCKER
- [x] Critical
- [ ] Major
- [ ] Minor
- [ ] Trivial
Screenshots (if appropriate):
How Has This Been Tested?
First, let's check the successful scenario:
- Create a snapshot of a volume (set snapshot.backup.to.secondary = False)
- Create a template from that snapshot
- As part of the creation, MS first tries to back up the snapshot to the secondary storage
- One new entry in the snapshot_store_ref table will be seen for store_role "Image" in "creating" state
- After the successful creation of template, this entry will be deleted.
Now the failure scenario:
- Create a snapshot of a volume (set snapshot.backup.to.secondary = False)
- Create a template from that snapshot
- As part of the creation, MS first tries to back up the snapshot to the secondary storage
- One new entry in the snapshot_store_ref table will be seen for store_role "Image" in "creating" state
- Make the backup operation fail (I used the debugger to force the failure)
- Observe the snapshot_store_ref table: the new "Image" entry is deleted, and the existing entry for store_role "Primary" remains in "Ready" state. (Previously this row was marked as "Destroyed")