Ignore calls to PowerFlex for host revocation when host is null

Open mlsorensen opened this issue 2 years ago • 1 comments

Signed-off-by: Marcus Sorensen [email protected]

Description

This PR fixes #6739 (for PowerFlex/ScaleIO only; Datera still needs to be addressed). The issue can occur if the last host the VM ran on has been deleted from CloudStack. When the VM is deleted, CloudStack makes a final call to revoke access to its volumes, passing the last host the VM ran on. If that host is gone, the call fails with an error and the VM cannot be deleted.

A more holistic fix might be to identify every place where revokeAccess() is called and check for a null host there. However, other storage plugins may not need host information to revoke access to volumes, and may still need the call to be made. Therefore I'm applying this fix only to the ScaleIOPrimaryDataStoreDriver, which now skips revoking access when there is no host to revoke access for. This should also protect us if new code calls revokeAccess() in the future.
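For illustration only, the guard described above could look roughly like the following. This is a minimal, self-contained sketch and not the actual diff: `Host`, `FakeGatewayClient`, and `unmapVolumeFromSdc` here are hypothetical stand-ins for the CloudStack host object and the PowerFlex gateway client, and the boolean return value exists only so the behavior can be observed in the example.

```java
import java.util.ArrayList;
import java.util.List;

public class RevokeAccessSketch {

    /** Hypothetical stand-in for a CloudStack host; only what the sketch needs. */
    static final class Host {
        final long id;
        final String sdcId; // assumed identifier of the host's storage client

        Host(long id, String sdcId) {
            this.id = id;
            this.sdcId = sdcId;
        }
    }

    /** Hypothetical stand-in for the PowerFlex gateway client; records calls instead of making them. */
    static final class FakeGatewayClient {
        final List<String> unmapCalls = new ArrayList<>();

        boolean unmapVolumeFromSdc(String volumeId, String sdcId) {
            unmapCalls.add(volumeId + "->" + sdcId);
            return true;
        }
    }

    /**
     * Revokes a host's access to a volume, skipping the storage call entirely
     * when there is no host (for example, the host was already deleted).
     * Returns true only if a revoke call was actually issued and succeeded.
     */
    static boolean revokeAccess(String volumeId, Host host, FakeGatewayClient client) {
        if (host == null) {
            // No host to revoke access for: return early instead of failing,
            // so that the VM deletion that triggered this call can proceed.
            System.out.println("Skipping revoke for volume " + volumeId + ": host is null");
            return false;
        }
        return client.unmapVolumeFromSdc(volumeId, host.sdcId);
    }

    public static void main(String[] args) {
        FakeGatewayClient client = new FakeGatewayClient();
        revokeAccess("vol-1", null, client);                    // skipped, no client call
        revokeAccess("vol-1", new Host(7L, "sdc-42"), client);  // issues one unmap call
        System.out.println("Unmap calls issued: " + client.unmapCalls);
    }
}
```

Placing the check inside the driver, rather than at each call site, means any future caller that passes a null host gets the same behavior without needing its own guard.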

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] New feature (non-breaking change which adds functionality)
  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • [ ] Major
  • [x] Minor

Bug Severity

  • [ ] BLOCKER
  • [ ] Critical
  • [ ] Major
  • [x] Minor
  • [ ] Trivial

Screenshots (if appropriate):

How Has This Been Tested?

Tested in a local branch against PowerFlex primary storage, and with internal testing environments. Repeated steps outlined in #6739. We also had quite a few VMs that would not delete before this change was put into place, which cleaned up successfully afterward.

mlsorensen avatar Sep 15 '22 19:09 mlsorensen

Codecov Report

Merging #6742 (8fd6879) into main (bf4e905) will increase coverage by 0.18%. The diff coverage is 0.00%.

@@             Coverage Diff              @@
##               main    #6742      +/-   ##
============================================
+ Coverage     10.41%   10.60%   +0.18%     
- Complexity     6694     6848     +154     
============================================
  Files          2455     2466      +11     
  Lines        243141   244552    +1411     
  Branches      38062    38263     +201     
============================================
+ Hits          25335    25927     +592     
- Misses       214636   215345     +709     
- Partials       3170     3280     +110     
| Impacted Files | Coverage Δ |
| --- | --- |
| ...atastore/driver/ScaleIOPrimaryDataStoreDriver.java | 0.00% <0.00%> (ø) |
| ...rce/wrapper/LibvirtResizeVolumeCommandWrapper.java | 49.50% <0.00%> (-27.17%) :arrow_down: |
| .../cloud/hypervisor/kvm/storage/KVMPhysicalDisk.java | 70.27% <0.00%> (-10.98%) :arrow_down: |
| ...pper/LibvirtPrepareForMigrationCommandWrapper.java | 43.10% <0.00%> (-10.09%) :arrow_down: |
| ...loud/hypervisor/kvm/resource/LibvirtSecretDef.java | 60.00% <0.00%> (-3.16%) :arrow_down: |
| .../hypervisor/kvm/storage/ScaleIOStorageAdaptor.java | 10.48% <0.00%> (-2.63%) :arrow_down: |
| ...apache/cloudstack/storage/volume/VolumeObject.java | 35.75% <0.00%> (-2.61%) :arrow_down: |
| ...vm/resource/wrapper/LibvirtStopCommandWrapper.java | 42.66% <0.00%> (-1.78%) :arrow_down: |
| ...in/java/com/cloud/api/query/vo/TemplateJoinVO.java | 38.09% <0.00%> (-1.15%) :arrow_down: |
| ...a/com/cloud/api/query/dao/TemplateJoinDaoImpl.java | 16.29% <0.00%> (-0.83%) :arrow_down: |

... and 88 more

codecov[bot] avatar Sep 15 '22 20:09 codecov[bot]

@blueorangutan package

rohityadavcloud avatar Sep 28 '22 07:09 rohityadavcloud

@rohityadavcloud a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Sep 28 '22 07:09 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 4300

blueorangutan avatar Sep 28 '22 08:09 blueorangutan

@blueorangutan test

rohityadavcloud avatar Sep 29 '22 05:09 rohityadavcloud

@rohityadavcloud a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan avatar Sep 29 '22 05:09 blueorangutan

Trillian test result (tid-5039)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 40692 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr6742-t5039-kvm-centos7.zip
Smoke tests completed. 102 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_08_upgrade_kubernetes_ha_cluster | Failure | 671.36 | test_kubernetes_clusters.py |

blueorangutan avatar Sep 29 '22 17:09 blueorangutan

@blueorangutan package

rohityadavcloud avatar Oct 07 '22 05:10 rohityadavcloud

@rohityadavcloud a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Oct 07 '22 05:10 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 4368

blueorangutan avatar Oct 07 '22 06:10 blueorangutan

@blueorangutan test

rohityadavcloud avatar Oct 07 '22 10:10 rohityadavcloud

@rohityadavcloud a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan avatar Oct 07 '22 10:10 blueorangutan

Trillian test result (tid-5080)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 45093 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr6742-t5080-kvm-centos7.zip
Smoke tests completed. 103 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Failure | 466.35 | test_vpc_redundant.py |

blueorangutan avatar Oct 07 '22 23:10 blueorangutan