Fix potential leaking of volume maps
ISSUE TYPE
- Enhancement Request
COMPONENT NAME
Storage
CLOUDSTACK VERSION
Any
SUMMARY
Unmapping volumes from a hypervisor host upon VM stop or migration is done on a best-effort basis. The VM is already stopped or already migrated by the time we try to unmap, so if something goes wrong there is really no recourse or retry; we only log a warning. This leaves the potential of leaking maps to hosts over time.
In code review I've also found edge cases where a VM is moved to the "Stopped" state without necessarily cleaning up network or volume resources; these can also lead to leaked maps over time. Examples are force-removing a hypervisor host with running VMs on it, and possibly any other code path that simply calls `vm.setState(State.Stopped)`.
My request is that we be more thorough during VM start in ensuring that our target host, and only our target host, has access to the volume, or at least call the storage plugin involved and let it decide how to do this. It should be as simple as calling the storage service to "revoke all" just before we grant access, or allowing for an exclusive grant in the storage API.
For example, with the PowerFlex/ScaleIO storage client there is an `unmapVolumeFromAllSdcs` call that could be made just prior to granting access to volumes during VM start.
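To illustrate, here is a minimal sketch of that ordering, assuming a PowerFlex/ScaleIO-style client; the interface below is a simplified stand-in and the real client's method signatures may differ:

```java
// Simplified stand-in for the PowerFlex/ScaleIO gateway client; only the calls
// needed for this sketch are shown, and the signatures are assumptions.
interface ScaleIoClient {
    void unmapVolumeFromAllSdcs(String volumeId);       // drop every existing SDC mapping
    void mapVolumeToSdc(String volumeId, String sdcId); // map to the requested host's SDC
}

class ExclusiveMapSketch {
    // Ensure the target host, and only the target host, ends up mapped to the volume.
    static void grantExclusiveAccess(ScaleIoClient client, String volumeId, String targetSdcId) {
        client.unmapVolumeFromAllSdcs(volumeId); // clears any maps leaked by earlier stops/migrations
        client.mapVolumeToSdc(volumeId, targetSdcId);
    }
}
```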
We may need to add a `revokeAllAccess()` method to the `PrimaryDataStoreDriver`, or add a flag to the existing `revokeAccess` to indicate that the storage driver should revoke all mappings.
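A rough sketch of that first option, assuming the driver interface keeps its current `grantAccess(DataObject, Host)` / `revokeAccess(DataObject, Host)` shape; the new method name is only a proposal, and `DataObject`/`Host` are simplified stand-ins here:

```java
// Simplified stand-ins so the sketch compiles on its own.
interface DataObject { String getUuid(); }
interface Host { long getId(); }

// Proposed extension of the primary storage driver contract (the new method is a
// proposal, not an existing API).
interface PrimaryDataStoreDriverSketch {
    boolean grantAccess(DataObject dataObject, Host host);
    void revokeAccess(DataObject dataObject, Host host);

    // New: revoke every mapping for the given volume, regardless of host.
    // Drivers that cannot enumerate mappings could treat this as a no-op and log a warning.
    void revokeAllAccess(DataObject dataObject);
}
```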
Or alternatively (I think I like this better), the `grantAccess()` call might gain a `boolean exclusive` flag so the storage driver can be instructed to ensure that only one mapping exists: the one requested. This would be cleaner.
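A sketch of that alternative, reusing the simplified `DataObject`/`Host` stand-ins above; a default method keeps existing drivers source-compatible, and the names are again only a proposal:

```java
interface PrimaryDataStoreDriverExclusiveSketch {
    boolean grantAccess(DataObject dataObject, Host host);
    void revokeAccess(DataObject dataObject, Host host);

    // exclusive == true asks the driver to ensure the requested mapping is the only one,
    // e.g. by revoking all other mappings before (or together with) the grant.
    default boolean grantAccess(DataObject dataObject, Host host, boolean exclusive) {
        // Drivers that ignore the flag behave exactly as today.
        return grantAccess(dataObject, host);
    }
}
```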
Crucially, we need to avoid exclusive access during the live migration workflows, since the source and destination hosts both need access to the volume while the migration is in flight. It seems safe to ensure exclusive access during VM start, however.
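As a call-site sketch (the class and method names below are purely illustrative, not existing CloudStack orchestration code), only the VM-start path would ask for an exclusive grant:

```java
class VolumeAccessCallSites {
    private final PrimaryDataStoreDriverExclusiveSketch driver;

    VolumeAccessCallSites(PrimaryDataStoreDriverExclusiveSketch driver) {
        this.driver = driver;
    }

    void onVmStart(DataObject volume, Host targetHost) {
        // Safe to be strict here: the VM is only starting on one host.
        driver.grantAccess(volume, targetHost, true);
    }

    void onLiveMigrationPrepare(DataObject volume, Host destinationHost) {
        // Not exclusive: the source host must keep its mapping until the migration completes.
        driver.grantAccess(volume, destinationHost, false);
    }
}
```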