SOLIDFIRE: "SolidFire" plugin doesn't work for ROOT volumes with VMware (6.5)
ISSUE TYPE
- Bug Report
COMPONENT NAME
SolidFire plugin ("SolidFire", as opposed to "SolidFire Shared")
CLOUDSTACK VERSION
4.13 (master atm), but also observed by another community member on 4.11.3
CONFIGURATION
OS / ENVIRONMENT
VMware 6.5 tested
SUMMARY
Adding SolidFire Primary Storage via the SolidFire plugin for VMware 6.5 fails: the datastore cannot be mounted on the ESXi hosts, and the specific error below is raised in the management server logs.
STEPS TO REPRODUCE
In vCenter, add an iSCSI software adapter to each ESXi host and configure the proper network binding to vSwitchXXX for the iSCSI traffic (i.e. VLAN XXX, so that ESXi can communicate with the SVIP on that VLAN). Then add SolidFire as Primary Storage: zone-wide (the same problem occurs cluster-wide), protocol "Custom", provider "SolidFire", the "Managed" box ticked, and the proper URL. Adding SF as Primary Storage is successful.
Try to spin up a VM - that is when things fail, after a minute or so.
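For reference, adding the pool through the UI should be equivalent to a `createStoragePool` API call along these lines. This is only a sketch: the IPs, credentials, zone ID, pool name and IOPS values are placeholders, and the semicolon-separated `url` key/value format follows my reading of the SolidFire plugin documentation, so double-check it against your version.

```python
from urllib.parse import urlencode

# Placeholder SolidFire cluster details (MVIP/SVIP/credentials are not real).
sf_url = ";".join([
    "MVIP=192.0.2.10",
    "SVIP=198.51.100.10",
    "clusterAdminUsername=admin",
    "clusterAdminPassword=secret",
    "clusterDefaultMinIops=4000",
    "clusterDefaultMaxIops=8000",
    "clusterDefaultBurstIopsPercentOfMaxIops=150",
])

params = {
    "command": "createStoragePool",
    "scope": "zone",
    "zoneid": "ZONE_UUID",           # placeholder
    "hypervisor": "VMware",          # needed for a zone-wide pool
    "name": "SolidFire-Primary",
    "provider": "SolidFire",
    "managed": "true",               # the "Managed" tick box
    "capacitybytes": str(2 * 1024**4),
    "capacityiops": "100000",
    "url": sf_url,
}

# The query string would still need the usual CloudStack API key/signature.
query = urlencode(params)
print(query[:40])
```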
Observing vCenter, the following things happen (see also the screenshot):
- Static iSCSI target is added to the ESXi hosts
- Rescanning HBAs
- Creating datastore same size as the volume/template itself with the name ending in "Centos-5.3-x64" or similar (name of the template)
- Deploying OVF template
- Unregistering VM
- Moving files around
- Unmounting VMFS
- Removing iSCSI static targets
- Rescanning HBAs
- Adding iSCSI static targets again (why???)
- Rescanning HBAs
- Rescanning VMFS
- RENAMING the datastore (from the template-like name to the root-volume-like name, ending with ROOT-XXX.YY) - this is probably where the problem happens
- Unmounting the datastore
- Removing the iSCSI targets.
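The sequence above can be sketched as a tiny state model. This is illustrative Python only, not CloudStack or vSphere code; the `Host` class and method names are my own, and the datastore/IQN names are taken from the report. The point it shows: once the static target is removed while the renamed datastore still exists, the datastore has no reachable backing path.

```python
# Minimal model of the observed vCenter operation sequence.
class Host:
    def __init__(self):
        self.static_targets = set()  # iSCSI static targets on the ESXi host
        self.datastores = {}         # datastore name -> backing target IQN

    def add_target(self, iqn):
        self.static_targets.add(iqn)

    def remove_target(self, iqn):
        self.static_targets.discard(iqn)

    def create_datastore(self, name, iqn):
        self.datastores[name] = iqn

    def rename_datastore(self, old, new):
        self.datastores[new] = self.datastores.pop(old)

    def accessible(self, name):
        # A datastore is reachable only while its backing target is present.
        return self.datastores[name] in self.static_targets

host = Host()
iqn = "iqn.2010-01.com.solidfire:hl1k.root-55.67-0"  # from the report

host.add_target(iqn)
host.create_datastore("snap-...-centos53-x64.66-0", iqn)           # template DS
host.rename_datastore("snap-...-centos53-x64.66-0", "-" + iqn)     # ROOT rename
host.remove_target(iqn)   # target removed while the datastore still exists

print(host.accessible("-" + iqn))  # False: the reported failure state
```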
The error from the ACS is: message: Datastore '-iqn.2010-01.com.solidfire:hl1k.root-32.29-0' is not accessible. No connected and accessible host is attached to this datastore
The problem is that this datastore (in its latest, renamed state, still attached to the ESXi hosts) is unmounted, but it can't be removed, nor can I mount it. If I try to mount it manually, vCenter reports: "Operation failed, diagnostics report: Unable to find volume uuid [5d7abd9a-273aa9d5-bffe-1e00d4010711] lvm [snap-329aa3ea-5d7abd01-a5c83210-c87c-1e00d4010711] devices"
Screenshot from vCenter attached - note that the last 2 entries (on top) are my attempt to manually mount an existing SF datastore. In other words, there are zero failures on the vCenter side while ACS is doing its job - something is failing on the ACS side.

@skattoju4 /CC @mike-tutkowski FYI ^^^
Small update - it works fine for DATA disks, since no datastore rename is involved there.
Removing the static iSCSI targets seems to be the step that breaks the whole thing.
The last few steps from the original screenshot:
- the datastore is renamed: "Renamed datastore from snap-13e16b15-iqn.2010-01.com.solidfire:hi1k.centos53-x64.66-0 to -iqn.2010-01.com.solidfire:hl1k.root-55.67-0" (fine)
- the VMFS is unmounted (whether per the "plan" or not is unclear)
- static iSCSI target is removed - BUT without first deleting the datastore
Since the datastore is NOT deleted, but its static iSCSI target is removed, vCenter will complain that the iSCSI path is no longer available for that datastore.
If I manually add a static iSCSI target for "iqn.2010-01.com.solidfire:hl1k.root-55.67-0", the (existing) datastore can later be mounted as expected, etc.
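The suspected bug is purely an ordering problem. A minimal checker sketch (illustrative Python only; the operation names and the `target_removed_too_early` helper are made up for this sketch, not plugin code) makes the faulty vs. correct teardown order explicit:

```python
def target_removed_too_early(ops):
    """True if an iSCSI target is removed while its datastore still exists."""
    deleted = set()
    for op, name in ops:
        if op == "delete_datastore":
            deleted.add(name)
        elif op == "remove_target" and name not in deleted:
            return True
    return False

# The order observed in the report: the datastore is never deleted
# before its static target is dropped.
buggy = [
    ("rename_datastore", "root-55"),
    ("unmount_vmfs", "root-55"),
    ("remove_target", "root-55"),
]

# The order that would avoid the orphaned, unmountable datastore.
fixed = [
    ("rename_datastore", "root-55"),
    ("unmount_vmfs", "root-55"),
    ("delete_datastore", "root-55"),  # delete the datastore first...
    ("remove_target", "root-55"),     # ...then drop the static target
]

print(target_removed_too_early(buggy), target_removed_too_early(fixed))
# True False
```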
Hope this helps further troubleshooting and fixing the issue @skattoju4 /CC @mike-tutkowski
I am not sure what the severity is for this @mike-tutkowski @skattoju4 @andrijapanicsb. Is this something we are going to support, or should we just add a line in the docs somewhere? @mike-tutkowski, do you know of a dev working on this plugin?
I think this issue is related to the one opened by Christian from the Fraunhofer institute. He sent us (CloudOps) some logs and we were going to troubleshoot/debug in his environment; however, I think priorities changed on their end, so this effort was put on hold.
ping @skattoju cc @swill @syed @pdion891 - any update on this?
(moved this back to unplanned unless we hear any devs picking this up)
Hello everyone,
I am from the NetApp support team. There is no active development going on for the ACS-SF plugin at NetApp, but since this seems to have been a dangling case for quite some time, I thought I would drop some input based on what makes sense to me.
" iSCSI targets " once remove followed by a storage adaptor scan will ideally remove all or any traces of the unsignatured disks previously available through the iscsi server/target ... If it's not removed from the vSphere inventory then this is to be checked from the vSphere and ESXI logs.
My recommendation would be to trace the same in vpxd.log (vSphere), vpxa.log (ESXi), hostd.log (ESXi) and vmkernel.log (ESXi).
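A low-tech way to follow the volume through those logs is to grep all four for the datastore's IQN. This is just a sketch: the log paths below are typical vCenter appliance / ESXi default locations and may differ in your environment, and you would run the vCenter and ESXi parts on their respective machines (or against an extracted support bundle).

```python
import os

# The datastore name from the error message in this report.
NEEDLE = "iqn.2010-01.com.solidfire:hl1k.root-32.29-0"

# Assumed default log locations; adjust for your setup / support bundle.
LOG_FILES = [
    "/var/log/vmware/vpxd/vpxd.log",  # vCenter
    "/var/log/vpxa.log",              # ESXi management agent
    "/var/log/hostd.log",             # ESXi host daemon
    "/var/log/vmkernel.log",          # ESXi kernel
]

def grep(path, needle):
    """Yield matching lines from a log file, skipping missing files."""
    if not os.path.exists(path):
        return
    with open(path, errors="replace") as f:
        for line in f:
            if needle in line:
                yield line.rstrip()

for path in LOG_FILES:
    for hit in grep(path, NEEDLE):
        print(f"{path}: {hit}")
```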
The ACS-SF plugin throwing the error "message: ... Datastore '-iqn.2010-01.com.solidfire:hl1k.root-32.29-0' is not accessible. No connected and accessible host is attached to this datastore ..." makes sense, because the hosts no longer know anything about the target once the iSCSI target(s) are removed.
By any chance, do we have the vSphere + ESXi log bundle available?
Regards, KC
@ikchakraborty I think we have not got any infrastructure anywhere to support this. cc @andrijapanicsb @rohityadavcloud
cc @shwstppr @pdion891 any update on this, does this work now, maybe closed?
Hi, my name is Aarushi Soni. I want to contribute to this issue. Please guide me through the process.