fix(vd): prevent VirtualDisk from getting stuck in WaitForFirstConsumer phase when VM is attached

Open loktev-d opened this issue 6 months ago • 3 comments

Description

Fix VirtualDisk remaining in the WaitForFirstConsumer phase even after a VM has been attached and provisioning has started.

Why do we need it, and what problem does it solve?

When using a WaitForFirstConsumer (WFFC) storage class with volume populators:

  1. The VD transitions to WaitForFirstConsumer and waits for a VM
  2. A VM is created and attached to the VD
  3. Volume provisioning starts (the importer pod is running)
  4. Issue: the VD controller keeps setting the phase to WaitForFirstConsumer because the DataVolume is in the PendingPopulation state, even though the "first consumer" (the VM) already exists

This creates the perception of a hang: users see the VD stuck in WFFC for minutes while provisioning is actually running.
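The symptom can be illustrated with a minimal sketch. It assumes the pre-fix controller derived the disk phase from the DataVolume phase alone; the function `naiveDiskPhase` and the constants are hypothetical stand-ins, not the project's actual code.

```go
package main

import "fmt"

// Hypothetical string constants standing in for the real phase types.
const (
	dvPendingPopulation      = "PendingPopulation"
	dvWaitForFirstConsumer   = "WaitForFirstConsumer"
	diskWaitForFirstConsumer = "WaitForFirstConsumer"
	diskProvisioning         = "Provisioning"
)

// naiveDiskPhase maps a DataVolume phase to a disk phase without looking at
// whether a VM is attached or whether the importer has started. This is an
// assumed simplification of the behavior described in the list above.
func naiveDiskPhase(dvPhase string) string {
	switch dvPhase {
	case dvPendingPopulation, dvWaitForFirstConsumer:
		return diskWaitForFirstConsumer
	default:
		return diskProvisioning
	}
}

func main() {
	// Step 4: the VM is attached and the importer pod is running, but the
	// DataVolume still reports PendingPopulation.
	fmt.Println(naiveDiskPhase(dvPendingPopulation)) // prints "WaitForFirstConsumer"
}
```

Because the mapping ignores everything except the DataVolume phase, the disk keeps reporting WaitForFirstConsumer for as long as population is pending.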

What is the expected result?

Checklist

  • [x] The code is covered by unit tests.
  • [x] e2e tests passed.
  • [x] Documentation updated according to the changes.
  • [x] Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: vd
type: fix
summary: VirtualDisk no longer stuck in WaitForFirstConsumer phase after VM attachment.

loktev-d, Sep 30 '25 16:09

Reviewer's Guide

This PR refines the handling of WaitForFirstConsumer (WFFC) storage classes: it fetches the StorageClass in the relevant controllers, checks the DataVolumeRunning condition before setting the VirtualDisk phase, and updates the watchers. Together these changes prevent VirtualDisks from appearing stuck once a VM is attached and provisioning has started.

Sequence diagram for VirtualDisk phase transition with WFFC after VM attachment

sequenceDiagram
    participant VD as VirtualDisk Controller
    participant SC as StorageClass
    participant DV as DataVolume
    participant VM as VirtualMachine
    VD->SC: Fetch StorageClass for VD
    SC-->>VD: Return StorageClass with WFFC mode
    VD->DV: Check DataVolumeRunning condition
    DV-->>VD: Return DataVolumeRunning status
    VD->VM: Detect VM attachment to VD
    VD->VD: Set phase to WaitForFirstConsumer only if DVRunning is false and reason is empty
    VD->VD: Transition out of WaitForFirstConsumer if provisioning started
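The decision in the last two steps of the diagram could look roughly like the sketch below. The `Condition` type, the helper names, and the "Pending" reason value are hypothetical simplifications; treating a missing Running condition as "not awaited" is an assumption of this sketch, not something the PR states.

```go
package main

import "fmt"

// Condition is a hypothetical, trimmed-down stand-in for a DataVolume condition.
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
	Reason string
}

// findCondition returns a pointer to the condition of the given type, or nil.
func findCondition(conds []Condition, condType string) *Condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// firstConsumerIsAwaited reports whether the disk should still show
// WaitForFirstConsumer: the Running condition is False and carries no reason.
// A non-empty reason is read as a sign that provisioning has started.
func firstConsumerIsAwaited(conds []Condition) bool {
	running := findCondition(conds, "Running")
	if running == nil {
		return false // assumption: nothing known yet, do not claim WFFC
	}
	return running.Status == "False" && running.Reason == ""
}

func main() {
	waiting := []Condition{{Type: "Running", Status: "False"}}
	started := []Condition{{Type: "Running", Status: "False", Reason: "Pending"}}

	fmt.Println(firstConsumerIsAwaited(waiting)) // prints "true"
	fmt.Println(firstConsumerIsAwaited(started)) // prints "false"
}
```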

Class diagram for updated VirtualDisk phase handling logic

classDiagram
    class VirtualDisk {
        +Status: Phase, StorageClassName, Conditions
    }
    class StorageClass {
        +VolumeBindingMode
    }
    class DataVolume {
        +Status: Phase, Conditions
    }
    class BlockDeviceHandler {
        +checkVirtualDisksToBeWFFC()
    }
    class WaitForDVStep {
        +setForFirstConsumerIsAwaited()
    }
    VirtualDisk --> StorageClass : fetches
    VirtualDisk --> DataVolume : checks DataVolumeRunning
    BlockDeviceHandler --> VirtualDisk : checks phase
    WaitForDVStep --> VirtualDisk : sets phase
    WaitForDVStep --> DataVolume : checks DVRunning condition
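As a rough illustration of the VirtualDisk to StorageClass relationship in the diagram, the sketch below decides whether a disk's storage class uses WFFC binding. The `StorageClass` struct and the map-based lookup are hypothetical stand-ins for the Kubernetes API type and client; the constant mirrors the string value of `storagev1.VolumeBindingWaitForFirstConsumer`.

```go
package main

import "fmt"

// Mirrors the string value of storagev1.VolumeBindingWaitForFirstConsumer.
const volumeBindingWaitForFirstConsumer = "WaitForFirstConsumer"

// StorageClass is a hypothetical, reduced stand-in for the Kubernetes type.
// As in Kubernetes, a nil VolumeBindingMode is treated as Immediate.
type StorageClass struct {
	Name              string
	VolumeBindingMode *string
}

// usesWFFC reports whether the named storage class delays binding until a
// consumer exists. An empty name skips the check, and an unknown class is
// treated as non-WFFC (an assumption of this sketch).
func usesWFFC(storageClassName string, classes map[string]StorageClass) bool {
	if storageClassName == "" {
		return false
	}
	sc, ok := classes[storageClassName]
	if !ok || sc.VolumeBindingMode == nil {
		return false
	}
	return *sc.VolumeBindingMode == volumeBindingWaitForFirstConsumer
}

func main() {
	wffc := volumeBindingWaitForFirstConsumer
	immediate := "Immediate"
	classes := map[string]StorageClass{
		"local":      {Name: "local", VolumeBindingMode: &wffc},
		"replicated": {Name: "replicated", VolumeBindingMode: &immediate},
	}

	fmt.Println(usesWFFC("local", classes))      // prints "true"
	fmt.Println(usesWFFC("replicated", classes)) // prints "false"
	fmt.Println(usesWFFC("", classes))           // prints "false"
}
```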

File-Level Changes

Change: Refine WFFC phase logic across controllers
Details:
  • Import storagev1 API and fetch StorageClass in block_device_condition handler
  • Skip phase checks when no StorageClassName is set
  • Enhance wait_for_dv_step and sources to verify DataVolumeRunningCondition status and reason before setting DiskWaitForFirstConsumer
  • Update datavolume_watcher to trigger on changes to the DVRunningCondition reason
Files:
  • internal/block_device_condition.go
  • internal/source/step/wait_for_dv_step.go
  • internal/watcher/datavolume_watcher.go
  • internal/source/sources.go

Change: Extend unit tests to cover WFFC with populators
Details:
  • Add storagev1 to scheme and a getWFFCStorageClass helper in block_devices_test.go
  • Set StorageClassName on test VirtualDisk fixtures
  • Inject DataVolumeRunning false condition and VolumeBindingWaitForFirstConsumer in object_ref_cvi_test.go and object_ref_vi_test.go
Files:
  • internal/block_devices_test.go
  • internal/source/object_ref_cvi_test.go
  • internal/source/object_ref_vi_test.go
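The watcher change listed above (triggering on changes to the Running condition's reason) might be expressed as an update predicate along these lines. This is a sketch only: the `Condition` type and both function names are hypothetical, and the real watcher may compare additional fields.

```go
package main

import "fmt"

// Condition is a hypothetical, trimmed-down stand-in for a DataVolume condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// runningReason returns the reason of the Running condition, or "" if the
// condition is absent.
func runningReason(conds []Condition) string {
	for _, c := range conds {
		if c.Type == "Running" {
			return c.Reason
		}
	}
	return ""
}

// runningReasonChanged reports whether an update changed the reason of the
// Running condition and should therefore trigger a reconcile.
func runningReasonChanged(oldConds, newConds []Condition) bool {
	return runningReason(oldConds) != runningReason(newConds)
}

func main() {
	before := []Condition{{Type: "Running", Status: "False"}}
	after := []Condition{{Type: "Running", Status: "False", Reason: "Pending"}}

	fmt.Println(runningReasonChanged(before, after))  // prints "true"
	fmt.Println(runningReasonChanged(before, before)) // prints "false"
}
```

Without such a trigger, a reason that appears while the status stays False would not cause a reconcile on its own, so the disk phase would only be refreshed by some other event.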

sourcery-ai[bot], Sep 30 '25 16:09

Workflow has started. Follow the progress here: Workflow Run

The target step completed with status: failure.

deckhouse-BOaTswain, Oct 03 '25 15:10

Workflow has started. Follow the progress here: Workflow Run

The target step completed with status: success.

deckhouse-BOaTswain, Oct 16 '25 19:10