fix: Prevent deadlock in runComponent when Prepare fails

Open weiliu1031 opened this issue 3 weeks ago • 6 comments

issue: #45068
pr: #45069

When component.Prepare() fails (e.g., a net listener creation error), the sign channel is never closed, so runComponent blocks indefinitely at <-sign. As a result, the entire process hangs after logging the error message.

Changes:

  • Move close(sign) into a defer statement in the runComponent goroutine
  • Ensure the sign channel is always closed, whether Prepare succeeds or fails
  • Allow the error to propagate through the future.Await() mechanism

weiliu1031 avatar Nov 17 '25 09:11 weiliu1031

[ci-v2-notice] Notice: We are gradually rolling out the new ci-v2 system.

  • Legacy CI jobs remain unaffected, you can just ignore ci-v2 if you don't want to run it.
  • Additional "ci-v2/*" checkers will run for this PR to ensure the new ci-v2 system is working as expected.
  • For tests that exist in both v1 and v2, passing in either system is considered PASS.

To rerun ci-v2 checks, comment with:

  • /ci-rerun-code-check // for ci-v2/code-check
  • /ci-rerun-build // for ci-v2/build
  • /ci-rerun-ut-integration // for ci-v2/ut-integration
  • /ci-rerun-ut-go // for ci-v2/ut-go
  • /ci-rerun-ut-cpp // for ci-v2/ut-cpp
  • /ci-rerun-ut // for all ci-v2/ut-integration, ci-v2/ut-go, ci-v2/ut-cpp
  • /ci-rerun-e2e-arm // for ci-v2/e2e-arm

If you have any questions or requests, please contact @zhikunyao.

sre-ci-robot avatar Nov 17 '25 09:11 sre-ci-robot

[INFO] PR Label Summary by Default

[ERROR] Failed to check PR #45069: script returned exit code 1

[WARNING] Milestone not set

  • PR: #45626
  • Title: fix: Prevent deadlock in runComponent when Prepare fails

Please set a milestone for better release tracking.

You can set a milestone by commenting: /set-milestone
Example: /set-milestone 2.5.0

Use /refresh-label to update related check and label manually

sre-ci-robot avatar Nov 17 '25 09:11 sre-ci-robot

/set-milestone 2.5.23

weiliu1031 avatar Nov 17 '25 10:11 weiliu1031

[INFO] Set milestone to: 2.5.23

sre-ci-robot avatar Nov 17 '25 10:11 sre-ci-robot

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 82.07%. Comparing base (3a7a08f) to head (b5f0d9f).
:warning: Report is 57 commits behind head on 2.5.

Additional details and impacted files

@@            Coverage Diff             @@
##              2.5   #45626      +/-   ##
==========================================
- Coverage   82.10%   82.07%   -0.03%     
==========================================
  Files        1128     1587     +459     
  Lines      179181   248689   +69508     
==========================================
+ Hits       147110   204104   +56994     
- Misses      26099    38581   +12482     
- Partials     5972     6004      +32     

Components   Coverage Δ
Client       78.90% <22.22%> (-0.06%) :arrow_down:
Core         84.56% <79.54%> (∅)
Go           82.40% <78.70%> (+0.02%) :arrow_up:

see 517 files with indirect coverage changes

codecov[bot] avatar Nov 17 '25 11:11 codecov[bot]

[INFO] PR Label Summary by Default

[ERROR] Failed to check PR #45069: script returned exit code 1

Use /refresh-label to update related check and label manually

sre-ci-robot avatar Nov 18 '25 03:11 sre-ci-robot

/lgtm

chyezh avatar Nov 18 '25 08:11 chyezh

[INFO] PR Label Summary by Default

[ERROR] Failed to check PR #45069: script returned exit code 1

Use /refresh-label to update related check and label manually

sre-ci-robot avatar Nov 18 '25 08:11 sre-ci-robot

/lgtm

chyezh avatar Nov 18 '25 08:11 chyezh

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: congqixia, weiliu1031

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

sre-ci-robot avatar Nov 18 '25 08:11 sre-ci-robot

[INFO] PR Label Summary by Default

[ERROR] Failed to check PR #45069: script returned exit code 1

Use /refresh-label to update related check and label manually

sre-ci-robot avatar Nov 18 '25 08:11 sre-ci-robot

/refresh-label

weiliu1031 avatar Nov 18 '25 09:11 weiliu1031

[INFO] PR Label Summary by Refresh-Label

  • Title: fix: Prevent deadlock in runComponent when Prepare fails
  • Target: 2.5
  • Labels: kind/bug, size/M, approved, lgtm, ci-passed, dco-passed, do-not-merge/need-merge-master-first

[ERROR] Failed to check PR #45069: script returned exit code 1

Use /refresh-label to update related check and label manually

sre-ci-robot avatar Nov 18 '25 09:11 sre-ci-robot

/refresh-label

weiliu1031 avatar Nov 18 '25 09:11 weiliu1031

[INFO] PR Label Summary by Refresh-Label

  • Title: fix: Prevent deadlock in runComponent when Prepare fails
  • Target: 2.5
  • Labels: kind/bug, size/M, approved, lgtm, ci-passed, dco-passed, do-not-merge/need-merge-master-first

[SUCCESS] PR #45609 merged to master

  • Title: fix: Prevent deadlock in runComponent when Prepare fails
  • Link: https://github.com/milvus-io/milvus/pull/45609

Use /refresh-label to update related check and label manually

sre-ci-robot avatar Nov 18 '25 09:11 sre-ci-robot