fix: assertion failure in `JobQueue::stop`

Open a1q123456 opened this issue 4 months ago • 3 comments

High Level Overview of Change

We're seeing an assertion failure in Antithesis in `JobQueue::stop` at line 316. It happens because `m_processCount` is 0 while a coroutine is still suspended (either by the first `yield()` before it reaches the user function, or by a `yield()` in the user function), so `nSuspend_` is 1. The assertion failure indicates that coroutines in rippled do not have threads to resume and exit cleanly when the job queue is getting closed.

This PR makes `JobQueue` resume all suspended coroutines so that each coroutine can check whether it should exit after waking up.
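The idea can be illustrated with a minimal, single-threaded sketch. This is not the code in this PR and does not use rippled's real `JobQueue` or `Coro` classes. `ToyJobQueue`, `ToyCoro`, `postSuspendedCoro`, and `suspendedCount` are hypothetical names invented for illustration, and the coroutine is modelled as a plain object rather than a real stackful coroutine. It only shows the shape of the fix: on stop, wake every suspended coroutine so it can see the stop flag and finish, after which the "nothing is suspended" check holds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a coroutine parked in yield(). It is not
// rippled's JobQueue::Coro; it only models "wake up, check, exit".
struct ToyCoro
{
    bool finished = false;

    // Resume the coroutine. If the queue is stopping, it exits cleanly;
    // otherwise it simply stays suspended in this toy model.
    void
    resume(bool queueStopping)
    {
        if (queueStopping)
            finished = true;
    }
};

// Hypothetical, single-threaded model of the job queue's bookkeeping.
class ToyJobQueue
{
    std::vector<std::shared_ptr<ToyCoro>> suspended_;
    bool stopping_ = false;

public:
    // Create a coroutine that starts out suspended, mirroring the first
    // yield() that happens before the user function runs.
    std::shared_ptr<ToyCoro>
    postSuspendedCoro()
    {
        auto coro = std::make_shared<ToyCoro>();
        suspended_.push_back(coro);
        return coro;
    }

    // Number of coroutines that are still suspended.
    std::size_t
    suspendedCount() const
    {
        return suspended_.size();
    }

    // Resume every suspended coroutine so it can observe the stop flag,
    // drop the ones that finished, then check that none is left suspended.
    void
    stop()
    {
        stopping_ = true;
        for (auto const& coro : suspended_)
            coro->resume(stopping_);

        suspended_.erase(
            std::remove_if(
                suspended_.begin(),
                suspended_.end(),
                [](std::shared_ptr<ToyCoro> const& c) { return c->finished; }),
            suspended_.end());

        assert(suspended_.empty());
    }
};
```

Without the resume loop in `stop()`, the final check would fail whenever a coroutine is still parked, which is the situation described above.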

Context of Change

Type of Change

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] Refactor (non-breaking change that only restructures code)
  • [ ] Performance (increase or change in throughput and/or latency)
  • [x] Tests (you added tests for code that already exists, or your new feature included in this PR)
  • [ ] Documentation update
  • [ ] Chore (no impact to binary, e.g. .gitignore, formatting, dropping support for older tooling)
  • [ ] Release

API Impact

  • [ ] Public API: New feature (new methods and/or new fields)
  • [ ] Public API: Breaking change (in general, breaking changes should only impact the next api_version)
  • [ ] libxrpl change (any change that may affect libxrpl or dependents of libxrpl)
  • [ ] Peer protocol change (must be backward compatible or bump the peer protocol version)

a1q123456 avatar Sep 08 '25 13:09 a1q123456

Codecov Report

:x: Patch coverage is 85.45455% with 8 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 78.6%. Comparing base (2bf77cc) to head (75e402a).
:warning: Report is 88 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/xrpld/core/Coro.ipp | 82.6% | 4 Missing :warning: |
| src/xrpld/rpc/handlers/RipplePathFind.cpp | 55.6% | 4 Missing :warning: |

Additional details and impacted files

@@            Coverage Diff            @@
##           develop   #5774     +/-   ##
=========================================
- Coverage     79.5%   78.6%   -0.9%     
=========================================
  Files          817     818      +1     
  Lines        72198   68983   -3215     
  Branches      8293    8245     -48     
=========================================
- Hits         57392   54189   -3203     
+ Misses       14806   14794     -12     
| Files with missing lines | Coverage Δ |
|---|---|
| src/xrpld/core/JobQueue.h | 87.0% <100.0%> (-5.4%) :arrow_down: |
| src/xrpld/core/detail/JobQueue.cpp | 90.8% <100.0%> (-1.6%) :arrow_down: |
| src/xrpld/core/Coro.ipp | 92.8% <82.6%> (-7.2%) :arrow_down: |
| src/xrpld/rpc/handlers/RipplePathFind.cpp | 42.1% <55.6%> (-1.5%) :arrow_down: |

... and 865 files with indirect coverage changes

codecov[bot] avatar Sep 08 '25 15:09 codecov[bot]

It seems that we're getting an error in Antithesis, so I'm converting this PR to draft.

a1q123456 avatar Sep 16 '25 09:09 a1q123456

I didn't understand the argument (from the PR description) that:

The assertion failure indicates that coroutines in rippled do not have threads to resume and exit cleanly when the job queue is getting closed.

How come there are no threads? The Workers object is alive in the JobQueue class. What am I missing here?

pratikmankawde avatar Dec 01 '25 18:12 pratikmankawde