FerretDB icon indicating copy to clipboard operation
FerretDB copied to clipboard

Fix capped collection cleanup

Open wazir-ahmed opened this issue 5 months ago • 1 comments

Description

This PR fixes two minor issues with cleanupCappedCollection()

  1. When a capped collection is configured only with size, without max option, cleanup function deletes documents from the collection even without size threshold getting reached.
  2. With default CappedCleanupPercentage set to 10, if we insert less than 10 documents in a capped collection, all of the documents gets deleted during cleanup cycle.

Readiness checklist

  • [ ] I added/updated unit tests (and they pass).
  • [ ] I added/updated integration/compatibility tests (and they pass).
  • [ ] I added/updated comments and checked rendering.
  • [x] I made spot refactorings.
  • [ ] I updated user documentation.
  • [x] I ran task all, and it passed.
  • [ ] I ensured that PR title is good enough for the changelog.
  • [ ] (for maintainers only) I set Reviewers (@FerretDB/core), Milestone (Next), Labels, Project and project's Sprint fields.
  • [ ] I marked all done items in this checklist.

wazir-ahmed avatar Feb 23 '24 15:02 wazir-ahmed

Codecov Report

Attention: Patch coverage is 82.35294% with 12 lines in your changes are missing coverage. Please review.

Project coverage is 73.94%. Comparing base (419f9a0) to head (9c21d53).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4118      +/-   ##
==========================================
- Coverage   75.01%   73.94%   -1.08%     
==========================================
  Files         339      339              
  Lines       22827    22862      +35     
==========================================
- Hits        17124    16905     -219     
- Misses       4410     4675     +265     
+ Partials     1293     1282      -11     
Files Coverage Δ
integration/setup/listener.go 70.74% <100.00%> (-9.39%) :arrow_down:
integration/setup/setup.go 84.74% <100.00%> (+1.12%) :arrow_up:
integration/setup/setup_compat.go 87.50% <100.00%> (-5.69%) :arrow_down:
internal/handler/handler.go 50.86% <78.57%> (+7.41%) :arrow_up:

... and 27 files with indirect coverage changes

Flag Coverage Δ
filter-false ?
filter-true 67.42% <82.35%> (-1.44%) :arrow_down:
hana-1 ?
integration 67.42% <82.35%> (-1.48%) :arrow_down:
mongodb-1 5.13% <5.88%> (+<0.01%) :arrow_up:
mysql-1 ?
mysql-2 ?
mysql-3 ?
postgresql-1 46.54% <17.64%> (-0.14%) :arrow_down:
postgresql-2 49.71% <82.35%> (+0.03%) :arrow_up:
postgresql-3 49.83% <17.64%> (-0.19%) :arrow_down:
sqlite-1 45.64% <17.64%> (-0.16%) :arrow_down:
sqlite-2 48.73% <82.35%> (-0.11%) :arrow_down:
sqlite-3 49.01% <17.64%> (-0.15%) :arrow_down:
unit 32.51% <0.00%> (-0.05%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

codecov[bot] avatar Feb 23 '24 15:02 codecov[bot]

@chilagrow I'm not sure what would be the right way to verify this.

  • The problem is, the default cleanup interval is 1min (configurable through CLI). Because of this, we cannot add an integration test right away, the test case has to wait for 1min before verifying the behavior.
  • To override this, we can introduce new fields in setup.SetupOpts to configure the cleanup percentage and interval. We can reduce the interval to 1sec (or 5 second) and test the cleanup behavior.

Let me know your thoughts.

wazir-ahmed avatar Feb 27 '24 14:02 wazir-ahmed

@chilagrow I'm not sure what would be the right way to verify this.

  • The problem is, the default cleanup interval is 1min (configurable through CLI). Because of this, we cannot add an integration test right away, the test case has to wait for 1min before verifying the behavior.
  • To override this, we can introduce new fields in setup.SetupOpts to configure the cleanup percentage and interval. We can reduce the interval to 1sec (or 5 second) and test the cleanup behavior.

Let me know your thoughts.

@wazir-ahmed I see your point. I think you've identified bugs that we were not able to see which is awesome. I believe we couldn't identify it due to not being able to test the logic. How about extracting the logic of cleanup count to a function something such as getCleanupCount(cappedCleanupPercentage, cInfo, statsBefore) int64 and assign returned value to the limit if that's not zero? That function can be tested with varying values of cInfo and statsBefore :hugs:

chilagrow avatar Feb 28 '24 01:02 chilagrow

@rumyantseva @chilagrow Hey, I didn't add a new unit test as suggested. Instead, I have used existing integration test TestCommandsAdministrationCompactCapped and added additional test case.

Also, I have refactored cleanupCappedCollection to make the feature more close to MongoDB w.r.t max document parameter. Previously, we were dropping 10% of the documents if document count exceeds. Imagine, the max doc configured is 1M and user inserts the 1M+1th document, we would drop 100k documents, instead of dropping 1 document. The new logic addressed this issue and only drops the surplus documents when document count exceeds.

wazir-ahmed avatar Mar 14 '24 14:03 wazir-ahmed