
Background Job Manager (BJM) - replacement for BIO

Open JimB123 opened this issue 1 year ago • 1 comments

The BIO module is a source of minor annoyance. Background Job Manager (BJM) is designed as a modular replacement for BIO. Assuming there is a positive response to this PR, it will be extended to convert existing BIO usage to BJM and remove BIO.

Motivations:

  • BIO (Background I/O) is a misnomer. I'd guess that this was originally a mechanism to perform an async flush on the AOF file, but it has since morphed into a mechanism for performing other kinds of background activities - like lazy flush - which have nothing to do with I/O
  • BIO isn't well layered. BIO should be a low-level utility for handling async background tasks. However, the design of this utility requires the "low-level" code to have specific knowledge of the "high-level" application.
  • BIO isn't very modular. For each new type of background task, the BIO code must be altered. A new thread is created for each type of task. New definitions and logic are added for each new task.
  • Improved performance. BIO creates multiple threads, but doesn't use them effectively. There is exactly 1 thread for each type of task. During a burst of lazy evictions, only the (single) eviction thread will be active while the other BIO threads sit idle. During a lazy flush operation, the 2 dictionaries (main and expire) are processed sequentially rather than in parallel.
  • Reduced memory usage. BIO uses a list to maintain the queue of jobs. The new Fifo uses much less memory (and is more performant) if there's a large burst of jobs.
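The memory argument can be made concrete with a small sketch. This is not the code in src/fifo.c; it is a minimal block-based FIFO under assumed names (`Fifo`, `fifoPush`, `fifoPop`) and an assumed block size of 7 items. On a typical 64-bit platform, a doubly linked list node holding prev, next, and value pointers costs 24 bytes per item before allocator overhead, whereas a block of 7 item pointers plus one next pointer costs 64 bytes, or roughly 9 bytes per item, with one allocation per 7 items instead of one per item.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch of a block-based FIFO; names and block size are
 * assumptions for illustration, not the PR's src/fifo.c. */
#define ITEMS_PER_BLOCK 7

typedef struct FifoBlock {
    void *items[ITEMS_PER_BLOCK];
    struct FifoBlock *next;
} FifoBlock;

typedef struct {
    FifoBlock *head;  /* block holding the oldest item */
    FifoBlock *tail;  /* block receiving new items */
    size_t head_idx;  /* next slot to pop from in head */
    size_t tail_idx;  /* next free slot in tail */
    size_t length;    /* total queued items */
} Fifo;

void fifoInit(Fifo *q) {
    q->head = q->tail = NULL;
    q->head_idx = q->tail_idx = 0;
    q->length = 0;
}

void fifoPush(Fifo *q, void *item) {
    /* Allocate a new block only when the tail block is full or missing. */
    if (q->tail == NULL || q->tail_idx == ITEMS_PER_BLOCK) {
        FifoBlock *b = malloc(sizeof(*b));
        assert(b != NULL);
        b->next = NULL;
        if (q->tail) q->tail->next = b; else q->head = b;
        q->tail = b;
        q->tail_idx = 0;
    }
    q->tail->items[q->tail_idx++] = item;
    q->length++;
}

/* Returns the oldest item, or NULL if the queue is empty. */
void *fifoPop(Fifo *q) {
    if (q->length == 0) return NULL;
    void *item = q->head->items[q->head_idx++];
    q->length--;
    if (q->length == 0) {
        /* Queue drained: head and tail are the same block here. */
        free(q->head);
        q->head = q->tail = NULL;
        q->head_idx = q->tail_idx = 0;
    } else if (q->head_idx == ITEMS_PER_BLOCK) {
        /* Head block exhausted: advance to the next block. */
        FifoBlock *old = q->head;
        q->head = old->next;
        free(old);
        q->head_idx = 0;
    }
    return item;
}
```

The block size is a tuning choice: larger blocks amortize the next pointer and the allocation over more items, at the cost of more slack when the queue is nearly empty.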

PR Contents:

  • Fifo - a simple FIFO queue that is over 50% more space efficient and over 50% faster than a Redis list. It's delivered as an independent, modular, and reusable utility data structure.
  • BJM provides a simple fire-and-forget interface to have something done by a background thread. It maintains a fixed/configurable quantity of threads (rather than one per "job type"). This prevents having too many active background threads and also allows faster processing for a large number of jobs of the same job type.
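To illustrate the fire-and-forget model with a fixed number of threads, here is a minimal sketch using POSIX threads. It is not the PR's BJM API: `poolStart`, `poolSubmit`, `poolStop`, `job_fn`, the thread count of 4, and the demo job `countingJob` are all assumptions made for this example. The point it shows is that any worker can pick up any job, so a burst of same-type jobs is spread across all threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical sketch of a fixed-size worker pool; all names are
 * illustrative and not taken from the PR's src/bjm.c. */
typedef void (*job_fn)(void *privdata);

typedef struct Job {
    job_fn fn;
    void *privdata;
    struct Job *next;
} Job;

#define POOL_THREADS 4 /* assumed thread count */

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pool_wake = PTHREAD_COND_INITIALIZER;
static Job *queue_head, *queue_tail;
static pthread_t workers[POOL_THREADS];
static int started;       /* number of workers actually created */
static int shutting_down;

static void *workerMain(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&pool_lock);
        while (queue_head == NULL && !shutting_down)
            pthread_cond_wait(&pool_wake, &pool_lock);
        if (queue_head == NULL) { /* shutting down and queue drained */
            pthread_mutex_unlock(&pool_lock);
            return NULL;
        }
        Job *job = queue_head;
        queue_head = job->next;
        if (queue_head == NULL) queue_tail = NULL;
        pthread_mutex_unlock(&pool_lock);
        job->fn(job->privdata); /* run outside the lock */
        free(job);
    }
}

int poolStart(void) {
    for (started = 0; started < POOL_THREADS; started++)
        if (pthread_create(&workers[started], NULL, workerMain, NULL) != 0)
            return -1;
    return 0;
}

/* Fire-and-forget: the caller gets no handle to the queued job. */
int poolSubmit(job_fn fn, void *privdata) {
    assert(fn != NULL);
    Job *job = malloc(sizeof(*job));
    if (job == NULL) return -1;
    job->fn = fn;
    job->privdata = privdata;
    job->next = NULL;
    pthread_mutex_lock(&pool_lock);
    if (queue_tail) queue_tail->next = job; else queue_head = job;
    queue_tail = job;
    pthread_cond_signal(&pool_wake);
    pthread_mutex_unlock(&pool_lock);
    return 0;
}

/* Lets workers drain the remaining jobs, then joins them. */
void poolStop(void) {
    pthread_mutex_lock(&pool_lock);
    shutting_down = 1;
    pthread_cond_broadcast(&pool_wake);
    pthread_mutex_unlock(&pool_lock);
    for (int i = 0; i < started; i++) pthread_join(workers[i], NULL);
}

/* Demo job: atomically bumps a counter passed as privdata. */
void countingJob(void *privdata) {
    atomic_fetch_add((atomic_int *)privdata, 1);
}
```

Because jobs of one type may now run concurrently and out of submission order, callers that relied on per-type serialization would need an explicit ordering mechanism.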

Originally proposed to Redis and stalled here: https://github.com/redis/redis/pull/12029

Note especially that in the original proposal, there was some concern about WAITAOF which currently relies on BIO to serialize 2 independent background jobs. There were 3 possible solutions mentioned:

  • Retain BIO for this case
  • Modify the WAITAOF code so that it enqueues a single sequenced job rather than 2 jobs
  • Enhance the proposed BJM mechanism with "futures" to provide a mechanism for synchronization
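The second option, enqueuing one sequenced job instead of two, can be sketched as a wrapper that runs two steps back to back on whichever worker picks it up. This is a generic illustration only: `SequencedJob`, `sequencedJobCreate`, `sequencedJobRun`, and the demo helper `recordStep` are hypothetical names, and the two steps stand in for whatever work WAITAOF actually needs ordered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch; not code from the PR or from WAITAOF. */
typedef void (*job_fn)(void *privdata);

typedef struct {
    job_fn first;
    void *first_data;
    job_fn second;
    void *second_data;
} SequencedJob;

/* Bundles two steps into one unit of work. Returns NULL on allocation failure. */
SequencedJob *sequencedJobCreate(job_fn first, void *first_data,
                                 job_fn second, void *second_data) {
    assert(first != NULL && second != NULL);
    SequencedJob *seq = malloc(sizeof(*seq));
    if (seq == NULL) return NULL;
    seq->first = first;
    seq->first_data = first_data;
    seq->second = second;
    seq->second_data = second_data;
    return seq;
}

/* Has the job_fn signature, so it can be submitted as a single job.
 * Ordering holds because both steps run on the same thread. */
void sequencedJobRun(void *privdata) {
    SequencedJob *seq = privdata;
    seq->first(seq->first_data);
    seq->second(seq->second_data);
    free(seq);
}

/* Demo scaffolding: each step appends its tag to a shared trace. */
typedef struct { char buf[8]; size_t len; } Trace;
typedef struct { Trace *trace; char tag; } StepArg;

void recordStep(void *privdata) {
    StepArg *arg = privdata;
    if (arg->trace->len < sizeof(arg->trace->buf))
        arg->trace->buf[arg->trace->len++] = arg->tag;
}
```

This keeps the job manager itself free of ordering logic, at the cost of holding one worker for the duration of both steps; the "futures" option would instead put the synchronization inside the job manager.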

JimB123, Apr 23 '24 18:04

Codecov Report

Attention: Patch coverage is 0%, with 207 lines in your changes missing coverage. Please review.

Project coverage is 68.11%. Comparing base (393c8fd) to head (2e9a5ee). Report is 2 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #356      +/-   ##
============================================
- Coverage     68.39%   68.11%   -0.28%     
============================================
  Files           108      110       +2     
  Lines         61562    61769     +207     
============================================
- Hits          42107    42076      -31     
- Misses        19455    19693     +238     
Files        Coverage Δ
src/fifo.c   0.00% <0.00%> (ø)
src/bjm.c    0.00% <0.00%> (ø)

... and 13 files with indirect coverage changes

codecov[bot], Apr 23 '24 21:04