Add Redis Streams option for job delivery

Open ajac-zero opened this issue 1 year ago • 8 comments

This pull request adds a basic Redis Streams implementation for job delivery, so that workers no longer need to poll for new jobs and latency is reduced, in line with objective 4 of issue #437.

To create a worker that listens to a Redis Stream, either pass the flag on the CLI or set it in the worker settings class.

CLI: arq worker.WorkerSettings --stream

Code:

class WorkerSettings:
    functions = [...]
    stream = True
    ...

On the client side, the caller must opt in to delivering a job to a worker through a Redis Stream:

from arq import create_pool
from arq.connections import RedisSettings
redis = await create_pool(RedisSettings())
await redis.enqueue_job('hello_world', _use_stream=True)

Here are the results of a very simple benchmark that showcases the potential of using Redis Streams for improved latency.

Polling: [screenshot, 2024-05-04 8:12:28 a.m.] Average time: 0.268s

Streaming: [screenshot, 2024-05-04 8:15:34 a.m.] Average time: 0.012s
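
As a rough illustration of where a gap like this can come from, here is a minimal, Redis-free sketch using only asyncio. A polling worker only notices a job at its next wake-up, while a worker blocked on a queue (standing in for a blocking stream read) wakes as soon as the job is added. The function names, the 0.2 s poll interval, and the asyncio.Queue stand-in are all assumptions made for the demo; this is not the code in this PR and not the benchmark that produced the averages above.

```python
import asyncio
import time


async def polling_worker(jobs: list, poll_delay: float) -> float:
    """Check the job list every `poll_delay` seconds; return the pickup time."""
    while not jobs:
        await asyncio.sleep(poll_delay)
    jobs.pop()
    return time.perf_counter()


async def blocking_worker(queue: asyncio.Queue) -> float:
    """Wait on the queue and wake as soon as a job arrives; return the pickup time."""
    await queue.get()
    return time.perf_counter()


async def measure(poll_delay: float = 0.2, enqueue_after: float = 0.05) -> tuple:
    """Return (polling_latency, blocking_latency) in seconds for a single job."""
    jobs: list = []
    queue: asyncio.Queue = asyncio.Queue()
    poller = asyncio.create_task(polling_worker(jobs, poll_delay))
    blocker = asyncio.create_task(blocking_worker(queue))

    # The job arrives part-way through the poller's sleep interval.
    await asyncio.sleep(enqueue_after)
    enqueued_at = time.perf_counter()
    jobs.append("hello_world")
    queue.put_nowait("hello_world")

    picked_by_poller, picked_by_blocker = await asyncio.gather(poller, blocker)
    return picked_by_poller - enqueued_at, picked_by_blocker - enqueued_at


if __name__ == "__main__":
    polling_latency, blocking_latency = asyncio.run(measure())
    print(f"polling:  {polling_latency:.3f}s")
    print(f"blocking: {blocking_latency:.3f}s")
```

In this toy setup the polling latency is bounded below by whatever remains of the poll interval when the job arrives, while the blocking latency is only the event-loop wake-up time; exact values depend on the machine.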

ajac-zero, May 04 '24 15:05

Codecov Report

Attention: Patch coverage is 82.92683% with 7 lines in your changes missing coverage. Please review.

Project coverage is 95.93%. Comparing base (94cd878) to head (5747d48). Report is 11 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #451      +/-   ##
==========================================
- Coverage   96.27%   95.93%   -0.35%     
==========================================
  Files          11       11              
  Lines        1074     1107      +33     
  Branches      209      199      -10     
==========================================
+ Hits         1034     1062      +28     
- Misses         19       23       +4     
- Partials       21       22       +1     
Files                Coverage Δ
arq/connections.py   90.06% <100.00%> (-0.01%) ⬇️
arq/constants.py     100.00% <100.00%> (ø)
arq/cli.py           96.49% <60.00%> (-3.51%) ⬇️
arq/worker.py        96.50% <83.33%> (-0.67%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 1315583...5747d48.

codecov[bot], May 04 '24 15:05

@ajac-zero This may need a unit-test with stream enabled.

gaby, May 07 '24 02:05

Sure thing @gaby. I was wondering how I should go about that...

What I started doing was adding a stream parameter to the worker tests and then wrapping them so they run twice, once with stream and once without.

But I feel this might not be the best way to do things. Maybe I should focus on a few vital tests instead? What do you suggest?

ajac-zero, May 08 '24 03:05

@ajac-zero That's probably a great starting point: run the current tests with "stream" set to False, then run the test suite again with "stream" set to True. This will require setting the source of the data to use Streams.

gaby, May 15 '24 13:05

@gaby I finally got around to writing the unit tests. I added a new stream_worker pytest fixture and used pytest parametrize to run the whole worker test suite twice, first with polling and then with streaming. All passing 😸.
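
For readers who want the gist of the run-everything-twice approach without the arq test suite, the sketch below repeats one test body for both delivery modes. It swaps in the standard library's unittest.subTest for pytest's parametrize so that it runs with no dependencies, and FakeWorker is a hypothetical stand-in, not arq's Worker or the stream_worker fixture from this PR.

```python
import unittest


class FakeWorker:
    """Hypothetical stand-in for a worker; not arq's Worker class."""

    def __init__(self, stream: bool) -> None:
        self.stream = stream
        self.completed: list = []

    def run_job(self, name: str) -> str:
        # A real worker would fetch the job by polling or from a stream;
        # the observable result should be the same in both modes.
        self.completed.append(name)
        return f"{name}:done"


class WorkerModesTest(unittest.TestCase):
    def test_job_completes_in_both_modes(self) -> None:
        # Run the same assertions once per delivery mode.
        for stream in (False, True):
            with self.subTest(stream=stream):
                worker = FakeWorker(stream=stream)
                self.assertEqual(worker.run_job("hello_world"), "hello_world:done")
                self.assertEqual(worker.completed, ["hello_world"])


# Load and run the test case programmatically so the script does not exit early.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(WorkerModesTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With pytest, the loop over the two modes would typically be replaced by a parametrized fixture or a parametrize marker, so each mode is reported as its own test.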

ajac-zero, Jun 13 '24 02:06

This looks like great work, but I wonder if it will add complexity to the migration/rewrite required by #437.

Given the size of the change required by #437, I think I'll work on a clean-room rewrite, then add compatibility shims for the existing methods — this might make that work more complicated.

samuelcolvin, Jun 14 '24 12:06

@samuelcolvin Thanks! Yes, it might be better to wait and build on top of the new version.

I'm really looking forward to the new changes. The DAG and type safety sound awesome. Lmk if I can contribute in some way.

ajac-zero, Jun 16 '24 01:06