Thread based concurrency

Open bestie opened this issue 2 years ago • 3 comments

Concurrently runs a number of 'Racecar::Runner' instances in a fixed-size thread pool.

Each thread starts a single Racecar::Runner with its own instance of the Racecar::Consumer class. All threads run the same consumer class, share the same config and consume partitions from the same topic(s).

ThreadPoolRunnerProxy can be combined with ParallelRunner to run both forks and threads. ParallelRunner is not used or battle-tested (at Zendesk), so this combination is not yet recommended.

Racecar does not yet implement a health-check mechanism, so an uncaught error in a single worker thread causes the process to start a graceful shutdown of all threads before exiting with an error.
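The shutdown behaviour described above can be sketched roughly as follows. This is not the code from this PR: `FakeRunner` and `ThreadPoolSketch` are hypothetical stand-ins that assume only a runner-like object responding to `#run`, `#stop` and `#running?`, and no Racecar classes are used.

```ruby
# Hypothetical stand-in for a runner; it only loops until asked to stop.
class FakeRunner
  def initialize(fail_after: nil)
    @fail_after = fail_after
    @running = false
    @stop_requested = false
    @iterations = 0
  end

  def run
    @running = true
    until @stop_requested
      @iterations += 1
      raise "boom" if @fail_after && @iterations >= @fail_after
      sleep 0.01 # stands in for polling and processing a batch
    end
  ensure
    @running = false
  end

  def stop
    @stop_requested = true
  end

  def running?
    @running
  end
end

# Runs one runner per thread. The first uncaught error asks every runner to
# stop, waits for all threads to finish, then re-raises the error.
class ThreadPoolSketch
  def initialize(runners)
    @runners = runners
    @errors = Queue.new
  end

  def run
    threads = @runners.map do |runner|
      Thread.new do
        begin
          runner.run
        rescue StandardError => e
          @errors << e
          stop # graceful: siblings finish their current iteration first
        end
      end
    end
    threads.each(&:join)
    raise @errors.pop unless @errors.empty?
  end

  def stop
    @runners.each(&:stop)
  end

  def running?
    @runners.any?(&:running?)
  end
end
```

Calling `ThreadPoolSketch.new([FakeRunner.new, FakeRunner.new(fail_after: 3)]).run` would let the healthy runner wind down before the error from the failing one propagates to the caller, which could then exit with a non-zero status.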

Other inclusions:

  • Signal handling has been moved up one level to the CLI
  • Runner-like object interface standardized to #run, #stop and #running?
  • Tests can be run locally without Docker by exporting LOCAL=1
  • Some test bugs have been fixed: connections now always close, and orphaned processes raise an exception
  • Suggest that 'parallel' be deprecated in favor of fork/forking next to thread/threaded for ease of understanding
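As a rough illustration of the first two points, here is a minimal sketch of a runner-like object and of signal handling kept at the caller's level. `LoopRunner` and `run_with_signal_handling` are hypothetical names chosen for this example, not identifiers from Racecar or from this PR.

```ruby
# Hypothetical runner-like object: anything with #run, #stop and #running?.
class LoopRunner
  def initialize(&work)
    @work = work
    @running = false
    @stop_requested = false
  end

  def run
    @running = true
    @work.call until @stop_requested
  ensure
    @running = false
  end

  # Only flips a flag, so it is safe to call from a signal handler.
  def stop
    @stop_requested = true
  end

  def running?
    @running
  end
end

# CLI-level helper: the runner itself never touches signals. Previous
# handlers are restored once the runner has finished.
def run_with_signal_handling(runner, signals: %w[INT TERM])
  previous = signals.map { |sig| [sig, Signal.trap(sig) { runner.stop }] }.to_h
  runner.run
ensure
  (previous || {}).each { |sig, handler| Signal.trap(sig, handler || "DEFAULT") }
end
```

Because the helper relies only on the three-method interface, the same call would work for a single runner, a thread pool or a forking runner, as long as each one responds to `#run`, `#stop` and `#running?`.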

bestie, Jan 25 '23 21:01

My pleasure!

That sounds a little more complex but totally doable. This is very much MVP concurrency, changing as little existing code as possible.

Unless someone already has that work in progress, I can give it a try.

Is this enough of an improvement to consider merging? If so, I think we could iterate towards your proposed design without breaking the API. It should be marked experimental anyway.

bestie, Feb 01 '23 13:02

I'm not sure it provides enough value as it stands – with copy-on-write, the forking approach seems better if we're using separate consumer instances anyway.

I think the shared-main-loop approach should be pretty doable; if you look into Runner, you can see how batches are being processed sequentially – we would change that to an async model, with a fixed partition number -> worker thread mapping probably.

dasch, Feb 01 '23 13:02
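For reference, a fixed partition number -> worker thread mapping of the kind suggested above might look something like this. It is only a sketch under assumptions: `PartitionDispatcher` is a hypothetical class, the modulo mapping is one possible choice, and nothing here reflects how Runner is actually structured.

```ruby
# Hypothetical dispatcher: a single main loop hands messages to a fixed set
# of worker threads, each with its own queue.
class PartitionDispatcher
  STOP = Object.new

  def initialize(worker_count, &handler)
    @queues = Array.new(worker_count) { Queue.new }
    @threads = @queues.map do |queue|
      Thread.new do
        # Each worker drains its own queue until it sees the STOP marker.
        until (item = queue.pop).equal?(STOP)
          handler.call(*item)
        end
      end
    end
  end

  # The same partition always maps to the same worker, so messages from one
  # partition are handled in the order they were dispatched.
  def worker_index_for(partition)
    partition % @queues.size
  end

  def dispatch(partition, message)
    @queues[worker_index_for(partition)] << [partition, message]
  end

  # Lets every worker finish its queued messages, then joins the threads.
  def shutdown
    @queues.each { |queue| queue << STOP }
    @threads.each(&:join)
  end
end
```

The main loop would call `dispatch(partition, message)` for each fetched message and `shutdown` when stopping. Ordering within a partition is kept because a partition never moves between workers, while different partitions can be processed concurrently.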

79sthj

bestie, Feb 03 '23 09:02