Use thread affinity to exploit per-CPU cache locality

Open harendra-kumar opened this issue 5 years ago • 1 comments

Currently we have shared-nothing concurrency support, where concurrent threads share a global read-only state. However, we could let different threads modify their own copy of the state and combine the changes using a monoid (see #247).
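
To make the monoid idea concrete, here is a minimal sketch using only `base`. It is not streamly's API: `runCombined` is a hypothetical helper, and the worker shape (`r -> IO w`) is an assumption made for illustration. Each thread reads the shared state, produces its own change, and the changes are merged with `mconcat`.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.Monoid (Sum (..))

-- Hypothetical helper (not part of streamly): run every worker in its own
-- thread against the same read-only state and merge the results monoidally.
-- No error handling: if a worker throws, the matching takeMVar never returns.
runCombined :: Monoid w => r -> [r -> IO w] -> IO w
runCombined shared workers = do
    vars <- mapM spawn workers
    mconcat <$> mapM takeMVar vars
  where
    spawn worker = do
        var <- newEmptyMVar
        _ <- forkIO (worker shared >>= putMVar var)
        return var

main :: IO ()
main = do
    -- Four workers, each contributing (shared + i) to a Sum monoid.
    total <- runCombined (10 :: Int) [\base -> return (Sum (base + i)) | i <- [1 .. 4]]
    print (getSum total) -- prints 50
```

Because the threads never write to a common location, no locks are needed; the only synchronization is collecting the results.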

For CPU-bound tasks we need only as many threads as there are CPUs. We can then keep a per-CPU state that is modified without locks by the threads running on that particular CPU. When resuming a thread from cached state, we can select the state by CPU and set the affinity of the thread to that CPU, so that we benefit from cache locality.
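
A rough sketch of per-CPU state with pinned threads, again using only `base`. The closest primitive GHC offers is `forkOn`, which pins a Haskell thread to a capability; a capability is an RTS scheduling unit, not a hardware CPU, so this only approximates CPU affinity (the threaded RTS run with `+RTS -N -qa` asks the OS to pin capabilities to cores). `bumpPerCapability` is a hypothetical name, and the per-slot counter stands in for whatever state would really be cached.

```haskell
import Control.Concurrent (forkOn, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM, replicateM_)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical sketch: one state slot per capability. Only the thread pinned
-- to capability i ever touches slot i, so a plain (non-atomic) modifyIORef'
-- is enough and no locks are taken.
bumpPerCapability :: Int -> IO [Int]
bumpPerCapability iterations = do
    n <- getNumCapabilities
    slots <- replicateM n (newIORef 0)
    dones <- forM (zip [0 ..] slots) $ \(cap, ref) -> do
        done <- newEmptyMVar
        -- forkOn pins the new thread to the given capability.
        _ <- forkOn cap $ do
            replicateM_ iterations (modifyIORef' ref (+ 1))
            putMVar done ()
        return done
    -- Wait for every worker before reading the slots.
    mapM_ takeMVar dones
    mapM readIORef slots

main :: IO ()
main = do
    counts <- bumpPerCapability 1000
    print counts
```

Whether this actually improves cache behaviour is exactly what the benchmarks would have to show; separately allocated `IORef`s may still end up on the same cache line.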

This is an experimental idea; we need to see which use cases it can help and add benchmarks to check whether it actually helps.

harendra-kumar, Jun 20 '20 13:06

We can use bound threads and mutable ST state indexed by the thread generation (a non-reusable, ever-increasing thread ID). We also need to make sure that FFI calls do not interfere with it.
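
One possible reading of this, sketched with `base` only. The `Generation` type, `nextGeneration`, and `runGenerations` are hypothetical names, and an `IORef` is used in place of ST state to keep the example short. Bound threads come from `forkOS`, which needs the threaded RTS, so the sketch falls back to `forkIO` when `rtsSupportsBoundThreads` is `False`.

```haskell
import Control.Concurrent (forkIO, forkOS, rtsSupportsBoundThreads)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM_)
import Data.IORef (IORef, atomicModifyIORef', modifyIORef', newIORef, readIORef)

-- Hypothetical: a generation is handed out once and never reused.
newtype Generation = Generation Int deriving (Eq, Ord, Show)

-- Atomically take the next generation from an ever-increasing counter.
nextGeneration :: IORef Int -> IO Generation
nextGeneration counter = atomicModifyIORef' counter (\n -> (n + 1, Generation n))

-- Start the given number of workers, each tagged with a fresh generation and
-- owning its own mutable state that no other thread touches.
runGenerations :: Int -> IO [(Generation, Int)]
runGenerations workers = do
    counter <- newIORef 0
    -- Bound threads require the threaded RTS (-threaded); otherwise use forkIO.
    let fork = if rtsSupportsBoundThreads then forkOS else forkIO
    outs <- forM [1 .. workers] $ \_ -> do
        gen@(Generation g) <- nextGeneration counter
        out <- newEmptyMVar
        _ <- fork $ do
            local <- newIORef (0 :: Int) -- state owned by this generation only
            replicateM_ (g + 1) (modifyIORef' local (+ 1))
            v <- readIORef local
            putMVar out (gen, v)
        return out
    mapM takeMVar outs

main :: IO ()
main = runGenerations 3 >>= mapM_ print
```

The sketch does not address the FFI concern; it only shows how a non-reusable identifier can key the state so that a recycled `ThreadId` never aliases stale data.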

harendra-kumar, Feb 17 '25 07:02