parallelly

detectThreadsPerCore(): infer how many threads per core the machine's CPU provides

Open HenrikBengtsson opened this issue 3 years ago • 0 comments

Wish

For example, for a machine with:

$ lscpu | grep -iF "core"
Thread(s) per core:  2
Core(s) per socket:  4
Model name:          Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz

we should get:

> detectThreadsPerCore()
[1] 2

Action

  • [ ] Figure out how to do this reliably on all operating systems and Unix distributions.
  • [ ] If it cannot be inferred, return a default value, which could be either 1L ("always works") or NA_integer_ ("unknown"). Alternatively, there could be a mustWork argument or similar.
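
As a starting point for the first item, here is a minimal, Linux-only sketch. It assumes the lscpu utility is on the PATH and prints an English "Thread(s) per core" line; parse_threads_per_core() is a hypothetical helper name, and the default argument is only one of the fallback options discussed above. Other operating systems would need their own detection logic, which is not attempted here.

```r
# Hypothetical helper (not part of parallelly): extract the value of the
# "Thread(s) per core" field from lines of 'lscpu' output.
parse_threads_per_core <- function(lines, default = NA_integer_) {
  # Allow leading whitespace, since some lscpu versions indent this field
  pattern <- "^[[:space:]]*Thread\\(s\\) per core:[[:space:]]*([0-9]+).*$"
  hit <- grep(pattern, lines, value = TRUE)
  if (length(hit) == 0L) return(default)
  as.integer(sub(pattern, "\\1", hit[1L]))
}

# Sketch of the proposed function; returns 'default' when nothing can be inferred
detectThreadsPerCore <- function(default = NA_integer_) {
  # Assumption: only systems that provide 'lscpu' are handled
  if (!nzchar(Sys.which("lscpu"))) return(default)
  lines <- tryCatch(
    # LC_ALL=C requests untranslated field names
    suppressWarnings(
      system2("lscpu", stdout = TRUE, stderr = FALSE, env = "LC_ALL=C")
    ),
    error = function(e) character(0L)
  )
  parse_threads_per_core(lines, default = default)
}
```

Keeping the parsing separate from the system call makes it possible to test the parser against captured lscpu output from different distributions without access to those machines.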

Background

This will be useful when limiting multi-threading in nested parallelization. For example, in the future package we try to force single-processing by setting various environment variables and R options in parallel workers, e.g. MC_CORES=1. Currently, we do not limit the number of parallel threads.

The simplest approach would be to force a single thread, but on modern systems it would be more efficient to limit it to detectThreadsPerCore() threads, e.g.

RhpcBLASctl::blas_set_num_threads(threads = detectThreadsPerCore())
RcppParallel::setThreadOptions(numThreads = detectThreadsPerCore())
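
Since the thread count may be unknown (NA_integer_) and the two packages may not be installed on every worker, the calls above could be wrapped defensively. The following is only a sketch: limit_worker_threads() is a hypothetical name, and falling back to a single thread when the count is unknown is an assumption, not a decided behavior.

```r
# Hypothetical wrapper: cap BLAS and RcppParallel threads on a worker.
# 'n' would typically come from detectThreadsPerCore().
limit_worker_threads <- function(n = NA_integer_) {
  # Assumed fallback: use a single thread when the count is unknown or invalid
  n <- if (is.na(n) || n < 1L) 1L else as.integer(n)
  # Only touch packages that are actually installed
  if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    RhpcBLASctl::blas_set_num_threads(threads = n)
  }
  if (requireNamespace("RcppParallel", quietly = TRUE)) {
    RcppParallel::setThreadOptions(numThreads = n)
  }
  invisible(n)
}
```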

HenrikBengtsson, Apr 04 '21 17:04