dorado icon indicating copy to clipboard operation
dorado copied to clipboard

default_thread_allocations should not use std::thread::hardware_concurrency()

Open jbd opened this issue 11 months ago • 1 comments

Hello,

In the default_thread_allocations function, C++ hardware_concurrency is used to determine the maximum number of threads.

https://github.com/nanoporetech/dorado/blob/9dc15a8564d0bbe6cff710770b13c14816a2e3a7/dorado/utils/parameters.cpp#L13

std::thread::hardware_concurrency() returns, when possible, the underlying hardware capability to run threads, which might not corresponds to the actual number of cores available to the process (through the use of taskset, batch system like slurm, etc...). The consequence is that the program might run in a non optimal way. For example, if I run taskset -c 1 on my 20 cores machines, it will run 20 workers on only one core.

Consider this code run through the slurm scheduler using srun, requesting 4 cores:

#include <iostream>
#include <sched.h>
#include <thread>  
 
int main (void) {
    cpu_set_t cpu_set;
    int cores = std::thread::hardware_concurrency();
    std::cout << "using std::thread::hardware_concurrency: " << cores << "\n";
    sched_getaffinity(0, sizeof(cpu_set), &cpu_set);
    cores = CPU_COUNT(&cpu_set);
    std::cout << "using sched_getaffinity: " << cores << "\n";
    return 0;
}
$ srun -c 4 ./get_cores
srun: job 13694693 queued and waiting for resources
srun: job 13694693 has been allocated resources
using std::thread::hardware_concurrency: 96
using sched_getaffinity: 4

I think a more sane behavior would be to use sched_getaffinity as in the folly library for example (folly/folly/system/HardwareConcurrency.cpp)

#include <folly/system/HardwareConcurrency.h>

#include <thread>

#include <folly/portability/Sched.h>

namespace folly {

unsigned int hardware_concurrency() noexcept {
#if defined(__linux__) && !defined(__ANDROID__)
  cpu_set_t cpuset;
  if (!sched_getaffinity(0, sizeof(cpuset), &cpuset)) {
    auto count = CPU_COUNT(&cpuset);
    if (count != 0) {
      return count;
    }
  }
#endif

  return std::thread::hardware_concurrency();
}

} // namespace folly

What do you think ?

Thank you for considering this change !

jbd avatar Jan 20 '25 20:01 jbd

Hi @jbd,

Thanks for the info - this certainly looks like a worthwhile change for users who take more active control over their CPU resources. We'll look into it for a future release.

malton-ont avatar Jan 22 '25 10:01 malton-ont