ex_cpu - hwloc - CPUKind (P vs E) detection
Apple M-series and Intel hybrid processors have both Performance (P) and Efficiency (E) CPU cores. hwloc exposes info about the core kinds: https://hwloc.readthedocs.io/en/stable/group__hwlocality__cpukinds.html
Detect these and create groups for them. Expose a user API to allow the user to specify that tasks should run on a particular kind of core.
Should there be a heuristic to allocate certain tasks to E cores, such as low priority tasks?
Should we avoid scheduling anything on E cores otherwise? Some benchmarking is required to determine at what level of parallelism it is helpful to include E cores in the parallel group.
One possible implementation comes to mind:
If hwloc is present, we can define:
tmc::ex_cpu::set_priority_threshold(size_t both, size_t high);
Then, when queuing a task:
If priority < both, the task can be run only on E-cores.
If priority >= both && priority < high, it can be run on any core.
If priority >= high, it can be run only on a P-core.
The defaults would be both = 0 and high = the maximum value of size_t (std::numeric_limits<size_t>::max()), so that any task can be run on any core (current behaviour).
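To make the threshold rules concrete, here is a minimal sketch of the classification step. Everything in it (CoreAffinity, PriorityThresholds, classify) is a hypothetical name made up for illustration, not part of the existing tmc API, and it assumes that a larger priority value means a more important task, as the rules above imply.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical types for illustration only; not part of tmc.
enum class CoreAffinity { EfficiencyOnly, Any, PerformanceOnly };

struct PriorityThresholds {
  // Defaults from the proposal: 0 and the maximum size_t value.
  std::size_t both = 0;
  std::size_t high = std::numeric_limits<std::size_t>::max();
};

// Maps a task priority to the kind of core it may run on.
// Assumes a larger value means a more important task.
inline CoreAffinity classify(std::size_t priority, PriorityThresholds t) {
  assert(t.both <= t.high); // the "any core" band must not be inverted
  if (priority < t.both) {
    return CoreAffinity::EfficiencyOnly; // priority < both
  }
  if (priority < t.high) {
    return CoreAffinity::Any; // both <= priority < high
  }
  return CoreAffinity::PerformanceOnly; // priority >= high
}
```

With thresholds of, say, both = 10 and high = 100, a priority of 5 would be restricted to E-cores, 10 through 99 could run anywhere, and 100 or above would be restricted to P-cores. One edge case of the defaults: a priority exactly equal to the maximum size_t value satisfies priority >= high and would be classified as P-core only, so "any task on any core" holds for every priority except that one value.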