Implement use of pthread affinity functions

Open fgvanzee opened this issue 9 years ago • 4 comments

Read an environment variable, say, BLIS_CPU_AFFINITY, and use its contents to call pthread_setaffinity_np() to set the threads' affinity masks.

Ideally, the same environment variable would control OpenMP thread affinity, in the event that BLIS is configured with OpenMP instead of pthreads, but there may be implementation realities that make this infeasible. Reader: please consider this issue a request for comment.
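A minimal sketch of the pthreads half of this proposal, for illustration only. It assumes BLIS_CPU_AFFINITY holds a comma-separated list of CPU ids (for example `0,2,4,6`) and that thread i is pinned to entry i mod n of that list. The issue does not specify either the format or the policy, and the helper names `parse_cpu_list` and `pin_self_from_env` are hypothetical, not part of BLIS. `pthread_setaffinity_np()` is a GNU extension, so this applies to Linux/glibc only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical helper: parse a comma-separated CPU list such as "0,2,4,6"
 * into ids[]. Returns the number of ids parsed (0 if s is NULL or empty),
 * or -1 if the string is malformed or holds more than max_ids entries. */
static int parse_cpu_list(const char *s, int *ids, int max_ids)
{
    int n = 0;

    if (s == NULL || *s == '\0')
        return 0;

    while (*s != '\0') {
        char *end;
        long v = strtol(s, &end, 10);

        /* Reject non-numeric fields and ids that cannot fit in a cpu_set_t. */
        if (end == s || v < 0 || v >= CPU_SETSIZE || n == max_ids)
            return -1;
        ids[n++] = (int)v;

        if (*end == ',')
            end++;               /* step over the separator */
        else if (*end != '\0')
            return -1;           /* unexpected character after the number */
        s = end;
    }
    return n;
}

/* Hypothetical helper: pin the calling thread to one CPU taken from
 * BLIS_CPU_AFFINITY (assumed format, see above), chosen round-robin by tid.
 * Returns 0 on success or when the variable is unset (affinity is left to
 * the OS), EINVAL for a malformed list, or the error number reported by
 * pthread_setaffinity_np(). */
static int pin_self_from_env(unsigned tid)
{
    int ids[CPU_SETSIZE];
    int n = parse_cpu_list(getenv("BLIS_CPU_AFFINITY"), ids, CPU_SETSIZE);
    cpu_set_t set;

    if (n == 0)
        return 0;
    if (n < 0)
        return EINVAL;

    CPU_ZERO(&set);
    CPU_SET(ids[tid % (unsigned)n], &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

Each worker thread would call `pin_self_from_env(its_thread_index)` once at startup. A flat list of ids ignores the machine's topology (which ids share a core or a cache), so the ordering of the list is left to the user here.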

fgvanzee avatar Nov 22 '16 22:11 fgvanzee

sched_setaffinity is probably the way to go on Linux. You also really need HW topology information, though, which is probably best obtained from hwloc.
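For reference, a rough sketch of the sched_setaffinity route on Linux. Passing 0 as the pid makes the call act on the calling thread. The helper names are hypothetical, and the sketch does not consult hwloc: it only looks at which CPUs the thread is currently allowed to run on, which says nothing about cores, caches, or NUMA nodes.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: number of CPUs the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
static int allowed_cpu_count(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Hypothetical helper: lowest-numbered CPU in the calling thread's
 * affinity mask, or -1 on error. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 with errno set on failure (for example when
 * the CPU is not available to this process). */
static int pin_calling_thread(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling `pin_calling_thread(first_allowed_cpu())` narrows the current thread's mask to one CPU; a topology-aware version would choose the CPU from hwloc's view of the machine instead.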

devinamatthews avatar Nov 22 '16 22:11 devinamatthews

And FYI you can't set affinity at all on OSX so Linux-only is fine.

devinamatthews avatar Nov 22 '16 22:11 devinamatthews

I'm looking for similar features too and am considering implementing them with hwloc.

However, I found that performance drops severely when multithreading is enabled (on ARMv8). I used perf to analyze the issue and found that most of the time is spent in the while loop in bli_thrcomm_barrier(). This needs to be fixed first.

baozich avatar Feb 07 '17 03:02 baozich

@baozich What happens if you use the OpenMP build of BLIS and set affinity via OMP_PLACES and OMP_PROC_BIND?

jeffhammond avatar Feb 07 '17 17:02 jeffhammond