implicit
Optimal configuration
"For systems using OpenBLAS, I highly recommend setting 'export OPENBLAS_NUM_THREADS=1'. This disables its internal multithreading ability, which leads to substantial speedups for this package. Likewise for Intel MKL, setting 'export MKL_NUM_THREADS=1' should also be set."
As far as I understand, disabling internal multithreading may speed up training. During inference, however, MKL_NUM_THREADS should be set to the maximum (the number of physical CPU cores). Is this true?
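For reference, here is a minimal sketch of how I apply the quoted setting from Python instead of the shell. The helper name `limit_blas_threads` is my own and not part of implicit. I am assuming the variables have to be set before NumPy, SciPy, or implicit are first imported, since the BLAS library typically reads them when it is loaded.

```python
import os

# Environment variables named in the README quote above.
BLAS_THREAD_VARS = ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_blas_threads(num_threads=1):
    """Hypothetical helper (not part of implicit): set the BLAS thread limits.

    Assumption: this runs before any BLAS-backed library is imported,
    because the limits are typically read when the library is loaded.
    """
    for var in BLAS_THREAD_VARS:
        os.environ[var] = str(num_threads)
    # Return the values now in effect for this process's environment.
    return {var: os.environ[var] for var in BLAS_THREAD_VARS}


# Equivalent to the two `export ...=1` lines from the quote.
settings = limit_blas_threads(1)
print(settings)
```

This only changes the environment of the current process and its children, so a separate inference process could be started with a different value.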