EconML
Performance deterioration with greater number of cores
I am fitting a CausalForestDML model on an Azure Machine Learning compute instance with 32 cores. My dataset has ~3 million records and ~60 variables. Surprisingly, fitting gets slower as the number of cores increases. For instance, a model with n_jobs=-1 (i.e. all 32 cores) runs slower than a model with n_jobs=12, which in turn runs slower than a model with n_jobs=6 (see attached images).
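For anyone who wants to reproduce the comparison, the sweep over n_jobs can be sketched roughly as below. This is a minimal sketch, not the exact script behind the attached timings: `time_fit_by_n_jobs` and `placeholder_fit` are hypothetical names, and the placeholder workload stands in for the real fit so the snippet runs without EconML or the dataset.

```python
import time
from typing import Callable, Dict, Iterable


def time_fit_by_n_jobs(fit_fn: Callable[[int], object],
                       n_jobs_values: Iterable[int],
                       repeats: int = 1) -> Dict[int, float]:
    """Return the best wall-clock time (seconds) of fit_fn(n_jobs) per setting."""
    timings = {}
    for n_jobs in n_jobs_values:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fit_fn(n_jobs)
            best = min(best, time.perf_counter() - start)
        timings[n_jobs] = best
    return timings


def placeholder_fit(n_jobs: int) -> None:
    """Stand-in workload (hypothetical) so the sketch runs without EconML.

    With EconML this would be something like the following, where the
    data names Y, T and X are assumptions:
        CausalForestDML(n_jobs=n_jobs).fit(Y, T, X=X)
    """
    sum(i * i for i in range(50_000))


if __name__ == "__main__":
    results = time_fit_by_n_jobs(placeholder_fit, [6, 12, -1])
    for n_jobs, seconds in results.items():
        print(f"n_jobs={n_jobs}: {seconds:.3f} s")
```

Taking the best of several repeats reduces noise from other processes on the instance, although with a dataset of this size a single run per setting may already be expensive.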



I am using the standard threading backend. Given the low-overhead nature of threading, I am a little puzzled as to why more parallelism would slow down fitting. I would be glad to get any insight into this behavior. Has anyone else experienced this issue? Could it have something to do with the compute instance itself?
Compute instance:
- Standard_F32s_v2 (32 cores, 64 GB RAM, 265 GB disk) details here
Environment:
- Linux, Ubuntu 18.04
- Python 3.8.10
- EconML 0.12.0
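To help rule the compute instance in or out, something like the following could be run on it to report how many cores the Python process may actually use and whether the common numerical-library thread limits are set. This is only a sketch under assumptions: the helper names are hypothetical, and whether thread oversubscription or a restricted CPU affinity is involved here is unconfirmed.

```python
import os


def usable_cpu_count() -> int:
    """CPUs this process may run on; falls back to os.cpu_count() off Linux."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def thread_env_settings() -> dict:
    """Thread-limit environment variables read by OpenMP, MKL and OpenBLAS."""
    names = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")
    return {name: os.environ.get(name) for name in names}


if __name__ == "__main__":
    print("os.cpu_count():", os.cpu_count())
    print("usable CPUs   :", usable_cpu_count())
    for name, value in thread_env_settings().items():
        print(f"{name} = {value}")
```

If the usable count is below 32, or the variables are unset so that each worker may start its own pool of native threads, that would be one possible direction to investigate.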