
Low performance on many-core systems

Open StarTessar opened this issue 4 years ago • 1 comments

Describe the bug When Stanza is run from a Docker container on a server with more than ~20 cores, pipeline performance drops dramatically.

To Reproduce

  1. Get a machine with a large number of CPU cores;
  2. Build and run a Docker container with Stanza;
  3. Run a Stanza pipeline with language lv, processors='tokenize,pos,lemma' (for example);
  4. Get predictions for a text of typical length (1-4k characters);
  5. Measure performance.
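The measurement in step 5 could be done roughly as follows. This is only a sketch: `time_calls` and `build_pipeline` are hypothetical helper names, not part of Stanza, and the sketch assumes the Latvian models have already been downloaded (for example with `stanza.download("lv")`).

```python
import time
from statistics import median


def time_calls(fn, text, repeats=5):
    """Call fn(text) `repeats` times and return the median wall-clock seconds.

    Hypothetical helper for illustration; the median is used so that a
    single slow warm-up call does not dominate the result.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(text)
        durations.append(time.perf_counter() - start)
    return median(durations)


def build_pipeline():
    """Build the pipeline from step 3 (assumes the lv models are installed)."""
    import stanza  # imported lazily so the timing helper works without Stanza

    return stanza.Pipeline(lang="lv", processors="tokenize,pos,lemma")


# Example usage (not executed here, since it needs Stanza and the lv models):
#   nlp = build_pipeline()
#   sample = "..."  # any text of 1-4k characters
#   print(time_calls(nlp, sample))
```

Running the same measurement on machines with different core counts, or with different thread limits, should make the slowdown visible.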

Expected behavior Performance should improve as work is parallelized across more cores, or at least stay the same. An option to limit the number of cores/threads would be a good feature.

Environment (please complete the following information):

  • OS: Ubuntu 18.04/20.04 in container; CentOS on the machine
  • Python version: Python 3.6/3.8
  • Stanza version: 1.0.0; 1.0.1; 1.1.1

Additional context We build Docker containers for our purposes, but I think the same behavior can be reproduced with the system interpreter. Our case was made worse by Kubernetes, where all cores of the node are visible to Python but CPU usage is limited. When running in plain Docker on the machine, the problem was the same but less severe. The root of the problem is probably in Torch. We solved it by explicitly setting the environment variable "OMP_NUM_THREADS", but if the user needs to use the framework elsewhere in the same environment, this may not be good practice.
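A minimal sketch of the workaround, assuming the limit is applied from Python rather than from the container configuration. `limit_cpu_threads` is a hypothetical helper name, and the value 4 is only an example. The environment variable generally has to be set before Torch is first imported to affect OpenMP; `torch.set_num_threads` is used as well because it can change PyTorch's intra-op thread count after import.

```python
import os


def limit_cpu_threads(num_threads):
    """Cap the CPU threads used by OpenMP and, if installed, PyTorch.

    Hypothetical helper for illustration; call it before building the
    Stanza pipeline, ideally before anything imports torch.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")

    # Read by the OpenMP runtime when it is initialized.
    os.environ["OMP_NUM_THREADS"] = str(num_threads)

    try:
        import torch
    except ImportError:
        # Torch is not installed; only the environment variable is set.
        return num_threads

    # Limits PyTorch's intra-op parallelism for this process only.
    torch.set_num_threads(num_threads)
    return num_threads


limit_cpu_threads(4)  # example value; choose it to match the CPU limit
```

Because `torch.set_num_threads` applies only to the current process, it may be less intrusive than exporting the variable globally when other programs share the same environment.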

StarTessar • Oct 21 '20 09:10

Hi @StarTessar, thanks for reporting this! This may be due to the nature of PyTorch parallelization; as you say, the problem is solved by explicitly setting OMP_NUM_THREADS in the program. I think this is a great solution. Do you have any other suggestions for addressing this?

yuhui-zh15 • Oct 26 '20 17:10