pytracking
High CPU usage during inference
Hi, when running inference, I notice that the program occupies a lot of CPU.
Before running the program:
After running the program:
Setting the number of threads to 1:
Setting the number of threads to 30:
It seems that there is no improvement.
Is there any way to reduce CPU usage?
Hm, I hadn't noticed that before. I currently don't know what is causing this. Let me know if you find the issue.
@sfchen94 - Which tracker are you running?
@srama2512 DiMP50, but I guess other trackers may have a similar problem.
BTW, this problem only occurs during inference; the training stage does not have this kind of problem.
@sfchen94 - Got it. I'm noticing high CPU usage during inference with KYS tracker as well. The GPU usage is quite low.
Hm, I am still not sure why this is happening. Maybe it is related to OpenCV.
What helps to reduce the load on the CPUs is limiting the number of cores the Python script can use with taskset --cpu-list 0-1, which restricts it to two cores. So running, for example, taskset --cpu-list 0-1 python run_tracker.py tomp tomp50 lasot
reduces the CPU workload without decreasing the FPS of the tracker. But since the FPS does not account for data loading time, the overall throughput might be lower. Maybe @goutamgmb has an idea?
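For reference, the core restriction that taskset applies can also be inspected (or set) from inside Python via the process's CPU affinity. A minimal sketch using only the standard library (Linux-only, since os.sched_getaffinity is not available on every platform):

```python
import os

# Query the set of CPU cores the current process may run on.
# Under `taskset --cpu-list 0-1 python ...` this would be {0, 1}.
allowed = os.sched_getaffinity(0)  # 0 = the current process
print(f"Process may run on {len(allowed)} core(s): {sorted(allowed)}")

# The affinity can also be narrowed from inside Python instead of using
# taskset, e.g. restricting the process to the first two allowed cores:
subset = set(sorted(allowed)[:2])
os.sched_setaffinity(0, subset)
print(f"Now restricted to: {sorted(os.sched_getaffinity(0))}")
```

This is handy for double-checking that the restriction actually took effect before blaming something else for the load.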
@2006pmach Cool, it works! But why does it reduce CPU usage while keeping the same FPS 😆
To compute the FPS, we only measure the time the tracker itself takes, namely the call out = tracker.track(image, info);
everything else is excluded. So the overall runtime of the script could be higher now, since for example the data loading time could have increased (but this is not reflected in the FPS). I didn't check this, though. It is still not clear to me what is causing the high CPU load and what those cores are doing exactly...
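To illustrate how a per-call FPS number can stay flat while end-to-end throughput drops, here is a hypothetical timing sketch; load_frame and track are placeholders standing in for data loading and the tracker call, not the actual pytracking API:

```python
import time

def load_frame(i):
    # Placeholder for (potentially slow) disk I/O and decoding.
    time.sleep(0.002)
    return i

def track(frame):
    # Placeholder for the per-frame tracker call whose time defines the FPS.
    time.sleep(0.001)
    return frame

track_time = 0.0
start = time.perf_counter()
for i in range(50):
    frame = load_frame(i)           # NOT counted towards FPS
    t0 = time.perf_counter()
    track(frame)                    # only this span is counted
    track_time += time.perf_counter() - t0
wall_time = time.perf_counter() - start

fps = 50 / track_time               # unaffected if only loading slows down
throughput = 50 / wall_time         # the number that actually drops
print(f"reported FPS: {fps:.1f}, end-to-end throughput: {throughput:.1f}")
```

If restricting cores slows only load_frame, fps stays the same while throughput falls, which matches the behaviour described above.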
@2006pmach - Thanks for the taskset solution. It appears to be working right now. I restricted the CPU usage to 0-39 in my 80-core cluster machine. Interestingly, I'm observing that more kernel threads (red) are occupying the CPU load when compared to normal threads (green). Is this suggestive of anything specific to you?
Yes, the real problem is not solved. For example, I have 40 CPU cores in total. Initially, the program needed 50% of the total CPU load. When I force it to use cores #1-10, it is indeed capped at a maximum of 25% of the total CPU.
But in effect this just squeezes the same 50% workload onto 25% of the cores; the program still needs the same amount of CPU work after we use taskset.
Hi. I think the following code may help you solve this issue. In my case, inserting it reduced the CPU occupation, and the inference speed also improved a little.
import os
import torch

cpu_num = 8  # number of CPU threads you want to use
os.environ['OMP_NUM_THREADS'] = str(cpu_num)
os.environ['OPENBLAS_NUM_THREADS'] = str(cpu_num)
os.environ['MKL_NUM_THREADS'] = str(cpu_num)
os.environ['VECLIB_MAXIMUM_THREADS'] = str(cpu_num)
os.environ['NUMEXPR_NUM_THREADS'] = str(cpu_num)
torch.set_num_threads(cpu_num)
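One caveat worth noting (an assumption about how these runtimes behave, but generally true for OpenMP/MKL/OpenBLAS): the environment variables are typically read only once, when the numerical library is first loaded, so they should be set before any import of numpy or torch in the script. A stdlib-only sketch of the required ordering:

```python
import os

cpu_num = 8  # example thread budget

# Set the caps BEFORE numpy/torch are imported anywhere in the process;
# most BLAS/OpenMP runtimes read these variables only once, at load time.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS",
            "VECLIB_MAXIMUM_THREADS", "NUMEXPR_NUM_THREADS"):
    os.environ[var] = str(cpu_num)

# ... only now import the heavy libraries:
# import torch
# torch.set_num_threads(cpu_num)
```

Setting them after the imports may silently have no effect, which can make the fix look like it "doesn't work".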
This method actually eases the CPU usage, so I'll temporarily close this issue.