coz
Inconsistent results for throughput and latency profiling
While experimenting with a simple Intel TBB program (one that checks for prime numbers in parallel), I noticed that Coz projects different program speedups depending on whether I profile for throughput or for latency. For throughput, I placed a single progress point at the end of a loop. The resulting profile highlights one line in the program and projects an almost linear increase in speedup (see attached screenshot). For latency, I placed progress points at the beginning and the end of the same loop. Coz highlights the same line as in the throughput profile, but projects a decrease in program speedup. In the attached screenshot, primes1 refers to the latency progress points and detect_primes_tasks.cpp:53 refers to the throughput progress point.
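To make the two placements concrete, here is a minimal sketch of what I mean. It is not the attached program: it uses plain `std::thread` workers instead of TBB to stay self-contained, and `is_prime` and `count_primes` are hypothetical names chosen for illustration. `COZ_PROGRESS`, `COZ_BEGIN`, and `COZ_END` are the macros from `coz.h`; the no-op fallbacks are only there so the sketch compiles on a machine without Coz installed.

```cpp
#include <atomic>
#include <thread>
#include <vector>

#if __has_include(<coz.h>)
#include <coz.h>
#else
// Fallback no-op stubs (assumption: Coz headers are not installed here).
#define COZ_PROGRESS
#define COZ_BEGIN(name)
#define COZ_END(name)
#endif

// Trial-division primality test; stands in for the bottleneck region.
static bool is_prime(unsigned n) {
    if (n < 2) return false;
    // 64-bit divisor so d * d cannot wrap around for large n.
    for (unsigned long long d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Count primes in [lo, hi), striping the candidates across `workers` threads.
unsigned count_primes(unsigned lo, unsigned hi, unsigned workers) {
    std::atomic<unsigned> found{0};
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&found, lo, hi, workers, w] {
            for (unsigned n = lo + w; n < hi; n += workers) {
                COZ_BEGIN("primes1");   // latency: start of one iteration
                if (is_prime(n))
                    found.fetch_add(1, std::memory_order_relaxed);
                COZ_END("primes1");     // latency: end of the same iteration
                COZ_PROGRESS;           // throughput: one unit of work done
            }
        });
    }
    for (auto& t : pool) t.join();
    return found.load();
}
```

Calling, for example, `count_primes(0, 100, 4)` from `main` exercises both kinds of progress point on every loop iteration. As far as I understand the Coz README, the binary should be built with `-g`, linked with `-ldl`, and run as `coz run --- ./program`.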
I expected both profiles to project an improvement, since the highlighted line is within a region that is indeed the bottleneck in the program. Any idea why I am seeing different results? I have attached the TBB program (as a text file) that I used for testing.
detect_primes_tasks.txt
Possibly explained by https://youtu.be/koTf7u0v41o?t=2724