gtsam

TimeTBB slower with more threads

Open mfinean opened this issue 4 years ago • 0 comments

Running the TimeTBB example, I get the results below: the benchmark is fastest with 1 thread and much slower with 4 or 8 threads. I'm currently looking into why, but any insight in the meantime would be greatly appreciated.

numberOfProblems = 1000000
problemSize = 4
With 1 threads:
Without memory allocation, grain size = 1, time = 0.195355
Without memory allocation, grain size = 10, time = 0.194243
Without memory allocation, grain size = 100, time = 0.194409
Without memory allocation, grain size = 1000, time = 0.196936
With memory allocation, grain size = 1, time = 0.289234
With memory allocation, grain size = 10, time = 0.295508
With memory allocation, grain size = 100, time = 0.294618
With memory allocation, grain size = 1000, time = 0.29145

With 4 threads:
Without memory allocation, grain size = 1, time = 5.02581
Without memory allocation, grain size = 10, time = 4.9835
Without memory allocation, grain size = 100, time = 4.74276
Without memory allocation, grain size = 1000, time = 5.06713
With memory allocation, grain size = 1, time = 4.6808
With memory allocation, grain size = 10, time = 4.73614
With memory allocation, grain size = 100, time = 4.77174
With memory allocation, grain size = 1000, time = 4.75051

With 8 threads:
Without memory allocation, grain size = 1, time = 4.00496
Without memory allocation, grain size = 10, time = 4.06559
Without memory allocation, grain size = 100, time = 4.06971
Without memory allocation, grain size = 1000, time = 4.06233
With memory allocation, grain size = 1, time = 4.65368
With memory allocation, grain size = 10, time = 4.6617
With memory allocation, grain size = 100, time = 4.65855
With memory allocation, grain size = 1000, time = 4.65979

Summary of results:
4 threads, without allocation, grain size = 1, speedup = 0.0388704
4 threads, without allocation, grain size = 10, speedup = 0.0389772
4 threads, without allocation, grain size = 100, speedup = 0.0409907
4 threads, without allocation, grain size = 1000, speedup = 0.0388654
4 threads, with allocation, grain size = 1, speedup = 0.0617917
4 threads, with allocation, grain size = 10, speedup = 0.0623943
4 threads, with allocation, grain size = 100, speedup = 0.0617423
4 threads, with allocation, grain size = 1000, speedup = 0.0613514
8 threads, without allocation, grain size = 1, speedup = 0.0487782
8 threads, without allocation, grain size = 10, speedup = 0.0477773
8 threads, without allocation, grain size = 100, speedup = 0.0477697
8 threads, without allocation, grain size = 1000, speedup = 0.0484786
8 threads, with allocation, grain size = 1, speedup = 0.0621517
8 threads, with allocation, grain size = 10, speedup = 0.0633907
8 threads, with allocation, grain size = 100, speedup = 0.0632425
8 threads, with allocation, grain size = 1000, speedup = 0.0625457
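For anyone reading the summary: the speedup figures appear to be the 1-thread time divided by the N-thread time for the same configuration. That is an assumption based on the numbers matching, not something checked against the TimeTBB source. Under that reading, 0.0388704 means the 4-thread run took roughly 25 times longer than the 1-thread run. A minimal sketch of the arithmetic (the function names `speedup` and `printSpeedup` are hypothetical, not taken from GTSAM):

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Speedup relative to the single-thread baseline (assumed definition).
// Values below 1.0 mean the multi-threaded run was slower.
double speedup(double singleThreadTime, double multiThreadTime) {
  assert(multiThreadTime > 0.0);
  return singleThreadTime / multiThreadTime;
}

// Hypothetical helper: prints one line in roughly the style of the summary.
void printSpeedup(int threads, int grainSize, double t1, double tN) {
  std::printf("%d threads, grain size = %d, speedup = %g\n",
              threads, grainSize, speedup(t1, tN));
}
```

For example, `speedup(0.195355, 5.02581)` uses the grain size 1, no-allocation timings for 1 and 4 threads, and reproduces the first summary line to within rounding.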

mfinean, May 13 '20 16:05