saxpy-benchmark

possible error in omp implementation

Open Try2Code opened this issue 6 years ago • 0 comments

https://github.com/bennylp/saxpy-benchmark/blob/fb811ad7a5ac43d53948ca94357e209bbae6a6ed/src/saxpy_omp.cpp#L22

Hi! I think this for loop is not correct because:

  • the loop counter starts at the thread ID, and I see no reason to write a loop like this
  • the #pragma omp parallel for is missing, which leads to this version essentially being a CPU version. In fact, their speeds are very similar on my machine.
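
For comparison, here is a minimal sketch of how such a loop is usually written with OpenMP. It is not the repository's code: the function name `saxpy_omp` and the use of `std::vector` are assumptions made for illustration. The loop runs over the full index range and the `parallel for` directive leaves the split across threads to the runtime.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper for illustration only: computes y = a * x + y.
// Build with -fopenmp (GCC/Clang); without that flag the pragma is
// ignored and the loop runs serially, still giving the same result.
void saxpy_omp(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());

    // The loop covers the whole range 0..n-1. OpenMP divides the
    // iterations among the threads, so the counter does not need to
    // start at the thread ID.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Each iteration writes only its own `y[i]`, so the iterations are independent and no extra synchronization should be needed.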

I am currently working on a version with GPU offloading (clang-9, gcc-8)

Try2Code, May 11 '19 13:05