
Data race in solver parallelised reduction

Open rookiehpc opened this issue 2 years ago • 0 comments

In the file solver.f90, the following block appears at lines 81 and 125:

!$OMP PARALLEL DO 
do i=1,n; norm=norm+r(i)*r(i); end do
!$OMP END PARALLEL DO

There is a data race here: threads handling different iterations of the loop update the same shared variable, norm, concurrently. To restore correctness, the OpenMP reduction clause can be used. It ensures that each thread sums its r(i)*r(i) increments into a thread-local copy of norm, which is initialised to zero for the + operator. At the end of the loop, OpenMP combines the thread-local copies back into the original norm variable, avoiding data races in the process. The corrected block would look as follows:

!$OMP PARALLEL DO REDUCTION(+:norm)
do i=1,n; norm=norm+r(i)*r(i); end do
!$OMP END PARALLEL DO
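To make the effect of the clause concrete, here is a minimal standalone sketch, not taken from solver.f90. It compares the REDUCTION version against a serial sum and against a hand-written approximation of what the clause does (a private partial sum per thread, combined inside a critical section). The program name, the array size, the test data and the variables norm_serial, norm_reduction, norm_manual and partial are all hypothetical and only serve the illustration.

```fortran
program reduction_sketch
  implicit none
  integer, parameter :: n = 1000000      ! hypothetical problem size
  integer :: i
  double precision, allocatable :: r(:)
  double precision :: norm_serial, norm_reduction, norm_manual, partial

  allocate(r(n))
  r = 1.0d0   ! hypothetical test data; in the solver, r is the residual vector

  ! Serial reference value.
  norm_serial = 0.0d0
  do i = 1, n
    norm_serial = norm_serial + r(i)*r(i)
  end do

  ! Fix proposed in this issue: each thread accumulates into its own
  ! private copy, and the copies are combined when the loop ends.
  norm_reduction = 0.0d0
  !$OMP PARALLEL DO REDUCTION(+:norm_reduction)
  do i = 1, n
    norm_reduction = norm_reduction + r(i)*r(i)
  end do
  !$OMP END PARALLEL DO

  ! Hand-written approximation of the same idea: a private partial sum
  ! per thread, added to the shared total one thread at a time.
  norm_manual = 0.0d0
  !$OMP PARALLEL PRIVATE(partial)
  partial = 0.0d0
  !$OMP DO
  do i = 1, n
    partial = partial + r(i)*r(i)
  end do
  !$OMP END DO
  !$OMP CRITICAL
  norm_manual = norm_manual + partial
  !$OMP END CRITICAL
  !$OMP END PARALLEL

  print *, 'serial    :', norm_serial
  print *, 'reduction :', norm_reduction
  print *, 'manual    :', norm_manual

  ! Stop with an error if either parallel result disagrees with the serial one.
  if (abs(norm_reduction - norm_serial) > 1.0d-9*norm_serial) error stop 'reduction mismatch'
  if (abs(norm_manual - norm_serial) > 1.0d-9*norm_serial) error stop 'manual mismatch'
end program reduction_sketch
```

With gfortran this can be built with the -fopenmp flag. With real floating-point data, the parallel results may differ from the serial one in the last few digits, because the order of the additions changes; the test data here is chosen so that all sums are exact.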

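There might be other blocks in a similar situation, for which the same fix applies. The one at line 113 in the same file solver.f90 is worth checking, although as written it does not look like a reduction: each iteration writes only its own x(i), so it appears race-free as long as alpha and omega are not modified inside the loop: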

!$OMP PARALLEL DO 
do i=1,n; x(i)=x(i)+alpha*p(i)+omega*ss(i); end do
!$OMP END PARALLEL DO

You can refer to https://rookiehpc.github.io/openmp/docs/reduction/index.html for more information.

rookiehpc, Jul 16 '22 11:07