EKAT
Potential problem with ExeSpaceUtils view_reduction and parallel_reduce
Describe the bug
This was discovered when porting shoc_energy_integrals to small kernels. I was getting large differences in the outputs of the view_reductions when num_threads > 1. I suspect the problem is in how the garbage (padding) entries of the last pack are handled, because the problem went away when I used an nlev with nlev % pack_size == 0.
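To illustrate the suspicion about the last pack, here is a minimal serial sketch in plain C++. It is not EKAT's code: `Pack`, `kPackSize`, `pack_column`, `sum_all_lanes`, and `sum_valid_lanes` are hypothetical names made up for this example, and it does not try to reproduce the dependence on the thread count. It only shows that a reduction which reads the padding lanes of the last pack gives a wrong sum when nlev is not a multiple of the pack size, and agrees with the masked sum when it is.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SIMD pack; NOT ekat::Pack.
constexpr std::size_t kPackSize = 4;

struct Pack {
  double v[kPackSize];
};

// Store a column of nlev values in ceil(nlev / kPackSize) packs.
// Lanes past nlev in the last pack are filled with `garbage` to mimic
// uninitialized padding.
std::vector<Pack> pack_column(const std::vector<double>& col, double garbage) {
  const std::size_t nlev = col.size();
  const std::size_t npacks = (nlev + kPackSize - 1) / kPackSize;
  std::vector<Pack> packs(npacks);
  for (std::size_t p = 0; p < npacks; ++p) {
    for (std::size_t lane = 0; lane < kPackSize; ++lane) {
      const std::size_t idx = p * kPackSize + lane;
      packs[p].v[lane] = (idx < nlev) ? col[idx] : garbage;
    }
  }
  return packs;
}

// Naive reduction: sums every lane of every pack, padding included.
double sum_all_lanes(const std::vector<Pack>& packs) {
  double sum = 0.0;
  for (const Pack& p : packs) {
    for (std::size_t lane = 0; lane < kPackSize; ++lane) {
      sum += p.v[lane];
    }
  }
  return sum;
}

// Masked reduction: only lanes whose scalar index is below nlev contribute.
double sum_valid_lanes(const std::vector<Pack>& packs, std::size_t nlev) {
  assert(nlev <= packs.size() * kPackSize);
  double sum = 0.0;
  for (std::size_t idx = 0; idx < nlev; ++idx) {
    sum += packs[idx / kPackSize].v[idx % kPackSize];
  }
  return sum;
}
```

With nlev = 6 and a pack size of 4, the last pack has two padding lanes, so the naive sum picks up whatever is stored there. With nlev = 8 there is no padding and both sums agree, which matches the observation that the problem disappears when nlev % pack_size == 0.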
To Reproduce
Steps to reproduce the behavior:
- Switch shoc_energy_integrals back to the implementation it had before the small kernel PR, i.e. the one that uses view_reductions.
- Build SCREAM with
-DSCREAM_SMALL_KERNELS=On -DCMAKE_BUILD_TYPE=Debug
- Run
OMP_NUM_THREADS=16 ./shoc_tests shoc_main_bfb
- The test should fail because the results are not bit-for-bit (bfb) with Fortran. You can add print statements to confirm that the se_int, ke_int, wv_int, and wl_int values do not match Fortran, which in turn causes different results later in shoc for the output views.
Expected behavior
view_reduction should have produced bfb results with Fortran.