ufs-weather-model
Model does not reproduce with different blocksizes
Description
The model does not give bitwise identical results when blocksize is changed. I tried blocksizes of 32, 16, and 5; the results do not reproduce one another. Both the FV3_RAP suite and the FV3_GFS_v17_p8 suite show this problem. I tried the compiler flag change from the decomposition issue, but it did not solve the problem. Setting do_ca = false did not solve it either.
To Reproduce:
What compilers/machines are you seeing this with? On Hera, intel compiler.
- Compiled and ran control_p8
- Copied control_p8 to control_p8_blocksize. Changed blocksize from 32 to 5. Ran, compared output with control_p8. Does not reproduce.
Run directories:
- /scratch2/BMC/rem/Lisa.Bengtsson/stmp2/Lisa.Bengtsson/FV3_RT/SAVE_FOR_CA_BLOCKSIZE/control_p8_blocksize (blocksize 5)
- /scratch2/BMC/rem/Lisa.Bengtsson/stmp2/Lisa.Bengtsson/FV3_RT/SAVE_FOR_CA_BLOCKSIZE/control_p8 (blocksize 32)
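For anyone repeating the comparison, a minimal sketch of a byte-for-byte check between two run directories follows. The function name `differing_files` and the `*.nc` pattern are assumptions for illustration, not part of the regression test scripts. A byte comparison is stricter than comparing field values, so a mismatch here only says the files differ, not which field differs.

```python
import filecmp
from pathlib import Path


def differing_files(dir_a, dir_b, pattern="*.nc"):
    """Return names of files in dir_a that are missing from dir_b or differ.

    Hypothetical helper: compares file contents byte for byte.
    """
    a, b = Path(dir_a), Path(dir_b)
    diffs = []
    for file_a in sorted(a.glob(pattern)):
        file_b = b / file_a.name
        # shallow=False forces a content comparison instead of a stat() check
        if not file_b.is_file() or not filecmp.cmp(file_a, file_b, shallow=False):
            diffs.append(file_a.name)
    return diffs
```

An empty list means every matching file in the first directory has an identical counterpart in the second.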
@pjpegion will provide additional testing below.
@junwang-noaa @bensonr @JessicaMeixner-NOAA @yangfanglin
In my test, I used the FV3_RAP suite.
I ran, outputting every time-step and saving the physics tendencies. The difference arises from differences in deep convection tendencies in the 2nd time-step. Surface fluxes, latent heat flux, dynamics tendencies are identical at this point.
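The search described above (output every time step, then look for the first tendency that differs) can be sketched as follows. The data layout is hypothetical: each run is a list indexed by time step, and each entry maps a field name to a list of values. The field names in the example are placeholders, not the model's diagnostic names.

```python
def first_divergence(run_a, run_b):
    """Return (step, field) of the first exact mismatch, or None.

    run_a / run_b: list of dicts, one per time step, mapping a field name
    to a list of floats (hypothetical layout, for illustration only).
    """
    for step, (fields_a, fields_b) in enumerate(zip(run_a, run_b), start=1):
        for name in sorted(fields_a):
            # Exact comparison on purpose: the goal is bitwise reproducibility
            if fields_a[name] != fields_b[name]:
                return step, name
    return None


run_a = [
    {"dynamics": [1.0, 2.0], "deep_conv": [0.0, 0.0]},
    {"dynamics": [1.5, 2.5], "deep_conv": [0.25, 0.5]},
]
run_b = [
    {"dynamics": [1.0, 2.0], "deep_conv": [0.0, 0.0]},
    {"dynamics": [1.5, 2.5], "deep_conv": [0.25, 0.5000000000000001]},
]
print(first_divergence(run_a, run_b))  # (2, 'deep_conv')
```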
I also tested with debug on, and it gave different results for different block sizes, although I did not save the tendencies in that run.
This is interesting, since the FV3_RAP suite and the FV3_GFS_v17_p8 suite use different convection schemes. @grantfirl could it be related to some generic convection routine in CCPP?
The last time this happened was early in the transition to GFSv15. At that time it was traced to a specific parameterization where a variable was being conditionally set. In that case, the variable was not given a default value, but was set within a complex if-structure and then used outside of that if-structure. If any of the parameterizations are Fortran 90 modules, it would also be good to understand how global variables are being set/used. These are just a few places to look.
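To make that failure mode concrete, here is a toy sketch of how a conditionally set work variable can make results depend on blocksize. Python cannot read uninitialized memory, so a scratch list reused across blocks stands in for the Fortran work array. All names, the threshold, and the arithmetic are invented for illustration and do not come from any CCPP scheme.

```python
def run(columns, blocksize, init_default):
    """Process columns block by block with a reused scratch array."""
    scratch = [0.0] * blocksize  # work array reused across blocks
    out = []
    for start in range(0, len(columns), blocksize):
        block = columns[start:start + blocksize]
        for i, x in enumerate(block):
            if init_default:
                scratch[i] = 0.0  # the fix: always assign a default first
            if x > 0.5:
                scratch[i] = 2.0 * x  # only set on one branch
            # Used outside the if-structure: may hold a stale value
            out.append(x + scratch[i])
    return out


columns = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]

# Without a default, the stale value depends on how columns map to blocks
print(run(columns, 2, init_default=False) == run(columns, 3, init_default=False))  # False

# With a default, the result no longer depends on blocksize
print(run(columns, 2, init_default=True) == run(columns, 3, init_default=True))  # True
```

In the blocksize 3 case, the fourth column (0.2) reuses the slot last written by the first column (0.9), which is why its result changes.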
@lisa-bengtsson Did you try any other SDFs? I ask because the commonality between these two is that they have Thompson MP enabled. I wonder if this issue is present in v16 physics?
Thanks @bensonr. Perhaps it would be a good idea to have a blocksize test in the ORTs in the future, @DeniseWorthen?
@dustinswales I will try v16. Good suggestion.
@pjpegion and I found that the control_p8 test does reproduce with blocksizes of 32 and 16, but not if you choose blocksize 5. Blocksizes 5 and 32 do, however, reproduce in control_debug_p8.
Both of these tests were run with do_ca = False, because of a call to mpp_error when non-uniform blocksizes are used with the do_ca namelist flag set to true: https://github.com/ufs-community/ufs-weather-model/issues/1193. If anyone would like to test reproducibility with a non-uniform blocksize in control_p8, I recommend setting do_ca to false until issue 1193 is resolved.
I also checked control (GFS_v16), and it also does not reproduce when changing blocksize from 32 to 5. But it also passed the blocksize test in debug mode.
Looking closely, the differences in both control and control_p8 occur in the 1st time step.
For control, I see a difference in the deep convection heating tendency, and in control_p8 I see a difference in the MP heating tendency, and also in the snow and water vapor mixing ratios at 1 gridpoint.
This reproducibility issue seems to be unrelated to the issue related to the GF convection scheme.
I think I know the reason for this. In the physics package of the GSM-based GFS, the arguments were general, e.g. "im, imx", where physics operated on "im" points but the leading dimension of the arrays was "imx", with im<=imx. This was retained in the IPD version of the physics, but in the CCPP version it was removed on the assumption that in FV3 im is always = imx. But when you use an odd block size (like 5), this does not hold for the last block. So you can't use blocksize=5 in the current CCPP code. It will require adding an extra argument to all physics routines. Moorthi
On Sun, May 1, 2022 at 7:27 PM Shrinivas Moorthi - NOAA Federal < @.***> wrote:
I made 4 runs with a C384L127 coupled model with "blocksize" of "32", "16", "5" and "4". Three runs, except for blocksize=5, were identical. My guess is that all blocks should have the same size.
-- Dr. Shrinivas Moorthi Research Meteorologist Modeling and Data Assimilation Branch Environmental Modeling Center / National Centers for Environmental Prediction 5830 University Research Court - (W/NP23), College Park MD 20740 USA Tel: (301)683-3718
e-mail: @.*** Phone: (301) 683-3718 Fax: (301) 683-3718
@SMoorthi-emc Has there already been a fix for this issue? Or do we need a PR to fix it?
I wonder if @DomHeinzeller has an idea about this? I'm not so familiar with the horizontal loop index for odd blocksizes within ccpp.
I sent this to @junwang-noaa a few days after Moorthi's comment:
CCPP works with non-divisible blocksizes. But: for CCPP the following is true, and I thought this was the same for IPD (because in my memory IPD also allocated the last block to the actual block length im, not imx - we didn’t change anything to the allocation of the GFS DDTs GFS_sfcprop, ...):
- Results are the same across all runs that modify the blocksize as long as the blocksize is uniform, for example:
  - run the model with a blocksize of 24, then change it to 32
  - as long as the blocksize is uniform (e.g. if the total number of gridpoints was imx = 96), results are the same
- Results are the same from run to run if the blocksize is non-uniform, but the results differ from (all the) uniform runs. For example:
  - run the model with a blocksize of 7, let’s say the blocksize is non-uniform with the last block being of size 5 (imx = 96)
  - results will remain the same as long as the blocksize doesn’t change (i.e. restart runs, omp threading, …)
  - results will be different from the runs with uniform blocksizes (24, 32 in the above example)
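The uniform versus non-uniform distinction in the example above comes down to simple arithmetic, sketched here. This is a one-dimensional simplification with hypothetical helper names; it is not how atmos_model.F90 defines its blocks.

```python
def block_sizes(npoints, blocksize):
    """Split npoints into consecutive blocks; the last one may be shorter."""
    full, rest = divmod(npoints, blocksize)
    return [blocksize] * full + ([rest] if rest else [])


def is_uniform(npoints, blocksize):
    """True when every block has the same length."""
    return npoints % blocksize == 0


imx = 96  # total number of gridpoints, as in the example above
for bs in (24, 32, 7):
    sizes = block_sizes(imx, bs)
    print(bs, len(sizes), sizes[-1], is_uniform(imx, bs))
# 24 4 24 True
# 32 3 32 True
# 7 14 5 False
```

With imx = 96, blocksizes 24 and 32 divide evenly, while blocksize 7 leaves a final block of 5 points.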
This has to do with AVX2 I believe … in debug mode it should be reproducible between uniform and non-uniform blocksizes.
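The thread does not say which AVX2 effect is responsible. One mechanism that can produce this pattern (an assumption here, not something confirmed above) is that an optimized build may evaluate the same expression with a fused multiply-add in one code path and with separate multiply and add instructions in another, for example in the vectorized part of a loop versus its remainder. The two roundings can differ in the last bits, so a column's result could depend on where it falls within a block. The sketch below emulates both evaluations with exact rational arithmetic; it illustrates the rounding effect only and does not model the compiler.

```python
from fractions import Fraction


def multiply_add_unfused(a, b, c):
    """a*b is rounded to double, then the sum is rounded again."""
    return a * b + c


def multiply_add_fused(a, b, c):
    """Emulated fused multiply-add: exact a*b + c, rounded once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(multiply_add_unfused(a, b, c))  # 0.0
print(multiply_add_fused(a, b, c) == -(2.0**-60))  # True
```

Here the exact product is 1 - 2^-60, which rounds to 1.0 as a double, so the unfused result is exactly zero while the fused result keeps the small remainder.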
Look for logic around “non_uniform_blocks” in atmos_model.F90 and CCPP_driver.F90.
Didn't you remove the AVX2 flags in the last several months? If so, then maybe it's all good now.
@climbfuji I will redo the tests and see if we can close the issue
@climbfuji @DeniseWorthen I tested the "control" test (GFSv16), and blocksize 32 and blocksize 5 now reproduce. Is this perhaps enough to close the issue? Or do you want me to also test coupled prototype 8 to be sure?
For good measure I also tried the cpld_control_p8 test, which now also reproduces between blocksize 32 and blocksize 5. You can see the test directories here:
For GFSv16:
- /scratch2/BMC/rem/Lisa.Bengtsson/stmp2/Lisa.Bengtsson/FV3_RT/TEST_BLOCKSIZE_GFSv16/control
- /scratch2/BMC/rem/Lisa.Bengtsson/stmp2/Lisa.Bengtsson/FV3_RT/TEST_BLOCKSIZE_GFSv16/control_blocksize_5

For UFS coupled prototype 8:
- /scratch2/BMC/rem/Lisa.Bengtsson/stmp2/Lisa.Bengtsson/FV3_RT/TEST_BLOCKSIZE_CPLD_P8/cpld_control_p8
- /scratch2/BMC/rem/Lisa.Bengtsson/stmp2/Lisa.Bengtsson/FV3_RT/TEST_BLOCKSIZE_CPLD_P8/cpld_control_p8_blocksize5
@DeniseWorthen @junwang-noaa @JessicaMeixner-NOAA we can close this issue and remove it from the UFS coupled prototype Wednesday tag-up notes.
@lisa-bengtsson Thanks for the extra effort of testing this in cpld_control_p8.
I made 4 runs with a C384L127 coupled model with "blocksize" of "32", "16", "5" and "4". Three runs, except for blocksize=5, were identical. My guess is that all blocks should have the same size.
If your runs reproduce in debug mode, it may be related to the "AVX2" compiler flag that Dom described above?
Hi Lisa, Something strange is going on in google mail. Old mails are being sent as new. I did not make any comment 6 hours ago. Moorthi
Ok, yes, I noticed some strange updates in my inbox as well. Hopefully that gets solved quickly.
That is a relief, I thought you, and a bunch of other EMC people were working all night. -Phil
-- Phil Pegion (he/him/his) Physical Scientist NOAA/Physical Sciences Laboratory (303) 497-7897 @.***