
Remove `MPICH_COLL_OPT_OFF` eupd runtime setting in `WCOSS2.env`

Open KateFriedman-NOAA opened this issue 1 year ago • 0 comments

Description

The `MPICH_COLL_OPT_OFF=1` setting in the eupd job on WCOSS2 generates a warning at runtime (but doesn't kill the job): `MPICH Warning: Conflicting env variables. Cannot use Shared Memory aware collectives`

This setting was added during the WCOSS2 GFSv16 operational port, after GDIT and the GSI code manager recommended it to ensure reproducibility with the `global_enkf.x` executable.

https://github.com/NOAA-EMC/global-workflow/blob/develop/env/WCOSS2.env#L163

Quick testing by @lgannoaa (removing this setting) showed bitwise identical output compared to the job run with the setting.

Run additional tests with and without this setting to further confirm reproducibility, and remove the setting from `WCOSS2.env` if the tests pass.
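One way to check bitwise reproducibility between the two runs is sketched below. It is only a sketch, not the procedure used in the quick test above: the function name `compare_runs` and both directory arguments are hypothetical, and it assumes each run wrote its output into its own directory. Point it at the job's output files only, since logs typically carry timestamps and would differ regardless.

```shell
# Hypothetical helper: compare two run directories file by file, byte for byte.
# $1 = output directory of the run WITH MPICH_COLL_OPT_OFF=1 (assumed path)
# $2 = output directory of the run WITHOUT the setting (assumed path)
# Returns 0 if both trees hold the same files with identical contents, 1 otherwise.
# Note: file names containing newlines are not handled.
compare_runs() {
  dir_with=$1
  dir_without=$2
  status=0
  list=$(mktemp)

  # Pass 1: every file from the first run must exist and match in the second.
  (cd "$dir_with" && find . -type f | sort) > "$list"
  while IFS= read -r rel; do
    if [ ! -f "$dir_without/$rel" ]; then
      echo "MISSING without setting: $rel"
      status=1
    elif ! cmp -s "$dir_with/$rel" "$dir_without/$rel"; then
      echo "DIFFERS: $rel"
      status=1
    fi
  done < "$list"

  # Pass 2: flag files that only the second run produced.
  (cd "$dir_without" && find . -type f | sort) > "$list"
  while IFS= read -r rel; do
    if [ ! -f "$dir_with/$rel" ]; then
      echo "MISSING with setting: $rel"
      status=1
    fi
  done < "$list"

  rm -f "$list"
  return "$status"
}
```

With placeholder paths, a call such as `compare_runs /path/to/run_with /path/to/run_without && echo "bitwise identical"` prints the confirmation only when no file is missing or different.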

Requirements

Output from the enkfgdas_update job must be reproducible without MPICH_COLL_OPT_OFF=1 set.

Acceptance Criteria (Definition of Done)

Output from the enkfgdas_update job is reproducible without MPICH_COLL_OPT_OFF=1 set.

KateFriedman-NOAA, Oct 05 '22 18:10