ne30 packsize=1 crash with NH on

Open ambrad opened this issue 3 years ago • 7 comments

After NH mode was turned on, I started seeing

113: [DIRK] WARNING! Newton reached max iteration count, with deltaerr = nan

immediately in ne30 runs on the CPU with pack size set to 1.

ambrad avatar Jun 06 '22 23:06 ambrad

This is on Chrysalis. With Intel, I see this error for ne30 with out-of-the-box settings except for pack size 1. With GNU, I haven't yet been able to reproduce this error; every configuration I've tried runs without a problem.

ambrad avatar Jun 07 '22 17:06 ambrad

I was also going to report this, but wanted to first try repeating on cori-knl with Intel. I can run with Intel using default packsize.

Yep, same error on cori-knl with Intel. /global/cscratch1/sd/ndk/e3sm_scratch/cori-knl/se08-jun6/f30cpu.F2010-SCREAMv1.ne30_ne30.se08-jun6.intel.24s.n011b64x2.pack1.dd

6402: forrtl: error (76): Abort trap signal
6402: Image              PC                Routine            Line        Source
6402: e3sm.exe           0000000003E1F2F4  Unknown               Unknown  Unknown

@ndkeen

ndkeen avatar Jun 07 '22 17:06 ndkeen

Just to add: I saw this error when I was running an unstable namelist in HOMME. Yesterday, I opened the logs for some of the Summit runs and did not see messages like this. Are they in the e3sm log?

oksanaguba avatar Jun 07 '22 17:06 oksanaguba

Yes, they are in e3sm.log. On Summit, you're using GNU for all runs, right? GNU is fine so far on Chrysalis.

ambrad avatar Jun 07 '22 17:06 ambrad

Intel debug build, all other settings the same, runs without a problem.

ambrad avatar Jun 07 '22 17:06 ambrad

How can I turn off NH to use what we had before?

ndkeen avatar Jun 14 '22 17:06 ndkeen

./atmchange theta_hydrostatic_mode=False

ambrad avatar Jun 14 '22 18:06 ambrad