eamxx: Default number of vertical levels is still 72, and we may want 128

Open ndkeen opened this issue 8 months ago • 6 comments

For certain tests (maybe ne4/ne30/ne120?), SCREAM still defaults to 72 vertical levels:

SCREAM_NUM_VERTICAL_LEV 72

Do we want to make this 128? And if so, should we keep some 72-level tests? I.e., we have eamxx-L128 modified, and could have eamxx-L72.
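As a rough sketch of what the switch amounts to, the snippet below rewrites the level count in an options string. It assumes the setting appears as a whitespace-separated `SCREAM_NUM_VERTICAL_LEV <n>` pair, as quoted above; the helper name `set_num_vertical_levels` is hypothetical and is not part of EAMxx or CIME.

```python
import re


def set_num_vertical_levels(options: str, nlev: int) -> str:
    """Return `options` with the SCREAM_NUM_VERTICAL_LEV value replaced by `nlev`.

    Hypothetical helper for illustration only; it assumes the option is
    stored as a "KEY VALUE" pair inside a whitespace-separated string.
    """
    pattern = r"(SCREAM_NUM_VERTICAL_LEV\s+)\d+"
    new_options, count = re.subn(pattern, rf"\g<1>{nlev}", options)
    if count == 0:
        raise ValueError("SCREAM_NUM_VERTICAL_LEV not found in options string")
    return new_options


current = "SCREAM_NUM_VERTICAL_LEV 72"
print(set_num_vertical_levels(current, 128))  # SCREAM_NUM_VERTICAL_LEV 128
```

The same helper could produce the 72-level variant for a test that intentionally stays at the old level count.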

ndkeen avatar May 12 '25 22:05 ndkeen

@crterai? I think the answer is "yes", right? We would want 128 levels by default for our ne256 runs too. It should be a simple change, although we would need to discuss whether we want to keep 72 levels for any of our nightly tests. If so, we'll need to update the test mods.

AaronDonahue avatar May 14 '25 16:05 AaronDonahue

Yes, 128 is the default number of levels for EAMxx, so it makes sense to make that the default. And I agree, we should check which tests still run with 72 levels by default and make sure that our tests won't all start failing or diffing once we switch the defaults.

crterai avatar May 19 '25 15:05 crterai

@crterai Does it make sense to use 128 also at ne30?

bartgol avatar May 19 '25 19:05 bartgol

> Does it make sense to use 128 also at ne30?

I think so. Back when we ran ne30 with 72 layers, we saw crashes with high wind speeds at the top of the model: https://acme-climate.atlassian.net/wiki/spaces/NGDNA/pages/3489497475/v1+ne30+High+wind+speeds+at+top+of+model It has been a while since then, but it sounds like the MAM4xx team might have encountered a similar issue recently? (pinging @kaizhangpnl) Eventually, as part of preparing for v4, we will be revisiting our vertical levels, though.

crterai avatar May 19 '25 23:05 crterai

Yes, the ne30pg2L72 model with MAM4xx has encountered some instability issues:

  • Strong winds at model top as shown by homme post-condition check: https://acme-climate.atlassian.net/wiki/spaces/EAMXX/pages/5162696850/ne30pg2L72+MAM4xx+2025-03-22+ARIACI
  • Negative temperature as shown by homme post-condition check, with warning "WARNING:CAAR: dp3d too small.": https://acme-climate.atlassian.net/wiki/spaces/EAMXX/pages/5241766348/ne30pg2L72+MAM4xx+2025-03-22+ACI
  • "Bad dphi, dp3d, or vtheta_dp" in dynamics, with warning "WARNING:CAAR: dp3d too small". https://acme-climate.atlassian.net/wiki/spaces/EAMXX/pages/5219713039/ne30pg2L72+MAM4xx+2025-05-07

I haven't tested ne30pg2L128 yet, since we don't have a MAM4xx initial condition file for this resolution.

kaizhangpnl avatar May 19 '25 23:05 kaizhangpnl

I am not opposed to making 128 the default everywhere, as long as there is a reason for it, and it seems there may be one.

Our testing should still cover a different number of levels, if nothing else to cover the case where the pack size (on CPU) or the warp size (on GPU) does not divide nlevs exactly.
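To make the divisibility point concrete, here is a small sketch that counts how many packs (or warps) a column of `nlevs` levels needs and how many lanes in the last one are padding. The widths 8, 16, and 32 are illustrative assumptions (32 is the warp size on NVIDIA GPUs; the CPU pack size depends on the build), and `pack_layout` is a hypothetical helper, not an EAMxx function.

```python
def pack_layout(nlevs: int, width: int) -> tuple[int, int]:
    """Return (packs needed, unused padding lanes in the last pack)."""
    num_packs = -(-nlevs // width)  # ceiling division
    padding = num_packs * width - nlevs
    return num_packs, padding


for nlevs in (72, 128):
    for width in (8, 16, 32):  # illustrative widths, not taken from the build
        packs, padding = pack_layout(nlevs, width)
        print(f"nlevs={nlevs:3d} width={width:2d} packs={packs:2d} padding={padding}")
```

With these widths, 128 levels never leaves a partially filled pack, while 72 levels does for widths 16 and 32. That partially filled last pack is the code path a 72-level test would keep exercising.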

bartgol avatar May 20 '25 14:05 bartgol

I can try to make a PR?

ndkeen avatar Sep 12 '25 19:09 ndkeen

Maybe we should take this opportunity to also update our L128 levels?

I have a suggestion for a replacement...

Image

whannah1 avatar Sep 12 '25 20:09 whannah1

> Maybe we should take this opportunity to also update our L128 levels?

We definitely want to update our vertical levels and likely raise the model top when we do that. Since we're already planning on doing that in the coming year and changing the model levels might disrupt the ongoing preparation for Cess2, I think we should just keep it the same for now.

When we do make the change, I would be in favor of having a profile that's smoother like v2.2.

crterai avatar Sep 12 '25 21:09 crterai