
Perturbed Parameter Experiment (PPE) branch

Open ekluzek opened this issue 5 years ago • 17 comments

Description of changes

This is the start of the branch for the Perturbed Parameter Experiment work. It includes the Arctic LUNA work with Kattge from @lmbirch, as well as work on more parameters from @olyson. It will also include the CN-Matrix and soil matrix solution work from PR #640.

See many notes on the parameters that will be experimented with:

https://docs.google.com/spreadsheets/d/1OtkaO_uAmafWKR9kgtRC2Ge6d6fkhymngSpben5SJ_Q

Specific notes

Contributors other than yourself, if any: @lmbirch, @wwieder, @olyson, @djk2120

Many others are contributing to the simulation framework and scripting.

Are answers expected to change (and if so in what way)? Yes (ctsm5_1)

Any User Interface Changes (namelist or namelist defaults changes)? New params file

Testing performed, if any: The individual branches have had standard testing run on them.

ekluzek avatar Oct 29 '20 22:10 ekluzek

@ekluzek my understanding is that the individual branches making up the PPE branch are each going to come to master separately, rather than the PPE branch as a whole being merged. Is that correct? (I guess it would also work to add some commits directly to the PPE branch, to eventually merge to master, but it would be awkward to bring those in until all of the underlying branches are first merged to master.)

billsacks avatar Oct 29 '20 22:10 billsacks

@billsacks yes that is correct. I just made this a PR as it makes it visible.

ekluzek avatar Oct 29 '20 23:10 ekluzek

Sounds good. This will also provide a nice confirmation: Once all of the individual branches are merged to master, then I think this PR should show that there are no changes on this branch relative to master.

billsacks avatar Oct 29 '20 23:10 billsacks

Testing on cheyenne ran as expected with everything identical except the clm5_1 tests.

ekluzek avatar Oct 30 '20 17:10 ekluzek

woo!


wwieder avatar Oct 30 '20 19:10 wwieder

With latest updates everything is working with the exception of this tracer test where tracers don't match bulk...

LWISO_Ld10.f10_f10_musgs.I2000Clm50BgcCrop.cheyenne_gnu.clm-coldStart

ekluzek avatar Feb 03 '21 06:02 ekluzek

The change causing the LWISO problem is the latest commit, 1620d027149bbfe238dcdab0648beeed6ad4b0c9. It includes changes in WaterDiagnosticBulkType.F90 for 5-day snow height. When I check out the version of src just before that commit, the LWISO test passes. So a version without the diagnostic bulk changes would likely function correctly.

ekluzek avatar Feb 03 '21 08:02 ekluzek

@ekluzek from poking around in your directories, I see this:

69:ERROR in CompareBulkToTracer: tracer does not agree with bulk water
69:Called from: after downscale_forcings
69:Variable: ice1_grc
69:First difference at index: 271
69:Bulk  :  -0.11072109471515731-312
69:Tracer:  -0.14821969375237396-322
69:ratio:   0.10000000000000000E-09
69:Bulk*ratio:  -0.98813129168249309-323

Is that the error you're seeing? If so, it looks like there are some garbage values somewhere, even for the bulk water. The commit you pointed to looks like it's just adding a couple of restart variables, which shouldn't have any impact on this test because it's a cold start test that doesn't do any restart.

billsacks avatar Feb 03 '21 16:02 billsacks

Just to add a bit to the explanation: it's not surprising that the tracer consistency test fails for numbers this tiny: the number of representable digits in a double precision number begins to decrease for values smaller than 10^-308. So the fundamental issue here seems to be that there is some tiny value for total gridcell ice; I assume this is an error. Let me know if you'd like help tracking this down.
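
To illustrate with the numbers from the log above, here is a minimal sketch in Python (not the model's Fortran, and not the actual CompareBulkToTracer check). It assumes the printed values are ordinary doubles whose exponent letter was dropped by the Fortran output format, so `-0.11072109471515731-312` is read as `-0.11072109471515731e-312`. The helper name `in_subnormal_steps` is made up for this example.

```python
import sys

# Smallest positive subnormal double, 2**-1074 (about 4.94e-324).
# Every double below about 2.2e-308 is a whole multiple of this step.
SMALLEST_SUBNORMAL = sys.float_info.min * sys.float_info.epsilon


def in_subnormal_steps(x):
    """Express x as a multiple of the smallest subnormal double (hypothetical helper)."""
    return x / SMALLEST_SUBNORMAL


# Values copied from the log, with the dropped "E" restored (assumption).
bulk = -0.11072109471515731e-312
tracer = -0.14821969375237396e-322
ratio = 0.10000000000000000e-09

# What the tracer "should" be if it tracked bulk water exactly.
expected_tracer = bulk * ratio

print(in_subnormal_steps(expected_tracer))  # -2.0
print(in_subnormal_steps(tracer))           # -3.0

# Relative mismatch between the tracer and bulk*ratio.
rel_diff = abs(tracer - expected_tracer) / abs(expected_tracer)
print(rel_diff)                             # 0.5
```

Under that reading, both the tracer and bulk*ratio are only 3 and 2 steps away from zero, so a single-step rounding difference already shows up as a 50% relative mismatch. No tolerance-based comparison can be meaningful at this magnitude, which is why the tiny ice value itself is the thing to track down.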

billsacks avatar Feb 03 '21 16:02 billsacks

Yes, that is the error. I'll get started on it, and we can talk more tomorrow.

ekluzek avatar Feb 03 '21 17:02 ekluzek

I've updated to ctsm5.1.dev023 now, and all tests are passing as expected for aux_clm on both cheyenne and izumi. Some tests show differences from the ctsm5.1.dev023 baselines: clm51 because of Leah's Arctic changes and the paramfile changes, while clm45 and clm50 appear to have answer changes only because of new diagnostic fields (the iWUE and VPD_2m fields). So this is all as expected.

ekluzek avatar Feb 18 '21 23:02 ekluzek

@wwieder and @slevisconsulting commit 0e7aa4dc5d5d6e545451d13f1585b573952c97db brings in the changes for CWD HR. If you'd like to look it over to validate it, that would be great. The test seems to work with matrix both on and off.

ekluzek avatar Jul 01 '21 15:07 ekluzek

I'm working on updating this to work on Derecho. So I have changes that will go on the branch, and then I'll tag it and push the tag to escomp.

I could create a PR to this PR for visibility. Since we need to make a few tags on PPE, I think I'll go ahead and do that. This means doing a PR in my fork, rather than in ESCOMP.

ekluzek avatar Jan 24 '24 16:01 ekluzek

Two caveats on the update to Derecho for PPE: threading is extremely slow and shouldn't be used, so we won't test it; and the gnu compiler is unavailable on Derecho for the versions of externals we'll be using here.

When the rest of the needed changes on the PPE branch come to main (CN-Matrix, and a few source mod changes), these things will be working there. So these are just issues for this older branch.

Also, we will close this PR once those changes come to main. The branch will still exist, but people will have to know about it rather than find it in a PR.

ekluzek avatar Jan 24 '24 17:01 ekluzek

One more caveat -- the NUOPC tests are failing. So I am going to remove them and have NUOPC broken on the branch.

ekluzek avatar Jan 24 '24 19:01 ekluzek