Update CSLAM Derecho performance issues in 2.2 release tag

Open adamrher opened this issue 1 year ago • 5 comments

Issue Type

Other (please describe below)

Issue Description

Derecho performance issues for CSLAM were resolved on the trunk (https://github.com/ESCOMP/CAM/pull/845?) but not on the 2.2 release tag. The fix is needed for running CSLAM with the 2.2 release tag.

There is also an I/O issue with mpich, which will be resolved with a new 2.2 cime tag @fvitt. Ideally that tag would be made first, and the externals would be updated to it as part of this issue as well.

Will this change answers?

I Don't Know

Will you be implementing this yourself?

Any CAM SE can do this

adamrher avatar Jan 09 '24 16:01 adamrher

Should we include the fix for https://github.com/ESCOMP/CAM/issues/876 as well? See PR https://github.com/ESCOMP/CAM/pull/878

fvitt avatar Jan 09 '24 17:01 fvitt

This is my cime PR: https://github.com/ESMCI/cime/pull/4559 And cime branch: https://github.com/fvitt/cime/tree/derecho_mods

fvitt avatar Jan 09 '24 19:01 fvitt

My colleague, using the cesm2.2.2 release tag, is seeing a slowdown of more than 2x just from turning on 2 tapes of 3-hourly output. These are 1/8deg SE var-res simulations.

w/o 3-hourly:

  Overall Metrics:
    Model Cost:           83631.84   pe-hrs/simulated_year
    Model Throughput:         0.33   simulated_years/day

w/ 3-hourly:

  Overall Metrics:
    Model Cost:          180784.32   pe-hrs/simulated_year
    Model Throughput:         0.15   simulated_years/day

That seems like an unreasonable slowdown to me. @fvitt should I ask him to try making these mods https://github.com/ESMCI/cime/pull/4559 ?
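
For reference, the size of the slowdown can be checked directly from the two summaries above. This is only a quick arithmetic sketch; the variable names are made up for illustration and the numbers are copied from the timing output in this comment.

```python
# Values copied from the two "Overall Metrics" summaries in this comment.
cost_without = 83631.84    # pe-hrs/simulated_year, no 3-hourly output
cost_with = 180784.32      # pe-hrs/simulated_year, two 3-hourly tapes
sypd_without = 0.33        # simulated_years/day, no 3-hourly output
sypd_with = 0.15           # simulated_years/day, two 3-hourly tapes

# How many times more expensive, and how many times slower, the run becomes.
cost_ratio = cost_with / cost_without
throughput_ratio = sypd_without / sypd_with

print(f"cost ratio:       {cost_ratio:.2f}x")
print(f"throughput ratio: {throughput_ratio:.2f}x")
```

Both ratios come out a little above 2, consistent with the "greater than 2X" description.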

adamrher avatar Mar 06 '24 19:03 adamrher

Yes, try those changes

fvitt avatar Mar 06 '24 19:03 fvitt

That did it! Thanks Francis.

w/ 3-hourly:

  Overall Metrics:
    Model Cost:           90887.91   pe-hrs/simulated_year
    Model Throughput:         0.30   simulated_years/day
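
As a rough check of what output now costs, the summaries can be parsed and compared against the no-output baseline reported earlier in this thread. This is a minimal sketch, not part of cime: `parse_metrics` and the regular expression are hypothetical helpers that assume the summary keeps the `Model Cost:` / `Model Throughput:` layout shown here.

```python
import re

# Hypothetical helper: matches the two metric lines of a timing summary.
METRIC_RE = re.compile(r"Model (Cost|Throughput):\s+([0-9.]+)")

def parse_metrics(text):
    """Return {'cost': ..., 'throughput': ...} from an Overall Metrics block."""
    return {name.lower(): float(value) for name, value in METRIC_RE.findall(text)}

# Baseline without 3-hourly output (numbers reported earlier in this thread).
baseline = parse_metrics("""
  Overall Metrics:
    Model Cost:           83631.84   pe-hrs/simulated_year
    Model Throughput:         0.33   simulated_years/day
""")

# With 3-hourly output after applying the cime mods.
fixed = parse_metrics("""
  Overall Metrics:
    Model Cost:           90887.91   pe-hrs/simulated_year
    Model Throughput:         0.30   simulated_years/day
""")

# Extra cost of the 3-hourly output relative to the baseline, in percent.
overhead_pct = 100.0 * (fixed["cost"] / baseline["cost"] - 1.0)
print(f"output overhead after the cime mods: {overhead_pct:.1f}%")
```

With these numbers the remaining overhead is under 10% of the baseline cost, compared with more than 100% before the change.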

adamrher avatar Mar 07 '24 20:03 adamrher