ClimaOcean.jl icon indicating copy to clipboard operation
ClimaOcean.jl copied to clipboard

Implementing checkpointer for coupled model

Open taimoorsohail opened this issue 1 year ago • 16 comments

This PR supersedes #381 ~adding a new run_coupled! function that replaces the run! function in Oceananigans.~ adds new methods so that checkpointing works for coupled ocean-sea ice simulations with PrescribedAtmosphere.

with @navidcy

taimoorsohail avatar Mar 11 '25 11:03 taimoorsohail

I don't agree with the design that creates a new function called run_coupled!. However, one could extend run! for a Simulation of OceanSeaIceModel. I'm not sure if there is a reason to keep both versions of run!.

glwagner avatar Mar 11 '25 15:03 glwagner

I don't agree with the design that creates a new function called run_coupled!. However, one could extend run! for a Simulation of OceanSeaIceModel. I'm not sure if there is a reason to keep both versions of run!.

Yes, we also agree. We don't like that either! We were playing around with extending run! but couldn't figure out a way to differentiate the methods...

Somehow this:

function run!(simulation::Simulation{<:OceanSeaIceSimulation}`; pickup)
    if pickup
        ... do extra things to pick up a coupled model
    end

    run!(simulation) # Oceananigans.Simulations.run!
    return nothing
end

wasn't working for us yesterday.

Don't worry too much atm, it's mostly playing around. When we settle or if we get stuck we can ping you.

navidcy avatar Mar 11 '25 21:03 navidcy

Codecov Report

Attention: Patch coverage is 0% with 39 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (9efec12) to head (9f9ca89).

Files with missing lines Patch % Lines
src/OceanSeaIceModels/ocean_sea_ice_model.jl 0.00% 21 Missing :warning:
src/OceanSeaIceModels/PrescribedAtmospheres.jl 0.00% 17 Missing :warning:
src/ClimaOcean.jl 0.00% 1 Missing :warning:
Additional details and impacted files
@@          Coverage Diff          @@
##            main    #401   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         40      40           
  Lines       2300    2329   +29     
=====================================
- Misses      2300    2329   +29     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Mar 12 '25 06:03 codecov[bot]

Linking issue #404 to this PR

taimoorsohail avatar Mar 12 '25 07:03 taimoorsohail

I don't agree with the design that creates a new function called run_coupled!. However, one could extend run! for a Simulation of OceanSeaIceModel. I'm not sure if there is a reason to keep both versions of run!.

Yes, we also agree. We don't like that either! We were playing around with extending run! but couldn't figure out a way to differentiate the methods...

Somehow this:

function run!(simulation::Simulation{<:OceanSeaIceSimulation}`; pickup)
    if pickup
        ... do extra things to pick up a coupled model
    end

    run!(simulation) # Oceananigans.Simulations.run!
    return nothing
end

wasn't working for us yesterday.

Don't worry too much atm, it's mostly playing around. When we settle or if we get stuck we can ping you.

To implement this, you need to first work on a PR in Oceananigans that exposes an interface for "picking up a simulation", which has a default implementation in Oceananigans, but which can be specialized for the type of Simulation. Then you can specialize for a simulation of OceanSeaIceModel here.

glwagner avatar Mar 12 '25 13:03 glwagner

After https://github.com/CliMA/ClimaOcean.jl/pull/401/commits/9ab0df109e3cb05aef99f2f44d98b8246857faf9 I can run the checkpointer_mwe.jl with Oceananigans 0.95.26 (not the branch from https://github.com/CliMA/Oceananigans.jl/pull/4216)!

navidcy avatar Mar 13 '25 13:03 navidcy

@glwagner can you review? if you are happy we can convert the checkpointer_mwe.jl into a test and we are good to go I think

navidcy avatar Mar 13 '25 22:03 navidcy

cc @NoraLoose

navidcy avatar Mar 13 '25 22:03 navidcy

Just want to write that @simone-silvestri and I are discussing how to generalize OceanSeaIceModel so that it is truly a coupled model without a "principle" component. This will require changing code that assumes the ocean model is the "source of truth". Just an FYI that if it is possible to avoid this assumption in the checkpointing code, it would be nice. Basically there should be symmetry among all the models, eventually. And we would want a set_clock! for the ocean too (which may in fact belong in Oceananigans). set_clock! for PrescribedAtmosphere does belong here, because this is where PrescribedAtmosphere is defined.

glwagner avatar Mar 14 '25 05:03 glwagner

Does set_clock! also need the previous time-step to be correct? Otherwise stateful time-steppers like AB2 will not be picked up correctly.

glwagner avatar Mar 14 '25 05:03 glwagner

OK, I tried to make the checkpointer not assuming that ocean model is special but I think it's a lot of work. Perhaps we can put it off until later? :(

I was thinking to define a checkpointer on the OceanSeaIceModel and then redefine write_output! and all other methods so that the checkpointer goes and writes all the required properties of all the OceanSeaIceModel and all its model components (eg atmosphere, ocean, etc). Then one need not assume that ocean is special.

I was envisioning something along the lines.

function write_output!(c::Checkpointer, model::OSIM)
    atmosphere = model.atmosphere
    ocean = model.ocean.model

    write_output!(c, model) # just saves the clock

    # deals with model components
    write_output!(c, atmosphere)
    write_output!(c, ocean)

    return nothing
end

navidcy avatar Mar 16 '25 20:03 navidcy

I don't know what's the best workflow to do this because it requires some tweaking of Oceananigans to expose some of the functionality of the Checkpointer which then has to be extended.

Shall we put this off for later after the above-mentioned redesign of the OceanSeaIceModel @glwagner?

navidcy avatar Mar 16 '25 20:03 navidcy

Does set_clock! also need the previous time-step to be correct? Otherwise stateful time-steppers like AB2 will not be picked up correctly.

Just checking

Does the set_clock! change 4110aca3ce4c0f6adf767d9fdaebd0be825504ef resolve this @navidcy ?

taimoorsohail avatar Mar 16 '25 23:03 taimoorsohail

Just note that I’m refactoring this. It’s almost there. But haven’t pushed yet.

navidcy avatar Mar 19 '25 02:03 navidcy

x-ref https://github.com/CliMA/Oceananigans.jl/pull/4250

navidcy avatar Mar 20 '25 04:03 navidcy

I plan to work on this early next week

navidcy avatar Apr 30 '25 08:04 navidcy