ClimaCoupler.jl
surface_fraction error with 2 GPUs
When we run simulations on 2 GPUs, we get an assertion error because the surface fractions don't sum to 1. This doesn't happen when running on 1 or 4 GPUs. See examples on buildkite here or here. We see this both when running on buildkite and when running via slurm directly on clima.
Stacktrace:
ERROR: LoadError: AssertionError: minimum((cs.surface_fractions.ice .+ cs.surface_fractions.land) .+ cs.surface_fractions.ocean) ≈ FT(1)
Stacktrace:
 [1] update_surface_fractions!(cs::CoupledSimulation{…})
   @ ClimaCoupler.Regridder /scratch/clima/slurm-buildkite/climacoupler-target-gpu-simulations/44/climacoupler-target-gpu-simulations/src/Regridder.jl:513
 [2] top-level scope
   @ /scratch/clima/slurm-buildkite/climacoupler-target-gpu-simulations/44/climacoupler-target-gpu-simulations/experiments/AMIP/coupler_driver.jl:559
in expression starting at /scratch/clima/slurm-buildkite/climacoupler-target-gpu-simulations/44/climacoupler-target-gpu-simulations/experiments/AMIP/coupler_driver.jl:559
When run interactively, we see that minimum((cs.surface_fractions.ice .+ cs.surface_fractions.land) .+ cs.surface_fractions.ocean) == 0. We expect this sum to always be exactly 1 at each point on the sphere.
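
The failing assertion only looks at the minimum of the sum, so it does not say how many points are affected or where they are. As a rough debugging aid, the sketch below lists the indices where the three fractions fail to sum to 1. It uses plain Julia arrays as hypothetical stand-ins for the surface-fraction fields, and the helper name offending_points is made up for illustration; it is not part of ClimaCoupler, and the real fields may need to be copied to the host first.

```julia
# Hypothetical helper (not a ClimaCoupler function): return the indices at
# which the three surface fractions do not sum to 1 within the default
# floating-point tolerance of `isapprox`.
function offending_points(ice, land, ocean)
    total = ice .+ land .+ ocean
    return findall(x -> !isapprox(x, one(eltype(total))), total)
end

# Stand-in data, assumed for illustration only.
FT = Float32
ice   = FT[0.0, 0.2, 0.0, 0.0]
land  = FT[1.0, 0.3, 0.0, 0.4]
ocean = FT[0.0, 0.5, 0.0, 0.6]

total = ice .+ land .+ ocean
bad = offending_points(ice, land, ocean)

println("min = ", minimum(total), ", max = ", maximum(total))
println("offending indices: ", bad)
```

In this toy data only the third point has all three fractions equal to zero, so it is the only index reported. If the same check on the 2-GPU run showed the bad points clustered in one region, that could hint at which part of the domain is affected, though that remains to be confirmed.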