Add Antarctic Ice Sheet meshes for MALI
This PR introduces two new meshes of the Antarctic Ice Sheet for MALI:
- high-res production mesh: mpas.ais4to20km
- low-res testing mesh: mpas.ais8to30km
Associated with these MALI meshes are 5 new E3SM model_grids:
- TL319_oQU240wLI_ais8to30 - ultra low res mesh for testing GG cases with JRA forcing
- ne30pg2_r05_IcoswISC30E3r5_ais8to30 - v3 low res mesh for BG and IG cases with low res Antarctica
- TL319_IcoswISC30E3r5_ais8to30 - v3 low res mesh for GG cases with JRA forcing with low res Antarctica
- ne30pg2_r05_IcoswISC30E3r5_ais4to20 - v3 low res mesh for BG and IG cases with high res Antarctica
- TL319_IcoswISC30E3r5_ais4to20 - v3 low res mesh for GG with JRA forcing cases with high res Antarctica
This PR was previously discussed at https://github.com/E3SM-Ocean-Discussion/E3SM/pull/97
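Once merged, these grids can be exercised with the standard CIME workflow; for example (the case name here is just a placeholder, and the compset alias is the one used in the testing below):
cd cime/scripts
./create_newcase --case ais8to30_test --compset MPAS_LISIO_JRA1p5 --res TL319_oQU240wLI_ais8to30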
@matthewhoffman and @jonbob, I rebased onto master, fixed up conflicts, and tried to run:
./create_test --wait --walltime 1:00:00 SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf
and am seeing:
Errors were:
env_batch.xml appears to have changed, regenerating batch scripts
manual edits to these file will be lost!
wget failed with output: and errput --2024-05-28 12:36:09-- https://web.lcrc.anl.gov/public/e3sm/inputdata/glc/mpasli/mpas.ais8to30km/ais8to30km.20231222.nc
Resolving web.lcrc.anl.gov (web.lcrc.anl.gov)... 140.221.70.30
Connecting to web.lcrc.anl.gov (web.lcrc.anl.gov)|140.221.70.30|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-05-28 12:36:10 ERROR 404: Not Found.
ERROR: Could not find all inputdata on any server
I believe the issue is that the file is called ais_8to30km_20231222.nc instead of ais8to30km.20231222.nc. I'm not sure if we want to fix the filename or the branch.
The same problem exists with the 4to20km mesh:
$ ls /lcrc/group/e3sm/public_html/inputdata/glc/mpasli/mpas.ais4to20km/
ais_4to20km_20230105.nc
...
This should be ais4to20km.20230105.nc.
I think we need to rename the files because I don't think there's support for having the prefix and datestamp be separated by an underscore.
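If we go the renaming route, it should just be a few mv commands on the inputdata server, something like the following for the 8to30km mesh (adjusting to whatever final convention we settle on, and similarly for the 4to20km files):
cd /lcrc/group/e3sm/public_html/inputdata/glc/mpasli/mpas.ais8to30km
mv ais_8to30km_20231222.nc ais_8to30km.20231222.nc
mv ais_8to30km_20231222.regionMask_ismip6.nc ais_8to30km.20231222.regionMask_ismip6.nc
mv ais_8to30km_20231222.scrip.nc ais_8to30km.20231222.scrip.nc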
@matthewhoffman, I think you need to make all the files in inputdata group-readable and group-writable:
$ pwd
/lcrc/group/e3sm/public_html/inputdata/glc/mpasli/mpas.ais8to30km
$ ls -lah
total 177M
drwxr-sr-x 2 ac.mhoffman E3SM 4.0K May 7 10:54 .
drwxrwsr-x 10 jacob E3SM 4.0K May 4 08:32 ..
-rw-r----- 1 ac.mhoffman E3SM 148M May 4 08:34 ais_8to30km_20231222.nc
-rw-r----- 1 ac.mhoffman E3SM 6.1M May 4 08:34 ais_8to30km_20231222.regionMask_ismip6.nc
-rw-r----- 1 ac.mhoffman E3SM 17M May 4 08:34 ais_8to30km_20231222.scrip.nc
-rw-r----- 1 ac.mhoffman E3SM 3.4M May 4 08:34 mpasli.graph.info.240507
-rw-r--r-- 1 ac.mhoffman E3SM 377K May 7 10:51 mpasli.graph.info.240507.part.1024
-rw-r--r-- 1 ac.mhoffman E3SM 302K May 4 08:38 mpasli.graph.info.240507.part.128
-rw-r--r-- 1 ac.mhoffman E3SM 425K May 4 08:38 mpasli.graph.info.240507.part.1920
-rw-r--r-- 1 ac.mhoffman E3SM 341K May 4 08:38 mpasli.graph.info.240507.part.240
-rw-r--r-- 1 ac.mhoffman E3SM 343K May 4 08:38 mpasli.graph.info.240507.part.256
-rw-r--r-- 1 ac.mhoffman E3SM 453K May 4 08:38 mpasli.graph.info.240507.part.3840
-rw-r--r-- 1 ac.mhoffman E3SM 363K May 4 08:38 mpasli.graph.info.240507.part.480
-rw-r--r-- 1 ac.mhoffman E3SM 364K May 4 08:38 mpasli.graph.info.240507.part.512
-rw-r--r-- 1 ac.mhoffman E3SM 274K May 4 08:37 mpasli.graph.info.240507.part.64
-rw-r--r-- 1 ac.mhoffman E3SM 374K May 4 08:38 mpasli.graph.info.240507.part.960
You also need to make everything world-readable so it can be downloaded from the inputdata server.
Please run:
cd /lcrc/group/e3sm/public_html/inputdata/glc/mpasli
chmod -R ug+rwX mpas.ais8to30km mpas.ais4to20km
chmod -R o+rX mpas.ais8to30km mpas.ais4to20km
@xylar -- thanks for taking on the initial testing. There are a bunch of mapping files I need to make before any of the new resolutions can work, and there may also be issues with file permissions.
@jonbob, I think everything you need is readable, so go ahead with the mapping files. I don't think you need to add anything to the read-only directory for that purpose, so you should be good until @matthewhoffman is able to change permissions.
@matthewhoffman, the Slack bot is bugging me about this one. Have you had a chance to change permissions and fix the issues I pointed out above?
@xylar and @jonbob , sorry about the permissions issue. I have updated the permissions on that directory, so try again and let me know if you have any further problems.
@matthewhoffman, it doesn't look like the filenames have been fixed, see https://github.com/E3SM-Project/E3SM/pull/6440#issuecomment-2135792045 and https://github.com/E3SM-Project/E3SM/pull/6440#issuecomment-2135795364 above. Could you take care of that, too?
@matthewhoffman, also, could you rebase onto master to resolve conflicts?
@matthewhoffman, I think something you did to update this branch (probably a rebase) also took out an earlier commit that added the ocn_glcshelf test.
Can you make sure you can run the following on Chrysalis and ping me to re-review after that works for you?
./create_test --wait --walltime 1:00:00 SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf
@xylar , sorry about all the little issues in this PR and for letting it sit for so long. I've looked through it all and made the following changes:
- the ais mesh naming convention has been fixed and now follows that of the gis meshes. This required both a change to the filenames on the inputdata server (replacing the _ between mesh name and date with a .) and a change in this PR to include an underscore between 'ais' and the resolution
- the ocn_glcshelf test commit was actually in this other PR, which has already been merged: https://github.com/E3SM-Project/E3SM/pull/6437/commits So I have rebased this PR so that it is available, which also resolves the conflicts with master.
With these changes, I ran ./create_test --wait --walltime 1:00:00 SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf and got this error:
Errors were:
Building test for SMS in directory /lcrc/group/e3sm/ac.mhoffman/scratch/chrys/SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf.20240626_125417_w8meoq
WARNING: Should be running with salinity restoring on!
But no file available for this grid.
ERROR: /gpfs/fs1/home/ac.mhoffman/e3sm-gis/E3SM-ais-meshes/share/build/buildlib.csm_share FAILED, cat /lcrc/group/e3sm/ac.mhoffman/scratch/chrys/SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf.20240626_125417_w8meoq/bld/csm_share.bldlog.240626-125429
Do salinity restoring files need to be created for the oQU240wLI mesh?
In the meantime, @jonbob , you could proceed to generate the mapping files that will be needed for this PR.
@matthewhoffman, I think there must be other errors besides salinity restoring. You will get complaints but the test should run.
@xylar , you are right, sorry about that. The error seems to be related to directives in building PIO-related code, which is not something we are touching in this PR.
/gpfs/fs1/home/ac.mhoffman/e3sm-gis/E3SM-ais-meshes/share/util/mct_mod.F90(825): remark #5140: Unrecognized directive
!DIR$ PREFERVECTOR
------------------^
/gpfs/fs1/home/ac.mhoffman/e3sm-gis/E3SM-ais-meshes/share/util/shr_pio_mod.F90(734): error #6404: This name does not have a type, and must have an explicit type. [PIO_REARR_ANY]
pio_rearranger .ne. PIO_REARR_ANY) then
----------------------------^
compilation aborted for /gpfs/fs1/home/ac.mhoffman/e3sm-gis/E3SM-ais-meshes/share/util/shr_pio_mod.F90 (code 1)
gmake[2]: *** [CMakeFiles/csm_share.dir/build.make:376: CMakeFiles/csm_share.dir/util/shr_pio_mod.F90.o] Error 1
I'll experiment with a few other test definitions to see if this is a pervasive error.
@matthewhoffman, any chance this is a submodule issue? Maybe try a fresh clone or worktree?
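If it is a submodule issue, something like this from the top of the repo (or in a fresh clone or worktree) usually clears it up -- just a suggestion, your usual workflow may differ:
git submodule sync
git submodule update --init --recursive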
I was able to build successfully but I'm getting a segfault at runtime:
5: ==== backtrace (tid:2007194) ====
5: 0 0x0000000000012cf0 __funlockfile() :0
5: 1 0x00000000020512ff mpas_rbf_interpolation_mp_mpas_rbf_interp_func_3d_plane_vec_const_dir_comp_coeffs_.A() /lcrc/group/e3sm/ac.xylar/scratch/chrys/SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf.20240627_022044_tr9lrp/bld/cmake-bld/operators/mpas_rbf_interpolation.f90:1700
5: 2 0x000000000206edd0 mpas_vector_reconstruction_mp_mpas_init_reconstruct_.A() /lcrc/group/e3sm/ac.xylar/scratch/chrys/SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf.20240627_022044_tr9lrp/bld/cmake-bld/operators/mpas_vector_reconstruction.f90:176
5: 3 0x0000000001c829a4 li_core_mp_li_core_init_.A() /lcrc/group/e3sm/ac.xylar/scratch/chrys/SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf.20240627_022044_tr9lrp/bld/cmake-bld/core_landice/mode_forward/mpas_li_core.f90:1030
5: 4 0x0000000001c58b4a glc_comp_mct_mp_glc_init_mct_() /lcrc/group/e3sm/ac.xylar/scratch/chrys/SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf.20240627_022044_tr9lrp/mpas-albany-landice/driver/glc_comp_mct.f90:531
5: 5 0x0000000000451e54 component_mod_mp_component_init_cc_() /gpfs/fs1/home/ac.xylar/e3sm_work/E3SM/matthewhoffman/mali/ais-meshes/driver-mct/main/component_mod.F90:257
5: 6 0x000000000043e74c cime_comp_mod_mp_cime_init_() /gpfs/fs1/home/ac.xylar/e3sm_work/E3SM/matthewhoffman/mali/ais-meshes/driver-mct/main/cime_comp_mod.F90:1518
5: 7 0x000000000044eb4a MAIN__() /gpfs/fs1/home/ac.xylar/e3sm_work/E3SM/matthewhoffman/mali/ais-meshes/driver-mct/main/cime_driver.F90:122
5: 8 0x000000000041b1a2 main() ???:0
5: 9 0x000000000003ad85 __libc_start_main() ???:0
5: 10 0x000000000041b0ae _start() ???:0
See
/lcrc/group/e3sm/ac.xasay-davis/scratch/chrys/SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf.20240627_022044_tr9lrp/
I'll try rerunning in debug mode to see if that provides any helpful info.
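For reference, that is just the debug (SMS_D) variant of the same test:
./create_test --wait --walltime 1:00:00 SMS_D_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf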
In debug mode:
/lcrc/group/e3sm/ac.xasay-davis/scratch/chrys/SMS_D_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf.20240627_033510_umnvw1
I'm seeing:
56: forrtl: severe (408): fort: (3): Subscript #2 of the array VERTICESONEDGE has value 0 which is less than the lower bound of 1
56:
56: Image PC Routine Line Source
56: libpnetcdf.so.3.0 000015554B9391A2 for_emit_diagnost Unknown Unknown
56: e3sm.exe 00000000065FF55F li_mesh_mp_meshsi 584 mpas_li_mesh.f90
56: e3sm.exe 00000000065FC22C li_mesh_mp_li_mes 416 mpas_li_mesh.f90
56: e3sm.exe 0000000006097154 li_core_mp_li_cor 226 mpas_li_core.f90
56: e3sm.exe 000000000601F4BF glc_comp_mct_mp_g 531 glc_comp_mct.f90
56: e3sm.exe 000000000048BDDF component_mod_mp_ 257 component_mod.F90
56: e3sm.exe 000000000042EC9B cime_comp_mod_mp_ 1518 cime_comp_mod.F90
56: e3sm.exe 000000000048286C MAIN__ 122 cime_driver.F90
56: e3sm.exe 000000000041ABA2 Unknown Unknown Unknown
56: libc-2.28.so 0000155545281D85 __libc_start_main Unknown Unknown
56: e3sm.exe 000000000041AAAE Unknown Unknown Unknown
Presumably, this error is getting caught before the other one, so it will need to be handled before we get to the error in the RBF reconstruction. It seems like verticesOnEdge is being accessed with an invalid index, so perhaps a check is needed?
@xylar , yes, that makes sense - I had forgotten to update the submodules after rebasing. I can look at these MALI errors. I'm confused why they are showing up in this test and haven't shown up in other tests given this is not introducing any new functionality in MALI.
@matthewhoffman, I can't speak to the vector reconstruction error but regarding the out-of-bounds indexing, do you maybe not have any tests that are compiled in debug mode? If not, maybe that one has just gone uncaught?
I added the mapping files to this branch and have staged them in their corresponding locations in the LCRC local inputdata directory for testing.
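For anyone following along: these maps are built from the SCRIP files staged with the meshes (e.g. the ais_8to30km scrip file listed above). The actual E3SM workflow has its own wrappers around this, but conceptually each map is generated with something like the following, where the file names are placeholders:
ESMF_RegridWeightGen --method conserve --ignore_unmapped --source ais_scrip_file.nc --destination destination_grid_scrip.nc --weight map_ais_to_destination.nc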
@xylar -- your test may have failed because there were no mapping files specified in config_grids. I'm running something similar right now
Aaaah! It would be nice if E3SM gave an error that hinted a bit more in that direction, but that certainly does sound like a good reason why I was having problems!
@xylar -- ah, I think E3SM would have thrown an error if a mapping file had been defined but missing; these mapping-file entries were defined but blank, so there was no file to even look for... But I was incorrect, and that's not why your test failed. I ran something similar and saw the same error in the e3sm log. MALI, though, was throwing errors about not having xtime=0.0 in the file, and somehow that's the error e3sm ended up with. Anyway, @matthewhoffman -- I think in general we remove xtime from these files. I tested with xtime removed and got a new error about nEdgesOnCell being greater than maxEdges. I checked values in the initial file and saw this:
ncks -H -d nCells,1 -v nEdgesOnCell ais_8to30km.20231222-no-xtime.nc | more
netcdf ais_8to30km.20231222-no-xtime {
dimensions:
nCells = 1 ;
variables:
int nEdgesOnCell(nCells) ;
data:
nEdgesOnCell = 1072693248 ;
} // group /
so something is very wrong with ais_8to30km.20231222.nc
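For reference, stripping xtime as above is just an NCO one-liner, e.g.:
ncks -O -x -v xtime ais_8to30km.20231222.nc ais_8to30km.20231222-no-xtime.nc
though that obviously doesn't fix the underlying problem with the file.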
@jonbob and @xylar , thanks for bringing these issues to my attention. As discussed, @trhille and I are working on an updated 4km initial condition. I'll update the PR with that file and investigate the 8km file issue after returning from the SciDAC meeting next week (or possibly while I'm there).
I’ve re-evaluated this PR carefully, and I ended up finding issues with both the 8km and 4km mesh files.
For the 8km mesh, I realized I had introduced a mesh that had not completed QAQC testing, so I rolled back to the previous version of the mesh that we had used extensively for ISMIP6-2300. The ice thickness initial condition makes it overly prone to Thwaites Glacier retreat, but given the purpose of this mesh is testing, that’s not really a concern (and maybe it’s actually useful).
For the 4km mesh, two issues have been fixed:
- Trevor, Xylar, and I identified an issue with artifacts in ice-shelf thickness in the Amundsen Sea sector due to an error in how we had stitched some datasets together. This does not affect the ability of MALI to run, but it does introduce unnecessary complications into generating consistent MPAS-Ocean ice-shelf cavities. Given that is an ongoing goal of introducing these meshes, we decided to resolve that in this PR, and the 4km mesh file has now been updated with a version that eliminates the problem.
- I also discovered that the decomposition files for the 4km mesh were somehow incorrect, so I’ve updated those on the inputdata server (see the note just below on how these are regenerated).
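(For reference, these partition files follow the standard METIS naming, so regenerating one is typically just a call like, e.g. for the 8to30km graph file listed above:
gpmetis mpasli.graph.info.240507 1024
which writes mpasli.graph.info.240507.part.1024 -- assuming gpmetis is indeed the tool used here.)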
With these updates, I ran tests for all the new meshes:
./create_test --wait --walltime 1:00:00 SMS_Ld5.TL319_oQU240wLI_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf
PASSED
./create_test --wait --walltime 1:00:00 SMS_Ld5.TL319_IcoswISC30E3r5_ais8to30.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf
PASSED
./create_test --wait --walltime 1:00:00 SMS_Ld5.TL319_IcoswISC30E3r5_ais4to20.MPAS_LISIO_JRA1p5.chrysalis_intel.mpaso-ocn_glcshelf
FAILED - MALI CFL ERROR
The failure on the last test is because it attempts to use the SIA solver at relatively high resolution, including ice shelves, which is not a good idea and is likely to give unrealistic velocities that trigger a CFL error. It got far enough (into the first GLC coupling interval) that I think the mesh itself is working fine and I just need to come up with a better test. I also need to come up with tests for the two model_grids being added for B-cases. I will coordinate with @jonbob on both of these issues.
With the addition of a FOLISIO compset, I was now able to successfully test the third grid, which had failed in my previous testing:
./create_test --wait --walltime 1:00:00 SMS_Ld5.TL319_IcoswISC30E3r5_ais4to20.MPAS_FOLISIO_JRA1p5.chrysalis_gnu.mpaso-ocn_glcshelf
PASSED
The remaining task is to demonstrate successful tests of the two remaining grids added in this PR, which will require B-cases.
I've successfully tested the two fully-coupled grid specifications:
./create_test --wait --walltime 1:00:00 SMS_Ld5.ne30pg2_r05_IcoswISC30E3r5_ais8to30.BGWCYCL1850.chrysalis_gnu
PASS
./create_test --wait --walltime 1:00:00 SMS_Ld5.ne30pg2_r05_IcoswISC30E3r5_ais4to20.BGWCYCL1850.chrysalis_gnu
PASS
with a couple of caveats:
- I had to do a temporary merge of https://github.com/E3SM-Project/E3SM/pull/6514 to get FO MALI with Albany available (needed to utilize the existing BG compset definition, which requires Albany)
- I could not enable the ocn-glcshelf coupling in these tests because the testMod that supports that also includes modifications specific to a JRA G-case (https://github.com/E3SM-Project/E3SM/blob/master/components/mpas-ocean/cime_config/testdefs/testmods_dirs/mpaso/ocn_glcshelf/shell_commands)
But this confirms that these new AIS meshes work in a B-case and the mapping files are correct.
(thanks, @jonbob , for helping me sort through all these details!)
@xylar , do you think we should deal with enabling ocn_glcshelf coupling for B-cases in this PR? My preference is to leave it out, as there will likely be other adjustments we need to make to support B-cases with ice-shelf coupling, but I'm open to addressing it here. If the answer is no, then I have completed all necessary updates and testing and this PR is ready for re-review from @jonbob and @xylar .
@matthewhoffman, no, I agree that we need to leave that for another PR at a later time.
Update: I think the existing BG test cases would already test melt fluxes in coupled mode if you were to run with PISMF (see below). That's the case for CRYO configurations other than with IcoswISC30E3r5, but not for WCYCL.