RXCROPMATURITYSKIPGEN test can fail when the fsurdat file path is longer than 255 characters
Brief summary of bug
RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput
fails because the path of the fsurdat file it creates for the case is too long. The path includes the test name, which is long; a surface dataset filename, which is fairly long; the test-id; and the path to scratch. This failed for me with a test-id that used the temporary branch tag name, which is longer than a standard CTSM tag name.
One way to remove this possibility is to increase the allowed character length for filenames. We are using 255 right now because there was a limitation in the past, and the code currently has lots of len=256 declarations for filenames.
General bug information
CTSM version you are using: branch_tags/tmp-241219.n01.ctsm5.3.016-23-g1224d97a7
Does this bug cause significantly incorrect results in the model's science? no
Configurations affected: Tests with too long of a filename for fsurdat
Fails in the build-namelist step
Important output or errors that show the problem
Dies in build-namelist with:
2025-01-10 10:23:17: ERROR: Command /glade/work/erik/ctsm_worktrees/newbranch/bld/build-namelist failed rc=255
out=
err=ERROR : CLM build-namelist::CLMBuildNamelist::process_namelist_infile() : Invalid namelist variable in '-infile' /glade/derecho/scratch/erik/tests_tmp241219n1ctsm5316erikb4bacl/RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput.GC.tmp241219n1ctsm5316erikb4bacl_int.gddgen/Buildconf/clmconf/namelist.
ERROR: in validate_variable_value (package Build::Namelist): Variable name fsurdat has a string element that is too long: '/glade/derecho/scratch/erik/tests_tmp241219n1ctsm5316erikb4bacl/RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput.GC.tmp241219n1ctsm5316erikb4bacl_int.gddgen/surfdata_10x15_hist_1850_78pfts_c240908.all_crops_everywhere.nc'
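As a rough illustration of why this path trips the check, here is a minimal Python sketch that reassembles the fsurdat path from the pieces visible in the error above and compares its length to the 255-character limit. The helper `path_overflow` is hypothetical and is not part of build-namelist; the path components are copied from the error message.

```python
import posixpath


def path_overflow(path: str, limit: int = 255) -> int:
    """Return how many characters `path` exceeds `limit` by (0 if it fits).

    Hypothetical helper for illustration; the 255 default is the limit
    described in this issue.
    """
    return max(0, len(path) - limit)


# Components copied from the error message in this issue.
scratch = "/glade/derecho/scratch/erik/tests_tmp241219n1ctsm5316erikb4bacl"
case_dir = (
    "RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop."
    "derecho_intel.clm-cropMonthOutput.GC."
    "tmp241219n1ctsm5316erikb4bacl_int.gddgen"
)
filename = "surfdata_10x15_hist_1850_78pfts_c240908.all_crops_everywhere.nc"

fsurdat = posixpath.join(scratch, case_dir, filename)

# Report the total length and how far over the limit it is.
print(f"length={len(fsurdat)} overflow={path_overflow(fsurdat)}")
```

Note that the test-id appears twice in the path (once in the tests directory and once in the case directory), so each extra character in the test-id adds two characters to the total.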
We've brought in changes that allow filename lengths to be 512, so we shouldn't see this anymore. So closing.
Actually, because of #3590 this isn't really resolved and it now fails at runtime rather than preview_namelist time.
cesm.log:
dec0408.hsn.de.hpc.ucar.edu 0: (t_initf) profile_papi_enable= F
dec0408.hsn.de.hpc.ucar.edu 0: Abort with message Specified netCDF file does not exist. in file /glade/derecho/scratch/csgteam/temp/spack/derecho/24.12/builds/spack-stage-parallelio-2.6.6-5oefk2r5g2vtjc6igyiallo5pe5q6zep/spack-src/src/clib/pioc_support.c at line 2857
dec0408.hsn.de.hpc.ucar.edu 1: Abort with message Specified netCDF file does not exist. in file /glade/derecho/scratch/csgteam/temp/spack/derecho/24.12/builds/spack-stage-parallelio-2.6.6-5oefk2r5g2vtjc6igyiallo5pe5q6zep/spack-src/src/clib/pioc_support.c at line 2857
dec0408.hsn.de.hpc.ucar.edu 8: Abort with message Specified netCDF file does not exist. in file /glade/derecho/scratch/csgteam/temp/spack/derecho/24.12/builds/spack-stage-parallelio-2.6.6-5oefk2r5g2vtjc6igyiallo5pe5q6zep/spack-src/src/clib/pioc_support.c at line 2857
dec0408.hsn.de.hpc.ucar.edu 8: Obtained 10 stack frames.
dec0408.hsn.de.hpc.ucar.edu 8: /glade/u/apps/derecho/24.12/spack/opt/spack/parallelio/2.6.6/cray-mpich/8.1.29/oneapi/2024.2.1/5oef/lib/libpioc.so(piodie+0x54) [0x15461e212334]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/u/apps/derecho/24.12/spack/opt/spack/parallelio/2.6.6/cray-mpich/8.1.29/oneapi/2024.2.1/5oef/lib/libpioc.so(check_netcdf2+0xc4) [0x15461e2127a4]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/u/apps/derecho/24.12/spack/opt/spack/parallelio/2.6.6/cray-mpich/8.1.29/oneapi/2024.2.1/5oef/lib/libpioc.so(PIOc_openfile_retry+0x665) [0x15461e2161f5]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/u/apps/derecho/24.12/spack/opt/spack/parallelio/2.6.6/cray-mpich/8.1.29/oneapi/2024.2.1/5oef/lib/libpioc.so(PIOc_openfile+0x13) [0x15461e211633]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/u/apps/derecho/24.12/spack/opt/spack/parallelio/2.6.6/cray-mpich/8.1.29/oneapi/2024.2.1/5oef/lib/libpiof.so(piolib_mod_mp_pio_openfile_+0x18c) [0x15461e61851c]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/derecho/scratch/erik/tests_Actsm54CMIP719ctsm5385acl/RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput.GC.Actsm54CMIP719ctsm5385acl_int/bld/cesm.exe() [0x5c1227]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/derecho/scratch/erik/tests_Actsm54CMIP719ctsm5385acl/RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput.GC.Actsm54CMIP719ctsm5385acl_int/bld/cesm.exe() [0x5b5f64]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/derecho/scratch/erik/tests_Actsm54CMIP719ctsm5385acl/RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput.GC.Actsm54CMIP719ctsm5385acl_int/bld/cesm.exe() [0x5a67b4]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/u/apps/derecho/24.12/spack/opt/spack/esmf/8.8.1/cray-mpich/8.1.29/oneapi/2024.2.1/oigq/lib/libesmf.so(_ZN5ESMCI6FTable12callVFuncPtrEPKcPNS_2VMEPi+0x4e0) [0x15461bd04020]
dec0408.hsn.de.hpc.ucar.edu 8: /glade/u/apps/derecho/24.12/spack/opt/spack/esmf/8.8.1/cray-mpich/8.1.29/oneapi/2024.2.1/oigq/lib/libesmf.so(ESMCI_FTableCallEntryPointVMHop+0x1ec) [0x15461bd03a6c]
dec0408.hsn.de.hpc.ucar.edu 8: MPICH ERROR [Rank 8] [job id b992eef1-e0f8-463f-9e8a-12b58c829eef] [Tue Nov 25 10:19:31 2025] [dec0408] - Abort(-1) (rank 8 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, -1) - process 8
lnd.log:
Obtaining land mask and fraction from mask file /glade/campaign/cesm/cesmdata/inputdata/share/meshes/gx3v7_120309_ESMFmesh.nc
Attempting to read global dimensions from surface dataset
(GETFIL): attempting to find local file
surfdata_10x15_hist_1850_78pfts_c251022.all_crops_everywhere.nc
(GETFIL): using /glade/derecho/scratch/erik/tests_Actsm54CMIP719ctsm5385acl/RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput.GC.Actsm54CMIP719ctsm5385acl_int.gddgen/surfdata_10x15_hist_1850_78pfts_c251022.all_crops_everywhere.nc
The full path at the end is longer than 255 characters (258), so the file isn't found. The code that prints the path must allow longer lengths, but at some point after that the path must get truncated.
Since this was a downgrade from failing at preview_namelist time to failing at runtime, we should discuss resolving #3590, either by changing the namelist_definition file to agree with the lengths in the Fortran source or by updating the Fortran to the right lengths. I don't think either option should take very long, and they could come to b4b-dev to make it easier to do.
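To illustrate the suspected failure mode, here is a small Python sketch that mimics what happens when a string is assigned to a fixed-length Fortran character variable: the value is silently truncated if it is too long and blank-padded if it is too short. That a len=256 variable is where the truncation happens in CTSM is an assumption, not something confirmed by the logs, and `assign_fixed_len` and the path are hypothetical.

```python
def assign_fixed_len(value: str, length: int) -> str:
    """Mimic assignment to a Fortran character(len=length) variable.

    Fortran truncates on the right when the value is too long and pads
    with blanks when it is too short; neither case raises an error.
    """
    return value[:length].ljust(length)


# Synthetic path that is longer than 256 characters (not a real file).
long_path = "/scratch/" + "x" * 250 + ".nc"

# Store it in a 256-character variable, then strip trailing blanks
# the way Fortran's trim() would before opening the file.
stored = assign_fixed_len(long_path, 256).rstrip()

print(len(long_path), len(stored))
print("suffix survived:", stored.endswith(".nc"))
```

Because the truncated string is still a plausible-looking path, the open call simply reports that the file does not exist, which would be consistent with the PIO abort message in cesm.log.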