noel
I don't think the case is running out of memory. I suspect it fails with GNU simply because this configuration has not been tried with that compiler. I tried `SMS_D_P6400.ne120pg2_r0125_oRRS18to6v3.WCYCL1950` with GNU and it...
I wanted to try again on pm-cpu. However, this time I get a different error. Note that I've been able to run other WC cases at this resolution with diff...
OK, the error above may have been spurious (I submitted again). I can actually run with even fewer nodes. `SMS_P2048.ne120pg2_r0125_oRRS18to6v3.WCYCL1950.pm-cpu_gnu` will run on 32 nodes in OPT; however, the DEBUG attempt...
Using master from June 16th, DEBUG with GNU:
```
/lcrc/group/e3sm/ac.ndkeen/scratch/chrys/m28-jun16/SMS_D_P4096.ne120pg2_r0125_oRRS18to6v3.WCYCL1950.chrysalis_gnu.20220616_180843_clx6ze
```
And DEBUG with Intel:
```
/lcrc/group/e3sm/ac.ndkeen/scratch/chrys/m28-jun16/SMS_D_P4096.ne120pg2_r0125_oRRS18to6v3.WCYCL1950.chrysalis_intel.20220616_180852_d91yzm
```
Reproduced with a newer repo as well (Sep 23rd) with GNU, but the Intel...
I see the same issue with the May 18th master on chrysalis: `SMS_D.f09_g16.I1850ELMCN.chrysalis_gnu.elm-bgcinterface`
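For anyone following along, here is a rough sketch of how I read the test names quoted in this thread. It assumes the usual dot-separated layout (test type plus modifiers, grid, compset, then optionally `machine_compiler` and a trailing field), that `_D` marks a DEBUG build, and that `_P<N>` requests N tasks. The function name `parse_test_name` and the returned keys are made up for illustration, and other `_P` forms are not handled.

```python
import re


def parse_test_name(name):
    """Split a dot-separated test name into labeled parts (illustrative only)."""
    parts = name.split(".")
    if len(parts) < 3:
        raise ValueError(f"expected at least type.grid.compset, got {name!r}")

    # First field: test type followed by underscore-separated modifiers.
    test_type, *modifiers = parts[0].split("_")

    # Assumption: a modifier of the form P<digits> is a task count.
    pes = None
    for mod in modifiers:
        match = re.fullmatch(r"P(\d+)", mod)
        if match:
            pes = int(match.group(1))

    # Optional fourth field: machine and compiler joined by the last underscore.
    machine = compiler = None
    if len(parts) > 3:
        machine, sep, compiler = parts[3].rpartition("_")
        if not sep:  # no underscore present: treat the whole field as the machine
            machine, compiler = parts[3], None

    return {
        "test_type": test_type,
        "debug": "D" in modifiers,  # assumption: _D means a DEBUG build
        "pes": pes,
        "grid": parts[1],
        "compset": parts[2],
        "machine": machine,
        "compiler": compiler,
        # Anything after machine_compiler (e.g. a testmods name or a test id).
        "suffix": ".".join(parts[4:]) or None,
    }


info = parse_test_name("SMS_D.f09_g16.I1850ELMCN.chrysalis_gnu.elm-bgcinterface")
print(info["debug"], info["compiler"], info["suffix"])
```

Read this way, the failing cases reported so far in this thread share the `_D` modifier and the `gnu` compiler, while the OPT run on pm-cpu does not have `_D`.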
Adding debug prints before the file is opened, I see that the file is there and has the following type:
```
gcp-e3sm10-login0% ncdump -k /home/inputdata/atm/cam/ggas/GHG_CMIP-1-2-0_Annual_Global_0000-2014_c20180105.nc
netCDF-4 classic model
```
Is it...
In a previous issue https://github.com/E3SM-Project/E3SM/issues/4570, we did find input files that had `netCDF-4` format. The issue was corrected after those files were converted to `netcdf3`. In this case, the ncdump...
Noting this is still an issue; I'm not sure what the process is for getting files converted to a different format.
```
cori06% ncdump -k /global/cfs/cdirs/e3sm/inputdata/atm/cam/ggas/GHG_CMIP-1-2-0_Annual_Global_0000-2014_c20180105.nc
netCDF-4 classic model
```
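If it helps with scanning the inputdata tree for other files like this one, below is a minimal sketch that classifies a file by its leading magic bytes, without needing `ncdump`. The helper names `classify_magic` and `netcdf_kind` are hypothetical. It only looks at offset 0, and it cannot tell `netCDF-4` from `netCDF-4 classic model`, since both are stored as HDF5; `ncdump -k` is still needed for that distinction.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

# Fourth byte of the "CDF" header identifies the classic-family variant.
CDF_VARIANTS = {
    1: "classic (netCDF-3)",
    2: "64-bit offset (netCDF-3)",
    5: "CDF-5 (64-bit data)",
}


def classify_magic(magic: bytes) -> str:
    """Classify a file from its first bytes (at least 8 are expected)."""
    if magic.startswith(HDF5_SIGNATURE):
        return "HDF5-based (netCDF-4 or netCDF-4 classic model)"
    if len(magic) >= 4 and magic[:3] == b"CDF":
        return CDF_VARIANTS.get(magic[3], "unknown CDF variant")
    return "unknown"


def netcdf_kind(path: str) -> str:
    """Read the first 8 bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return classify_magic(f.read(8))


# Example with in-memory headers rather than a real file:
print(classify_magic(b"CDF\x01\x00\x00\x00\x00"))
print(classify_magic(HDF5_SIGNATURE))
```

Any file reported as HDF5-based would be a candidate for conversion. The conversion itself is not shown here; a tool such as `nccopy -k classic` is one way I would expect it to be done.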
If I try `SMS_D.f09_g16.I1850ELMCN` on chrysalis, I do get the FP issue as suggested. Looking closer into this one, it looks like the array holding index values is what might be...
I think in general, we benefit from having floating-point traps. If I understand correctly, the FP trapping did catch this error? I would just like to see example...