
EAMxx: conditional sampling + horiz_remap outputs contain many big values

Open jsbamboo opened this issue 2 months ago • 5 comments

Some conditional sampling outputs from an EAMxx RRM run contain nothing but very large values (it seems the missing values were not masked in the online calculation). Strangely, for DPxx the "big value" issue only appears in the time-averaged outputs, but for WPRRMxx with horiz_remapper it also appears in the instant outputs. So horiz_avg and horiz_remapper seem to behave differently in this case?

This issue has no major impact on DPxx, since its instant outputs are fine, but it makes the diagnostics infeasible for WPRRMxx: the instant outputs are also affected, and the proportion of big values is too high (from >50% up to 100% in my tests), which leaves very few usable data points after manually filtering them with a threshold.
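
For reference, the manual threshold filtering mentioned above can be sketched roughly as follows. This is only an illustration: the cutoff `THRESHOLD`, the helper names `mask_big_values` and `usable_fraction`, and the assumed fill value of about 3.4028235e+33 (inferred from the dumps below, not confirmed from the EAMxx source) are all assumptions, and the sample values are copied from the `omega_where_omega_gt_-1` dump.

```python
# Assumption: the fill value is about 3.4028235e+33, as the dumps suggest.
FILL_VALUE = 3.4028235e33

# Hypothetical cutoff: far above any physical omega, far below the
# contaminated values seen in the dumps (1e28 and larger).
THRESHOLD = 1.0e20


def mask_big_values(values, threshold=THRESHOLD):
    """Replace entries whose magnitude exceeds the threshold with None."""
    return [None if (v is None or abs(v) > threshold) else v for v in values]


def usable_fraction(values, threshold=THRESHOLD):
    """Fraction of entries that survive the threshold filter."""
    masked = mask_big_values(values, threshold)
    if not masked:
        return 0.0
    return sum(v is not None for v in masked) / len(masked)


# Four consecutive values taken from the omega_where_omega_gt_-1 dump.
sample = [-0.009116182, -0.01134611, 1.469836e+28, 3.014397e+28]
print(mask_big_values(sample))
print(usable_fraction(sample))
```

A filter like this only hides the contaminated points; it cannot recover the correct averages, which is why so few usable data points remain.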

mach: LC dane

============ examples:

  1. wprrmxx's yaml: https://portal.nersc.gov/cfs/e3sm/zhang73/yaml/eamxx_wprrm.1hI_1x1_160E170E2.5N12.5N_Betts.yaml
    • horiz_remap_file: /p/lustre2/zhang73/grids2/WP10ne32x32v1/map_WP10ne32x32v1pg2_to_1x1_2.5N12.5N160E170E.nco.20250723.nc
      • this remapper works as a global -> regional horizontal average
    • vars to check:
      • <1hI> omega_where_omega_le_-1: all big values
      • <1hI> omega_where_omega_gt_-1: mostly big values
      • <1hI> qv_zvert_derivative_where_omega_le_-1: all big values (not shown)
      • <1hI> RelativeHumidity_where_omega_le_-1: all big values (not shown)
      • <1hI> RelativeHumidity_where_omega_gt_-1: mostly big values (not shown)
ncdump -v wprrmxx results
[zhang73@dane4:tests]$ ncdump -v omega_where_omega_le_-1  /p/lustre1/zhang73/E3SM_simulations/wprrmxx_p3/WPRRMxx_250901_ctp_qmr.WP10ne32x32v1pg2_WP10ne32x32v1pg2.F2010-SCREAMv1.dane/tests/2240x1_nmonthsx1_E3SMv1SSP585-UVTQ2d-s20151001-O40-5minAsite-rad3/run/eamxx_wprrm.1hI.1x1_160E170E2.5N12.5N_Betts.h.INSTANT.nhours_x1.2015-10-01-00000.nc |less
 omega_where_omega_le_-1 =
  _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
...
    _, _, _, _, _, _, _, _, _, 3.402809e+33, 3.402794e+33, 3.40278e+33, 
    3.402769e+33, 3.402692e+33, 3.402607e+33, 3.402344e+33, 3.402129e+33, 
    3.402096e+33, 3.402162e+33, 3.401976e+33, 3.401382e+33, 3.399882e+33, 
    3.397441e+33, 3.395065e+33, 3.391964e+33, 3.389787e+33, 3.386288e+33, 
...
    3.398103e+33, 3.398337e+33, 3.398812e+33, 3.399447e+33, 3.400093e+33, 
    3.401346e+33, 3.401912e+33, 3.402603e+33, 3.402735e+33, _, 3.402757e+33, 
    3.402757e+33, 3.402757e+33, _, _, _,
  _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 

(all_stable) [zhang73@dane4:tests]$ ncdump -v omega_where_omega_gt_-1  /p/lustre1/zhang73/E3SM_simulations/wprrmxx_p3/WPRRMxx_250901_ctp_qmr.WP10ne32x32v1pg2_WP10ne32x32v1pg2.F2010-SCREAMv1.dane/tests/2240x1_nmonthsx1_E3SMv1SSP585-UVTQ2d-s20151001-O40-5minAsite-rad3/run/eamxx_wprrm.1hI.1x1_160E170E2.5N12.5N_Betts.h.INSTANT.nhours_x1.2015-10-01-00000.nc |less
 omega_where_omega_gt_-1 =
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0,
  -1.999197e-05, 7.102519e-05, 0.0002220787, 0.0001150266, -0.0002823437, 
    -0.0005967336, -0.0006404763, -0.0006527645, -0.0007655454, 
    -0.0009198438, -0.001071147, -0.001246906, -0.001495592, -0.001833205, 
    -0.002134913, -0.002229567, -0.002058582, -0.001682035, -0.001190887, 
    -0.0006426178, -5.694841e-05, 0.0005931052, 0.001290996, 0.001813285, 
    0.001981848, 0.001779855, 0.001140188, 2.528251e-05, -0.001493386, 
    -0.003256344, -0.005135644, -0.007078507, -0.009116182, -0.01134611, 
    1.469836e+28, 3.014397e+28, 4.409507e+28, 5.430967e+28, 1.319556e+29, 
    2.163253e+29, 4.801176e+29, 6.941358e+29, 7.277588e+29, 6.618164e+29, 
    8.47946e+29, 1.441915e+30, 2.941296e+30, 5.382669e+30, 7.758095e+30, 
...
  2. dpxx's yaml: https://portal.nersc.gov/cfs/e3sm/zhang73/yaml/dpxx.1hA_Davg_Betts.yaml https://portal.nersc.gov/cfs/e3sm/zhang73/yaml/dpxx.5minI_Davg_Betts.yaml
    • vars to check
      • <1hA> omega_where_omega_le_-1_horiz_avg: partial big value issue
      • <5minI> omega_where_omega_le_-1_horiz_avg: no big value issue
ncdump -v dpxx results
[zhang73@dane4:run]$ ncdump -v omega_where_omega_le_-1_horiz_avg  /p/lustre1/zhang73/E3SM_simulations/wprrmxx_p3/dpxx-RCE.WPRRMxx_250901_ctp_qmr.dane/tests/custom-12_ndaysx100_D500km_lat0-sst300-O33Betts/run/dpxx.1hA.Davg_Betts.h.AVERAGE.nhours_x1.2000-01-01-00000.nc |less
...
  _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, 2.722259e+33, 2.165433e+33, 
    1.701412e+33, 9.722353e+32, 6.805647e+32, 2.001661e+32, -6.306013, 
    -6.398267, -6.173028, -5.587985, -5.182454, -4.724482, -4.456267, 
    -4.207887, -3.975696, -3.806512, -3.634893, -3.459618, -3.313383, 
    -3.16818, -2.998721, -2.874234, -2.727672, -2.591891, -2.484372, 
    -2.377534, -2.306149, -2.249083, -2.206445, -2.145779, -2.092333, 
    -2.015078, -1.997679, -1.980281, -1.804732, 2.001661e+32, 6.805647e+32, 
    1.497242e+33, 2.722259e+33, _, _, _, _,

[zhang73@dane4:run]$ ncdump -v omega_where_omega_le_-1_horiz_avg  /p/lustre1/zhang73/E3SM_simulations/wprrmxx_p3/dpxx-RCE.WPRRMxx_250901_ctp_qmr.dane/tests/custom-12_ndaysx100_D500km_lat0-sst300-O33Betts/run/dpxx.5minI.Davg_Betts.h.INSTANT.nmins_x5.2000-01-01-00000.nc |less
...
  _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, -1.029885, 
    -1.183011, -1.212783, -1.072223, _, _, _, _, _, -1.012082, -1.554305, 
    -2.61198, -3.101385, -3.104887, -3.375674, -3.497221, -3.148726, 
    -3.292871, -3.438457, -3.648735, -3.780627, -3.801865, -3.776839, 
    -3.674299, -3.577605, -3.454128, -3.411064, -3.563905, -3.481781, 
    -3.504468, -3.429755, -3.499566, -3.625915, -3.713715, -3.915647, 
    -3.659662, -2.84359, -2.500453, -2.558967, -2.39695, -2.578067, 
    -2.738998, -2.885796, -3.015455, -3.128233, -3.263574, -3.323847, 
    -3.27986, -3.178348, -3.143641, -3.043104, -2.870499, -2.725295, 
...
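
One possible reading of the 1hA dump above, sketched under assumptions: if fill values enter an average without being masked, the result is a fraction of the fill value instead of a physical number. Assuming a fill value of about 3.4028235e+33 (inferred from the dumps, not confirmed from the EAMxx source), the dumped 2.722259e+33 is close to 0.8 times that value and 1.701412e+33 is close to 0.5 times it. The function names and the five-sample window below are hypothetical and are not EAMxx code.

```python
# Assumption: fill value inferred from the dumps, not taken from EAMxx source.
FILL_VALUE = 3.4028235e33


def naive_mean(samples):
    """Average every sample, treating fill values as ordinary numbers."""
    return sum(samples) / len(samples)


def masked_mean(samples, fill=FILL_VALUE):
    """Average only the valid samples; return fill if none are valid."""
    valid = [s for s in samples if s != fill]
    return sum(valid) / len(valid) if valid else fill


# Hypothetical averaging window: four masked samples and one valid omega.
samples = [FILL_VALUE] * 4 + [-6.3]

print(naive_mean(samples) / FILL_VALUE)  # roughly 0.8 of the fill value
print(masked_mean(samples))              # the single valid sample
```

The same effect would apply to a weighted horizontal remap: if most source columns carry the fill value, the remapped value ends up slightly below the fill value, which resembles the 3.39e+33 to 3.40e+33 range in the WPRRMxx dump.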

jsbamboo avatar Oct 28 '25 23:10 jsbamboo

A fix is already brewing...

bartgol avatar Oct 28 '25 23:10 bartgol

oh glad to hear that!!! so should I keep this issue open or close it?

jsbamboo avatar Oct 28 '25 23:10 jsbamboo

Keep it open. Once @mahf708 feels like putting the PR fix in, he'll link the PR to the issue, so it autocloses when merged.

bartgol avatar Oct 28 '25 23:10 bartgol

thanks for the guidance!

jsbamboo avatar Oct 28 '25 23:10 jsbamboo

as Luca reported, this is a pretty serious and sprawling bug that can affect all sorts of quantities that have fill-val treatment (cosp, subsampling, aodvis, fieldat, etc.). This will be fixed fairly soon

mahf708 avatar Oct 28 '25 23:10 mahf708