xmitgcm

open_mdsdataset dimension error

Open ruth-moorman opened this issue 3 years ago • 8 comments

Hello! I'm having an issue loading 2D fields from an LLC270 run. All 3D variables are loading as expected but the 2D fields are giving the error: ValueError: dimensions ('time', 'j', 'i') must have the same length as the number of data dimensions, ndim=2

The .meta files for these 2D fields look equivalent to those of other runs from which I have had no trouble loading variables, e.g.,

 dimList = [
  1080,    1, 1080,
   310,    1,  310
 ];
 dataprec = [ 'float32' ];
 nrecords = [         53 ];
 timeStepNumber = [       8640 ];
 timeInterval = [  7.905600000000E+06  1.036800000000E+07 ];
 missingValue = [ -9.99000000000000E+02 ];
 nFlds = [   53 ];
 fldList = {
 'ETAN    ' 'SIarea  ' 'SIheff  ' 'SIhsnow ' 'SItices ' 'SIhsalt ' 'SIuice  ' 'SIvice  ' 'SHIfwFlx' 'SHIhtFlx' 'SHI_TauX' 'SHI_TauY' 'DETADT2 ' 'PHIBOT  ' 'sIceLoad' 'MXLDEPTH' 'oceSPDep' 'SIatmQnt' 'SIatmFW ' 'oceQnet '
 'oceFWflx' 'oceTAUX ' 'oceTAUY ' 'oceSflux' 'TFLUX   ' 'SFLUX   ' 'EXFtaux ' 'EXFtauy ' 'EXFlwnet' 'EXFswnet' 'EXFswdn ' 'EXFlwdn ' 'EXFqnet ' 'EXFhs   ' 'EXFhl   ' 'EXFevap ' 'EXFpreci' 'EXFatemp' 'SIqnet  ' 'SIqsw   '
 'SIatmQnt' 'SItflux ' 'SIaaflux' 'SIhl    ' 'SIqneto ' 'SIqneti ' 'SIempmr ' 'SIatmFW ' 'SIsnPrcp' 'SIactLHF' 'SIacSubl' 'botTauX ' 'botTauY '
 };
state_2d_set1.0000008640.meta

so I am at a bit of a loss as to the issue. I've checked with the person who generated the data, and there should be only one timestamp (monthly) per .data file. Could someone help me understand where the dimensions=('time','j','i') information is sourced from and whether there is a workaround that can prevent this clash?
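For anyone who wants to check what a .meta file implies about the on-disk shape, here is a minimal sketch that reads `dimList` and `nrecords` from the text quoted above. It assumes the usual MITgcm convention that `dimList` holds one (global size, first index, last index) triplet per dimension, listed x first; the helper names `parse_dim_list` and `parse_nrecords` are made up for this example and are not part of xmitgcm.

```python
import re

# Shortened copy of the .meta text quoted above (fldList omitted for brevity).
META_TEXT = """
 dimList = [
  1080,    1, 1080,
   310,    1,  310
 ];
 dataprec = [ 'float32' ];
 nrecords = [         53 ];
"""


def parse_dim_list(meta_text):
    """Return the per-record shape implied by dimList, slowest axis first."""
    match = re.search(r"dimList\s*=\s*\[(.*?)\]", meta_text, re.DOTALL)
    if match is None:
        raise ValueError("no dimList entry found")
    numbers = [int(tok) for tok in re.findall(r"\d+", match.group(1))]
    # Assumption: each dimension contributes a
    # (global size, first index, last index) triplet.
    sizes = numbers[0::3]
    # Assumption: dimList is written x first, so reverse it to get (y, x).
    return tuple(reversed(sizes))


def parse_nrecords(meta_text):
    """Return the number of records stored in the matching .data file."""
    match = re.search(r"nrecords\s*=\s*\[\s*(\d+)\s*\]", meta_text)
    if match is None:
        raise ValueError("no nrecords entry found")
    return int(match.group(1))


record_shape = parse_dim_list(META_TEXT)
n_records = parse_nrecords(META_TEXT)
print(record_shape, n_records)
```

Under those assumptions, each of the 53 records is a two-dimensional (y, x) array, so any time axis has to be added by the reader rather than coming from the file itself.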

ruth-moorman avatar Nov 28 '22 19:11 ruth-moorman

Can you share the code you are using to open the data?

rabernat avatar Nov 28 '22 19:11 rabernat

Sure, it's come up with a few iterations on the basic open_mdsdataset call, including just the basic state_2d = open_mdsdataset(rootdir+'state_2d_set1/') and a version with time info, for example state_2d = open_mdsdataset(rootdir+'state_2d_set1/', delta_t=1200, ref_date='1991-12-15 0:0:0')

ruth-moorman avatar Nov 28 '22 20:11 ruth-moorman

Hi all, I'm having the same issue. If someone has the solution, I'd appreciate hearing it! :)

lily-dove avatar Feb 24 '23 20:02 lily-dove

Hi @ruth-moorman, does it work if you add the arguments geometry="llc", nx=270?

timothyas avatar Mar 01 '23 03:03 timothyas

Hiya @timothyas, sorry for the weird delay here. I ended up not working with that output, but I am now having the same issue with different output from an LLC540 configuration. Again, the issue occurs only with 2D variables. In this case I know I should be using geometry='curvilinear', and I am (and, again, it works for 3D variables).

So for example I'm calling:

ds = xmitgcm.open_mdsdataset('../llc540_notides_cycle2/results/diags/', grid_dir = '../llc540_notides_cycle2/results/',prefix = ['state_2d_set1'], geometry='curvilinear',delta_t=480, ref_date = '1993-1-1 0:0:0',iters=iterations[0])

and getting

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 2
      1 # however in this notebook I'm mostly concerned with understanding the bathymetry, so I'll just infile and name an iter0 dataset (kind of a dummy,
----> 2 ds = xmitgcm.open_mdsdataset(llc540_dir, grid_dir = llc540_dir_grid,prefix = ['state_2d_set1'], geometry=geometry,delta_t=delta_t, ref_date = ref_date,iters=iterations[0])
      3 # ds = add_latlon(ds)
      4 # grid = xgcm.Grid(ds, periodic='X')
      5 # ds

File ~/miniforge3/envs/pangeo/lib/python3.11/site-packages/xmitgcm/mds_store.py:273, in open_mdsdataset(data_dir, grid_dir, iters, prefix, read_grid, delta_t, ref_date, calendar, levels, geometry, grid_vars_to_coords, swap_dims, endian, chunks, ignore_unknown_vars, default_dtype, nx, ny, nz, llc_method, extra_metadata, extra_variables)
    270                 ds = _set_coords(ds)
    271             return ds
--> 273 store = _MDSDataStore(data_dir, grid_dir, iternum, delta_t, read_grid,
    274                       prefix, ref_date, calendar,
    275                       geometry, endian,
    276                       ignore_unknown_vars=ignore_unknown_vars,
    277                       default_dtype=default_dtype,
    278                       nx=nx, ny=ny, nz=nz, llc_method=llc_method,
    279                       levels=levels, extra_metadata=extra_metadata,
    280                      extra_variables=extra_variables)
    282 ds = xr.Dataset.load_store(store)
    283 if swap_dims:

File ~/miniforge3/envs/pangeo/lib/python3.11/site-packages/xmitgcm/mds_store.py:596, in _MDSDataStore.__init__(self, data_dir, grid_dir, iternum, delta_t, read_grid, file_prefixes, ref_date, calendar, geometry, endian, ignore_unknown_vars, default_dtype, nx, ny, nz, llc_method, levels, extra_metadata, extra_variables)
    593 # Create masks from hFac variables
    594 data = self.calc_masks(vname, data)
--> 596 thisvar = xr.Variable(dims, data, attrs)
    597 self._variables[vname] = thisvar

File ~/miniforge3/envs/pangeo/lib/python3.11/site-packages/xarray/core/variable.py:367, in Variable.__init__(self, dims, data, attrs, encoding, fastpath)
    347 """
    348 Parameters
    349 ----------
   (...)
    364     unrecognized encoding items.
    365 """
    366 self._data = as_compatible_data(data, fastpath=fastpath)
--> 367 self._dims = self._parse_dimensions(dims)
    368 self._attrs = None
    369 self._encoding = None

File ~/miniforge3/envs/pangeo/lib/python3.11/site-packages/xarray/core/variable.py:683, in Variable._parse_dimensions(self, dims)
    681     dims = tuple(dims)
    682 if len(dims) != self.ndim:
--> 683     raise ValueError(
    684         f"dimensions {dims} must have the same length as the "
    685         f"number of data dimensions, ndim={self.ndim}"
    686     )
    687 return dims

ValueError: dimensions ('time', 'j', 'i') must have the same length as the number of data dimensions, ndim=2
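To make the final frame of the traceback concrete, here is a small stand-in for the length check it shows. This is not xarray's code, only a pure-Python imitation of the quoted lines, and the shapes are arbitrary placeholders. It illustrates that the error means three dimension names were paired with an array that has only two axes.

```python
def parse_dimensions(dims, data_shape):
    """Imitate the length check quoted in the traceback (not xarray itself)."""
    dims = tuple(dims)
    ndim = len(data_shape)
    if len(dims) != ndim:
        raise ValueError(
            f"dimensions {dims} must have the same length as the "
            f"number of data dimensions, ndim={ndim}"
        )
    return dims


dims = ("time", "j", "i")

# Placeholder shape with a leading time axis: three names, three axes, so it passes.
accepted = parse_dimensions(dims, (1, 4, 5))

# Placeholder shape without a time axis: three names, two axes, so it fails.
try:
    parse_dimensions(dims, (4, 5))
except ValueError as err:
    message = str(err)
    print(message)
```

So the names handed to xr.Variable already include 'time', while the array that reaches that call has no time axis.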

ruth-moorman avatar Sep 12 '23 23:09 ruth-moorman

Hi @ruth-moorman, does it work to either not specify iters, or specify iters=[iterations[0]]? The type specification for the iters argument is a list, so this could be it. That's just a guess though...
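A quick way to rule this out is to normalise the value before passing it in. The sketch below uses a hypothetical helper, `as_iter_list`, which is not part of xmitgcm (the library may already handle scalars internally); it simply wraps a single iteration number in a list and leaves 'all', None, and sequences alone.

```python
import operator


def as_iter_list(iters):
    """Wrap one iteration number in a list; pass 'all', None and sequences through."""
    if iters is None or isinstance(iters, str):
        return iters
    try:
        # operator.index accepts Python ints and NumPy integer scalars alike.
        return [operator.index(iters)]
    except TypeError:
        # Not a single integer, so treat it as a sequence of iteration numbers.
        return list(iters)


print(as_iter_list(8640))
print(as_iter_list([8640, 17280]))
print(as_iter_list("all"))
```

The result could then be passed as iters=as_iter_list(iterations[0]) in the open_mdsdataset call.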

timothyas avatar Sep 21 '23 15:09 timothyas

@timothyas thanks for the suggestion, but it doesn't look like it's the iters. iters=iterations[0], iters='all', no iters input, and iters=[iterations[0]] all give the same error for the 2D fields. Just stressing, in case it helps, that I do not get this error with 3D fields for any of those listed values of iters.

i.e. this: xmitgcm.open_mdsdataset(llc540_dir, grid_dir = llc540_dir_grid,prefix = ['layers_3d_set2','fluxes_3d_set1','trsp_3d_set1','state_3d_set1'], geometry=geometry, delta_t=delta_t, ref_date = ref_date,iters=iterations[0]) works totally fine

ruth-moorman avatar Oct 03 '23 18:10 ruth-moorman

Hi @ruth-moorman, too bad that wasn't the issue. I'm not really sure what's going on. I cannot reproduce the error using the curvilinear_leman dataset in xmitgcm's test suite. If there's any way you can make the data public, I'd be happy to help you out further. I'm also curious how/why you are using a curvilinear geometry with the llc540 geometry - is the entire model domain on just one of the llc faces?

timothyas avatar Oct 11 '23 15:10 timothyas