cf-xarray

Statistics over dimension does not remove bounds variable

Open BorjaEst opened this issue 2 years ago • 2 comments

Context

Given the following example dataset:

$ ncdump -h toz.CF-1.8.nc 
netcdf toz.CF-1.8 {
dimensions:
        time = 12 ;
        bounds2 = 2 ;
        lat = 18 ;
        lon = 36 ;
variables:
        double time_bnds(time, bounds2) ;
        double time(time) ;
                time:standard_name = "time" ;
                time:units = "days since 2000-1-1" ;
                time:calendar = "gregorian" ;
                time:bounds = "time_bnds" ;
        double lat_bnds(lat, bounds2) ;
        double lat(lat) ;
                lat:standard_name = "latitude" ;
                lat:units = "degrees_north" ;
                lat:bounds = "lat_bnds" ;
        double lon_bnds(lon, bounds2) ;
        double lon(lon) ;
                lon:standard_name = "longitude" ;
                lon:units = "degrees_east" ;
                lon:bounds = "lon_bnds" ;
        double toz(time, lat, lon) ;
                toz:standard_name = "atmosphere_mole_content_of_ozone" ;
                toz:long_name = "Total Ozone Column" ;
                toz:units = "mol m-2" ;
                toz:cell_methods = "area: mean time: maximum (interval: 1 day)" ;

// global attributes:
                :Conventions = "CF-1.8" ;
                :title = "Ozone data mockup" ;
                :comment = "Ozone mockup data generated for testing purposes" ;
                :institution = "o3skim code" ;
                :history = "File created on 2022-02-17 10:14:04.458982" ;
                :source = "Random generation" ;
}

The file is valid according to the CF checker:

$ cfchecks -v 1.8 toz.CF-1.8.nc 
CHECKING NetCDF FILE: toz.CF-1.8.nc
=====================
Using CF Checker Version 4.1.0
Checking against CF Version CF-1.8
Using Standard Name Table Version 78 (2021-09-21T11:55:06Z)
Using Area Type Table Version 10 (23 June 2020)
Using Standardized Region Name Table Version 4 (18 December 2018)


------------------
Checking variable: time_bnds
------------------

------------------
Checking variable: time
------------------

------------------
Checking variable: lat_bnds
------------------

------------------
Checking variable: lat
------------------

------------------
Checking variable: lon_bnds
------------------

------------------
Checking variable: lon
------------------

------------------
Checking variable: toz
------------------

ERRORS detected: 0
WARNINGS given: 0
INFORMATION messages: 0

Issue

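Applying a reduction over a dimension, for example mean, does not remove the bounds variable of the reduced coordinate:

>>> import xarray as xr
>>> import cf_xarray  # registers the .cf accessor
>>> ds = xr.load_dataset("toz.CF-1.8.nc")
>>> ds.cf.mean("latitude").to_netcdf("toz-reduced.CF-1.8.nc")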

The variable lat_bnds is still present in the output, even though the lat dimension and coordinate are gone:

$ ncdump -h toz-reduced.CF-1.8.nc
netcdf toz-reduced.CF-1.8 {
dimensions:
        time = 12 ;
        bounds2 = 2 ;
        lon = 36 ;
variables:
        int64 time_bnds(time, bounds2) ;
        double time(time) ;
                time:_FillValue = NaN ;
                time:standard_name = "time" ;
                time:bounds = "time_bnds" ;
                time:units = "days since 2000-01-01" ;
                time:calendar = "gregorian" ;
        double lat_bnds(bounds2) ;
                lat_bnds:_FillValue = NaN ;
        double lon_bnds(lon, bounds2) ;
                lon_bnds:_FillValue = NaN ;
        double lon(lon) ;
                lon:_FillValue = NaN ;
                lon:standard_name = "longitude" ;
                lon:units = "degrees_east" ;
                lon:bounds = "lon_bnds" ;
        double toz(time, lon) ;
                toz:_FillValue = NaN ;
                toz:standard_name = "atmosphere_mole_content_of_ozone" ;
                toz:long_name = "Total Ozone Column" ;
                toz:units = "mol m-2" ;
                toz:cell_methods = "area: mean time: maximum (interval: 1 day)" ;

// global attributes:
                :Conventions = "CF-1.8" ;
                :title = "Ozone data mockup" ;
                :comment = "Ozone mockup data generated for testing purposes" ;
                :institution = "o3skim code" ;
                :history = "File created on 2022-02-17 10:14:04.458982" ;
                :source = "Random generation" ;
}
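
Until bounds are handled automatically, one possible workaround is a small helper that compares the dataset before and after the reduction and drops any bounds variable whose parent coordinate has disappeared. The sketch below is not part of cf-xarray: `drop_orphaned_bounds` is a hypothetical name, the dataset is a synthetic stand-in for toz.CF-1.8.nc, and plain `Dataset.mean` is used instead of `ds.cf.mean` so that it runs with only xarray and NumPy.

```python
import numpy as np
import xarray as xr


def drop_orphaned_bounds(original, reduced):
    """Drop bounds variables whose parent coordinate was reduced away.

    Hypothetical helper, not a cf-xarray function.
    """
    orphaned = []
    for name, coord in original.coords.items():
        bounds_name = coord.attrs.get("bounds")
        if bounds_name is None:
            continue
        # The coordinate is gone but its bounds variable survived the reduction.
        if name not in reduced.variables and bounds_name in reduced.variables:
            orphaned.append(bounds_name)
    return reduced.drop_vars(orphaned)


# Synthetic stand-in for the example file (assumed 10-degree cells).
lat = np.linspace(-85.0, 85.0, 18)
lon = np.linspace(5.0, 355.0, 36)
rng = np.random.default_rng(0)
ds = xr.Dataset(
    data_vars={
        "toz": (("time", "lat", "lon"), rng.random((12, 18, 36))),
        "lat_bnds": (("lat", "bounds2"), np.stack([lat - 5, lat + 5], axis=1)),
        "lon_bnds": (("lon", "bounds2"), np.stack([lon - 5, lon + 5], axis=1)),
    },
    coords={
        "time": ("time", np.arange(12.0)),
        "lat": ("lat", lat, {"standard_name": "latitude", "bounds": "lat_bnds"}),
        "lon": ("lon", lon, {"standard_name": "longitude", "bounds": "lon_bnds"}),
    },
)

reduced = ds.mean("lat")  # lat_bnds survives with dims ("bounds2",)
cleaned = drop_orphaned_bounds(ds, reduced)
```

Here `reduced` still contains lat_bnds, while `cleaned` keeps only lon_bnds next to toz.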

Additional information

Here is the cf_xarray version information:

$ pip show cf_xarray
Name: cf-xarray
Version: 0.6.1
Summary: A lightweight convenience wrapper for using CF attributes on xarray objects. 
Home-page: https://cf-xarray.readthedocs.io
Author: None
Author-email: None
License: Apache
Location: /home/borja/miniconda3/envs/o3as/lib/python3.8/site-packages
Requires: pandas, numpy, setuptools
Required-by: 

BorjaEst, Mar 09 '22 12:03

Thanks @BorjaEst. I'm not sure this is easy to fix currently, but I'll take a look.

In the medium term I'm planning on decoding "bounds" variables to a pandas.IntervalIndex which would fix this automatically.
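
As a rough illustration of that idea (not cf-xarray's implementation), a two-column bounds array maps naturally onto a pandas.IntervalIndex. The values below are synthetic, and `closed="left"` is an assumption made for this sketch; the bounds variable itself does not say which edge is inclusive.

```python
import numpy as np
import pandas as pd

# Synthetic 10-degree latitude cells, mirroring lat and lat_bnds(lat, bounds2).
lat = np.linspace(-85.0, 85.0, 18)
lat_bnds = np.stack([lat - 5.0, lat + 5.0], axis=1)

# Each row of the bounds array becomes one interval of the index.
lat_index = pd.IntervalIndex.from_arrays(
    lat_bnds[:, 0], lat_bnds[:, 1], closed="left", name="lat"
)

# The cell centers can be recovered from the intervals.
centers = lat_index.mid
```

Because the bounds then live inside the index rather than in a separate variable, they would go away together with the coordinate when the dimension is reduced.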

dcherian, Mar 09 '22 15:03

@dcherian, you are welcome. Not sure if it helps, but I normally clean up by simply dropping the bounds variable after the operation. That clears the warnings:

>>> ds = ds.cf.mean('latitude')  # Keep the reduced dataset
>>> ds = ds.drop_vars("lat_bnds")  # Drop the now-unused bounds
>>> for var in ds.variables:  # Avoid writing _FillValue = NaN
...     ds[var].attrs['_FillValue'] = False
>>> ds.to_netcdf("toz-reduced-2.CF-1.8.nc")
$ cfchecks -v 1.8 toz-reduced-2.CF-1.8.nc 
CHECKING NetCDF FILE: toz-reduced-2.CF-1.8.nc
=====================
Using CF Checker Version 4.1.0
Checking against CF Version CF-1.8
Using Standard Name Table Version 78 (2021-09-21T11:55:06Z)
Using Area Type Table Version 10 (23 June 2020)
Using Standardized Region Name Table Version 4 (18 December 2018)

ERROR: (7.1): boundary variable with non-numeric data type

------------------
Checking variable: time_bnds
------------------

------------------
Checking variable: time
------------------

------------------
Checking variable: lon_bnds
------------------

------------------
Checking variable: lon
------------------

------------------
Checking variable: toz
------------------

ERRORS detected: 1
WARNINGS given: 0
INFORMATION messages: 0

Ignore the remaining error; it is related to #310.

BorjaEst, Mar 10 '22 06:03