argopy
Xarray backend to open Argo Netcdf files
Close #176
When opening an Argo netCDF file with xarray, most of the variables are not decoded properly and are returned as objects. For a regular core-Argo profile file, 48 out of 64 data variables are not decoded as they should be.
This PR tries to implement a new xarray backend to open Argo netCDF files where all the variables are cast correctly.
We should be able to open any of the reference Argo netcdf files:
- [ ] Core-Argo individual profile files (<R/D><FloatWmoID>_<XXX><D>.nc). The core-Argo profile files contain the core parameters provided by a float: pressure, temperature, salinity, conductivity (PRES, TEMP, PSAL, CNDC).
- [ ] B-Argo individual profile file (B<R/D><FloatWmoID>_<XXX><D>.nc). A B-Argo profile file contains all the parameters from a float, except the core-Argo parameters temperature, salinity, conductivity (TEMP, PSAL, CNDC). A float that performs only CTD measurements does not have B-Argo data files.
- [ ] BGC-Argo individual synthetic profile file (M<R/D><FloatWmoID>_<XXX><D>.nc). The synthetic file contains the core-Argo and BGC-Argo parameters listed on reference table 3. The intermediate parameters are ignored by the synthetic files.
- [ ] Argo trajectory data file (<FloatWmoID>_<R/D>traj.nc). The Argo trajectory files contain the core and BGC parameters provided by a float.
- [ ] Metadata file (<FloatWmoID>_meta.nc).
- [ ] Technical Data file (<FloatWmoID>_tech.nc).
And obviously:
- [ ] Core-Argo multiple profile files
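As a rough illustration of the naming conventions in the checklist above, here is a minimal sketch that guesses the file type from the file name alone. The function name `classify_argo_file` and the regular expressions are hypothetical, not part of argopy; they also accept the `S` prefix and the `_prof.nc` / `_Sprof.nc` / `_BRtraj.nc` suffixes that appear in the sample file names tested later in this thread.

```python
import re

# Hypothetical patterns inferred from the checklist and from sample file
# names in this thread; this is NOT how argopy detects file types.
PATTERNS = [
    ("B-Argo profile", re.compile(r"^B[RD]\d+_\d{3,4}D?\.nc$")),
    ("synthetic profile", re.compile(r"^[MS][RD]\d+_\d{3,4}D?\.nc$")),
    ("core profile", re.compile(r"^[RD]\d+_\d{3,4}D?\.nc$")),
    ("trajectory", re.compile(r"^\d+_B?[RD]traj\.nc$")),
    ("metadata", re.compile(r"^\d+_meta\.nc$")),
    ("technical", re.compile(r"^\d+_tech\.nc$")),
    ("multi-profile", re.compile(r"^\d+_S?prof\.nc$")),
]


def classify_argo_file(path):
    """Return a label for the Argo file type, or None if no pattern matches."""
    name = path.rsplit("/", 1)[-1]  # keep only the file name
    for label, pattern in PATTERNS:
        if pattern.match(name):
            return label
    return None
```

A classifier like this only looks at names; the backend itself would still need to inspect the file content to decide how to cast each variable.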
This pull request was marked as staled automatically because it has not seen any activity in 90 days
I tested the 'argo' engine on a collection of sample files covering all 7 Argo data types, and it works!
OK for Argo profile (test file: /dac/bodc/6901929/6901929_prof.nc)
> 46/64 variables not casted otherwise !
OK for Argo profile (test file: /dac/meds/4901079/profiles/D4901079_110.nc)
> 46/64 variables not casted otherwise !
OK for Argo profile (test file: /dac/aoml/13857/profiles/R13857_001.nc)
> 43/58 variables not casted otherwise !
OK for Argo trajectory (test file: /dac/aoml/5900446/5900446_Dtraj.nc)
> 59/102 variables not casted otherwise !
OK for Argo trajectory (test file: /dac/csio/2902696/2902696_Rtraj.nc)
> 58/102 variables not casted otherwise !
OK for Argo synthetic profile (test file: /dac/coriolis/3902131/3902131_Sprof.nc)
> 37/58 variables not casted otherwise !
OK for Argo synthetic profile (test file: /dac/coriolis/3902131/profiles/SD3902131_001.nc)
> 37/58 variables not casted otherwise !
OK for Argo synthetic profile (test file: /dac/coriolis/3902131/profiles/SD3902131_001D.nc)
> 37/58 variables not casted otherwise !
OK for Argo synthetic profile (test file: /dac/coriolis/6903247/profiles/SR6903247_134.nc)
> 61/114 variables not casted otherwise !
OK for Argo synthetic profile (test file: /dac/coriolis/6903247/profiles/SR6903247_134D.nc)
> 37/58 variables not casted otherwise !
OK for B-Argo profile (test file: /dac/coriolis/3902131/profiles/BR3902131_001.nc)
> 47/63 variables not casted otherwise !
OK for B-Argo profile (test file: /dac/coriolis/3902131/profiles/BR3902131_001D.nc)
> 47/63 variables not casted otherwise !
OK for B-Argo trajectory (test file: /dac/coriolis/3902131/3902131_BRtraj.nc)
> 40/63 variables not casted otherwise !
OK for B-Argo trajectory (test file: /dac/coriolis/6903247/6903247_BRtraj.nc)
> 70/131 variables not casted otherwise !
OK for Argo technical data (test file: /dac/incois/2902269/2902269_tech.nc)
> 9/10 variables not casted otherwise !
OK for Argo technical data (test file: /dac/nmdis/2901623/2901623_tech.nc)
> 9/10 variables not casted otherwise !
OK for Argo meta-data (test file: /dac/jma/4902252/4902252_meta.nc)
> 60/65 variables not casted otherwise !
OK for Argo meta-data (test file: /dac/coriolis/1900857/1900857_meta.nc)
> 60/65 variables not casted otherwise !
The API is quite simple:
ds = xr.open_dataset(file, engine='argo')
I checked that 100% of the variables are cast (as string, int, float or datetime) and that none is returned as an "object".
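For anyone who wants to reproduce that kind of check, here is a minimal sketch of one way to list the variables still carrying an object dtype. The helper name `object_variables` is hypothetical; it relies only on each variable exposing `.dtype.kind`, as numpy-backed xarray variables do, so the demo below uses simple stand-ins instead of a real file.

```python
from types import SimpleNamespace


def object_variables(data_vars):
    """Return the names of variables whose dtype kind is 'O' (Python object).

    `data_vars` is any mapping of name -> variable with a `.dtype.kind`
    attribute, for example `ds.data_vars` of an xarray Dataset.
    """
    return [name for name, var in data_vars.items() if var.dtype.kind == "O"]


# Stand-ins for two variables (hypothetical values, no netCDF file needed):
fake_vars = {
    "PRES": SimpleNamespace(dtype=SimpleNamespace(kind="f")),       # float
    "DATA_TYPE": SimpleNamespace(dtype=SimpleNamespace(kind="O")),  # object
}
print(object_variables(fake_vars))  # ['DATA_TYPE']
```

With a real file, the call would look like `object_variables(xr.open_dataset(file, engine='argo').data_vars)`, and an empty list would mean every variable was cast.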
"requests" module is missing from requirements when installing from scratch.
Warning when casting datetime:
argopy/xarray.py:150: UserWarning: Converting non-nanosecond precision datetime values to nanosecond precision. This behavior can eventually be relaxed in xarray, as it is an artifact from pandas which is now beginning to support non-nanosecond precision values. This warning is caused by passing non-nanosecond np.datetime64 or np.timedelta64 values to the DataArray or Variable constructor; it can be silenced by converting the values to nanosecond precision ahead of time.
da = da.astype(type)
Warning when casting datetime
Could you please give me the output of:
argopy.show_versions()
SYSTEM
------
commit: None
python: 3.9.16 (main, Mar 8 2023, 14:00:05)
[GCC 11.2.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-211-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8
libhdf5: 1.12.2
libnetcdf: 4.9.1
INSTALLED VERSIONS: CORE
------------------------
aiohttp : 3.8.4
argopy : 0.1.13
erddapy : 2.0.1
fsspec : 2023.5.0
netCDF4 : 1.6.3
packaging : 23.1
scipy : 1.10.1
toolz : 0.12.0
xarray : 2023.5.0
INSTALLED VERSIONS: EXT.UTIL
----------------------------
gsw : -
tqdm : -
zarr : -
INSTALLED VERSIONS: EXT.PERF
----------------------------
dask : -
distributed : -
pyarrow : -
INSTALLED VERSIONS: EXT.PLOT
----------------------------
IPython : 8.13.2
cartopy : -
ipykernel : 6.23.1
ipywidgets : -
matplotlib : -
seaborn : -
INSTALLED VERSIONS: DEV
-----------------------
black : -
bottleneck : -
cfgrib : -
cftime : 1.6.2
conda : -
flake8 : -
nc_time_axis: -
numpy : 1.24.3
pandas : 2.0.1
pip : 23.0.1
pytest : -
pytest_cov : -
pytest_env : -
pytest_localftpserver: -
setuptools : 66.0.0
sphinx : -
OK, the warning comes from the latest pandas 2.0, which raises a lot of new warnings. I'll fix this in another PR.
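In the meantime, a possible stopgap on the user side is to filter that specific message with the standard `warnings` module. This is only a sketch: the context manager name `ignore_non_nanosecond_warning` is hypothetical, and the pattern is simply the beginning of the message quoted above.

```python
import warnings
from contextlib import contextmanager

# Start of the UserWarning message reported above (matched as a regex prefix).
NON_NS_MESSAGE = r"Converting non-nanosecond precision"


@contextmanager
def ignore_non_nanosecond_warning():
    """Temporarily silence the non-nanosecond datetime UserWarning only."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message=NON_NS_MESSAGE, category=UserWarning
        )
        yield
```

It would wrap the call that triggers the warning, for example `with ignore_non_nanosecond_warning(): ds = xr.open_dataset(file, engine='argo')`. Other warnings still get through, and the previous filters are restored on exit.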