Replace OPeNDAP datasets with Xarray tutorial datasets in docs
Description
- Closes #277
- Closes #675
Checklist
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my own code
- [ ] My changes generate no new warnings
- [ ] Any dependent changes have been merged and published in downstream modules
If applicable:
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass with my changes (locally and CI/CD build)
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 100.00%. Comparing base (c52b5a7) to head (b8b200a). Report is 1 commit behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #705 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 15 16 +1
Lines 1621 1658 +37
=========================================
+ Hits 1621 1658 +37
For some of these examples, we probably need to host ESGF datasets in an xcdat-data repo, similar to https://github.com/pydata/xarray-data. The datasets at xarray-data are subsetted on lat/lon, which means I can't plot a global color map -- the plots look weird -- and generating dummy datasets in memory is not simple either (e.g., producing realistic tas data in a NumPy array).
An added benefit of this approach is that we can use real-world datasets, which can also help standardize our approach to testing.
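For reference, a minimal sketch of the subsetting approach discussed above, assuming the OPeNDAP endpoint is reachable and the local netCDF backend supports DAP; the time window and output filename are illustrative, not the exact values used:

```python
import xarray as xr

# One of the ESGF OPeNDAP URLs listed below (Gentle Introduction notebook).
url = (
    "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/"
    "ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/"
    "tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
)
ds = xr.open_dataset(url)

# Subset on time only (~5 years here, an arbitrary window) so the file stays
# small while keeping the full global lat/lon grid for plotting.
ds_sub = ds.sel(time=slice("2000-01-01", "2004-12-31"))

# Write out a netCDF file suitable for hosting in the xcdat-data repo.
ds_sub.to_netcdf("tas_Amon_ACCESS-ESM1-5_historical_sub.nc")
```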
My proposed solution
- [x] 1. Get the list of datasets used in the notebooks -- figure out which ones overlap between notebooks.
# Gentle Introduction
* "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
# xCDAT utilities
* "https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/E3SM/1_0/amip_1850_aeroF/1deg_atm_60-30km_ocean/atmos/180x360/time-series/mon/ens2/v3/TS_187001_189412.nc"
* "https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/E3SM/1_0/amip_1850_aeroF/1deg_atm_60-30km_ocean/atmos/180x360/time-series/mon/ens2/v3/TS_189501_191912.nc",
# Spatial Averaging
* "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
* "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/pr/gn/v20200605/pr_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
# Temporal Averaging
* "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
* "https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/3hr/tas/gn/v20200605/tas_3hr_ACCESS-ESM1-5_historical_r10i1p1f1_gn_201001010300-201501010000.nc"
# Climatologies and departures
* "http://esgf.nci.org.au/thredds/dodsC/master/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
# This dataset should not be downloaded; we can subset it instead.
* "http://esgf.nci.org.au/thredds/dodsC/master/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/3hr/tas/gn/v20200605/tas_3hr_ACCESS-ESM1-5_historical_r10i1p1f1_gn_201001010300-201501010000.nc"
# Horizontal regridding
* "http://aims3.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/CCCma/CanESM5/historical/r13i1p1f1/Amon/tas/gn/v20190429/tas_Amon_CanESM5_historical_r13i1p1f1_gn_185001-201412.nc"
* "http://aims3.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/abrupt-4xCO2/r1i1p1f1/day/tas/gr2/v20180701/tas_day_GFDL-CM4_abrupt-4xCO2_r1i1p1f1_gr2_00010101-00201231.nc"
# Vertical regridding
* "http://aims3.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/so/gn/v20190308/so_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc",
* "http://aims3.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/thetao/gn/v20190308/thetao_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc",
* "http://aims3.llnl.gov/thredds/dodsC/css03_data/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/abrupt-4xCO2/r1i1p1f1/day/tas/gr2/v20180701/tas_day_GFDL-CM4_abrupt-4xCO2_r1i1p1f1_gr2_00010101-00201231.nc"
- [x] 2. Host the following datasets on xcdat-data -- subsetted on time to keep each file under 100 MB (maybe 3-5 years?)
- [x] 3. Update `xc.tutorial.open_dataset()` with paths to these files (see the usage sketch below)
- [x] 4. Update Jupyter Notebook examples. -- IN PROGRESS
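A minimal usage sketch for step 3, assuming the tutorial module mirrors `xarray.tutorial.open_dataset()` and that the subsetted files are registered under keys in xcdat-data; the key name below is illustrative, not necessarily the final one:

```python
import xcdat as xc

# Downloads (and caches) the subsetted sample file from the xcdat-data repo,
# analogous to xarray.tutorial.open_dataset(). "tas_amon_access" is an
# assumed key name for the ACCESS-ESM1-5 monthly tas file.
ds = xc.tutorial.open_dataset("tas_amon_access")

# The notebook examples can then run unchanged, e.g. a global spatial average.
ds_global_avg = ds.spatial.average("tas")
```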
@tomvothecoder At a quick glance, I don't see any obvious issues! The notebooks look good to me. It's great to leverage xarray's sample datasets so we don't have to maintain our own. Thank you for your work on this PR!
Thanks for the review @lee1043! I actually decided to create xCDAT sample datasets (https://github.com/xCDAT/xcdat-data), which contain the same ESGF datasets but subsetted. This allows us to keep the same examples in the notebooks. I found that using the xarray sample datasets required more significant changes to the notebooks.
@tomvothecoder If maintaining our own sample datasets is not a huge effort, I am not opposed to that. Thanks a lot!