PyPRECIS
Required fixes/updates to core CORDEX notebooks
When setting up the new AFR-22 notebooks, a number of issues and potential improvements to the core EAS-22 notebooks were identified.
Errors
- [x] Inconsistent location of data (historical in WS3 but not WS2) #138
- [x] 4.2c solution is for pr rather than tm, and covers only historical rather than both historical and future #139
- [x] Wet days: currently using a threshold of 0, not 1; replace the custom aggregator with count > 1 #140
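To illustrate why the wet-day threshold matters, here is a minimal sketch in plain Python. The precipitation values are invented for illustration, and the helper name `count_wet_days` is hypothetical rather than anything taken from the notebooks. In the notebooks themselves the same test would presumably be handed to Iris's built-in COUNT aggregator rather than a custom one.

```python
# Hypothetical daily precipitation values (mm/day) for one grid cell.
daily_pr = [0.0, 0.2, 1.0, 1.5, 12.3, 0.9, 3.0]

WET_DAY_THRESHOLD = 1.0  # mm/day, the threshold requested in #140


def count_wet_days(values, threshold):
    """Count the values that are strictly greater than the threshold."""
    return sum(1 for value in values if value > threshold)


# Old behaviour: any non-zero rainfall counts as a wet day.
wet_days_old = count_wet_days(daily_pr, 0.0)
# New behaviour: only days with more than 1 mm count.
wet_days_new = count_wet_days(daily_pr, WET_DAY_THRESHOLD)

print('Wet days with threshold 0 mm: ' + str(wet_days_old))
print('Wet days with threshold 1 mm: ' + str(wet_days_new))

# With Iris, a comparable call (an assumption about how the notebook is written) is:
# cube.collapsed('time', iris.analysis.COUNT, function=lambda values: values > 1)
```

With these made-up values, drizzle days such as 0.2 mm and 0.9 mm are counted under the old threshold but not the new one.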
Improvements
- [ ] S3 and rsync: need to be clear about which one we use as standard, with a subsidiary one covering the other options #145
- [x] Future dir should be explicitly rcp85
- [x] Mean over season should be over time (this should remove redundant time dimension)
- [x] Inconsistent on use of REMO2015 in filenames
- [x] Needs to be made VERY clear when we expect something will fail (worksheet 1 - merge problem)
- [ ] Follow the approach of the scitools/iris course for examples, e.g. use %load solutions/iris_exercise_1.2a #141
Simplify
- [ ] Remove slicing section from 2.1b
- [ ] Reduce the section on shell commands - wc -l not essential for example. Consider moving out of a notebook as well
- [ ] Add plot of country X with coastlines and zoom to notebook 1
- [x] Remove f strings and remaining .format (just use concatenation) - no need to learn this
- [ ] Remove tuples - just an extra thing to learn, all can be done with lists
For adaptability
- [x] set domain name as variable at start
- [ ] Always call data directory just data (population of this dir is part of the customisation per course depending on logistics)
- [ ] Variable for city name and location
To make dealing with all of these issues easier, I will start adding a separate issue for each, which also means I can expand the explanation.
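As a rough sketch of how the adaptability items in the checklist could fit together, the block below sets the domain, data directory and city once at the top, then builds paths by concatenation and `os.path.join` (no f-strings, `.format` or tuples). All variable names, the `historical` sub-directory name and the city values are assumptions for illustration; only the example filename pattern comes from this thread.

```python
import os

# Settings a course organiser would edit once, at the top of each notebook.
DOMAIN = 'EAS-22'          # CORDEX domain name
DATADIR = 'data'           # always called just 'data'; populated per course
CITY_NAME = 'Example City' # hypothetical placeholder
CITY_LAT = 0.0             # hypothetical placeholder latitude
CITY_LON = 0.0             # hypothetical placeholder longitude

# Directory layout (sub-directory names are assumptions).
HISTDIR = os.path.join(DATADIR, DOMAIN, 'historical')
FUTUREDIR = os.path.join(DATADIR, DOMAIN, 'rcp85')

# Build a filename with plain concatenation.
gcm = 'hadgem2-es'
rcm = 'GERICS-REMO2015'
period = '2041_2060'
variable = 'tm'
filename = gcm + '.mon.' + period + '.' + rcm + '.' + variable + '.C.nc'

future_file = os.path.join(FUTUREDIR, filename)
print('Reading ' + variable + ' for ' + CITY_NAME + ' from ' + future_file)
```

Keeping every course-specific value in one place means a new domain such as AFR-22 would, under these assumptions, only need the first few lines changed.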
Queries about improvements
- Inconsistent on use of REMO2015 in filenames
  Please could you expand on this issue? Looking at the files in data_v2/EAS-22, REMO2015 files seem to follow the same naming format as non-REMO files to me, e.g.
  pr_EAS-22_MOHC-HadGEM2-ES_historical_r1i1p1_GERICS-REMO2015_v1_mon_198101-199012.nc
  pr_EAS-22_MOHC-HadGEM2-ES_historical_r1i1p1_ICTP-RegCM4-4_v0_mon_198101-199012.nc
  Diff: GERICS-REMO2015_v1 vs ICTP-RegCM4-4_v0
- Future dir should be explicitly rcp85
  data_v2/EAS-22 contains two directories; future contains data, rcp85 is empty. Do you think files & notebook paths should point to rcp85 instead of future? Or that rcp85 should be included in the filenames? (Currently e.g. hadgem2-es.mon.2041_2060.GERICS-REMO2015.tm.C.nc)
Mean over season should be over time (this should remove redundant time dimension)
Swapped out aggregated_by(['seasons']) for collapsed('time') to reduce the data to a 2D lat-lon cube.
Applied at 2.3d, 2.3e, 2.4f (e.g. hadgem_cube[0] --> hadgem_cube), and 4.1a.
Fixed in 36243c4.
Two possible issues:
- This method throws up a new UserWarning; is there anything we can do to prevent this?
  UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'time'.
    warnings.warn(msg.format(self.name()))
- seasons becomes an ugly string of multiple seasons, rather than one (could affect plotting in future):
  Scalar coordinates:
    seasons: ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|on...
    time: 1996-05-16 00:00:00, bound=(1986-10-01 00:00:00, 2006-01-01 00:00:00)
  Could be replaced with a single 'OND' using e.g.
  cube.coord('seasons').points[0] = cube.coord('seasons').points[0][:3]
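One possible way to handle both points, sketched with the standard library only. `collapse_stub` is a hypothetical stand-in that emits the same warning text as the collapse step (so the example runs without Iris), and `tidy_season_label` is an invented helper name. Whether silencing the warning is acceptable for the worksheets is a judgement call, not something this sketch settles.

```python
import warnings

# Start of the warning message we want to silence (matched as a regex).
COLLAPSE_WARNING = 'Collapsing a non-contiguous coordinate'


def collapse_stub():
    """Stand-in for cube.collapsed('time', ...): emits the same warning text."""
    warnings.warn(
        "Collapsing a non-contiguous coordinate. "
        "Metadata may not be fully descriptive for 'time'.",
        UserWarning,
    )
    return 'collapsed'


def tidy_season_label(label):
    """Reduce 'ond|ond|ond' to 'ond' when every part is identical."""
    parts = label.split('|')
    unique_parts = sorted(set(parts))
    if len(unique_parts) == 1:
        return unique_parts[0]
    # Mixed seasons: leave the label alone rather than guess.
    return label


# Silence only this specific warning, and only inside the with-block.
with warnings.catch_warnings():
    warnings.filterwarnings('ignore', message=COLLAPSE_WARNING, category=UserWarning)
    result = collapse_stub()

print('Result: ' + result)
print('Season label: ' + tidy_season_label('ond|ond|ond'))
```

Filtering on the message text keeps any other UserWarning visible, and restricting the filter to the with-block avoids hiding the warning elsewhere in the notebook.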
Tested the difference between the two methods; the result was a zero cube.
ond_agg = data_ond.aggregated_by(['seasons'], iris.analysis.MEAN) # old
ond_mean = data_ond.collapsed('time', iris.analysis.MEAN) # new
diff = ond_agg - ond_mean
print(f"diff cube: \n {diff.data.max()}")
>> 0.0
Test-ran the worksheets up to 5.2. No further downstream impacts found.
#138 Inconsistent HISTDIR - fixed in 3f592f9.
#139 4.2c solution for tm (hist & fut) - fixed in e0e96f4.
#140 Wet days >1mm - fixed in 52fc291.
Inconsistent on use of REMO2015 in filenames
All filenames (in /project/ciid/projects/PRECIS/worksheets/data_v2) and references in worksheets 1-6 appear to consistently use the full name GERICS-REMO2015. Nick thinks he fixed this as he went along in Angola, so this item is now redundant.
Future dir should be explicitly rcp85
Addressed in 2d2f7b7.
- [ ] When reviewing & merging this change, someone who can log in as user ciid will need to move data to retain functionality.
Please copy / move FUTURE data so worksheets 4-6 point to a directory containing data (FUTURE = 'data_v2/EAS/rcp85').
mv /project/ciid/projects/PRECIS/worksheets/data_v2/EAS-22/future/*.nc /project/ciid/projects/PRECIS/worksheets/data_v2/EAS-22/rcp85
rmdir /project/ciid/projects/PRECIS/worksheets/data_v2/EAS-22/future
Note: we may wish to explicitly include rcp85 in filenames (not done here).
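For anyone who prefers to do the move from Python, the following is a minimal sketch of the same two steps (move the .nc files, then remove the emptied directory). It is demonstrated on a throwaway temporary directory rather than the real project path, and the function name `move_netcdf_files` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def move_netcdf_files(source_dir, target_dir):
    """Move every .nc file from source_dir to target_dir; return the moved names."""
    source_dir = Path(source_dir)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() builds the full list first, so moving files does not disturb the loop.
    for nc_file in sorted(source_dir.glob('*.nc')):
        shutil.move(str(nc_file), str(target_dir / nc_file.name))
        moved.append(nc_file.name)
    # Like the shell rmdir, this only succeeds if the directory is now empty.
    source_dir.rmdir()
    return moved


# Demonstration on a temporary directory tree (not the real data).
demo_root = Path(tempfile.mkdtemp())
future_dir = demo_root / 'future'
rcp85_dir = demo_root / 'rcp85'
future_dir.mkdir()
rcp85_dir.mkdir()
example_name = 'hadgem2-es.mon.2041_2060.GERICS-REMO2015.tm.C.nc'
(future_dir / example_name).write_text('placeholder')

moved_files = move_netcdf_files(future_dir, rcp85_dir)
print('Moved: ' + str(moved_files))
```

Because the final `rmdir` fails on a non-empty directory, any unexpected non-.nc files left in `future` would surface as an error instead of being deleted silently.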
Remove f strings and remaining .format
Addressed in ac3b1b4 & 93b82f8.
Make explicit that worksheet 1 merge is expected to fail
Addressed in 79e0d7c. Rephrased instruction & green 'task' box added.
set domain name as variable at start
Workbooks 1-6 + solutions. Addressed in e48b1f2..f39ff5c. Also updated the time-periods key from 'future' to 'rcp85' following 2d2f7b7.
As the above PR only fixes some of the issues, I am reopening this so we still have a record of the work to be done.