
Required fixes/updates to core CORDEX notebooks

Open nhsavage opened this issue 3 years ago • 10 comments

When setting up the new AFR-22 notebooks, a number of issues and potential improvements to the core EAS-22 notebooks were identified.

Errors

  • [x] Inconsistent location of data (historical in WS3 but not WS2) #138
  • [x] 4.2c solution is for pr, not tm, and covers only historical, not future as well #139
  • [x] Wet days - using threshold of 0, not 1 - replace custom aggregator with count > 1 #140
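
To illustrate why the wet-day threshold matters, here is a minimal pure-Python sketch. The precipitation values and the `count_wet_days` helper are made up for illustration and are not taken from the notebooks. In Iris itself, the equivalent is likely a `collapsed` call with `iris.analysis.COUNT` and a `function` argument such as `lambda values: values > 1`.

```python
# Hypothetical daily precipitation totals (mm/day) for a single grid cell.
daily_pr = [0.0, 0.2, 0.9, 1.0, 1.5, 12.3, 0.0]


def count_wet_days(values, threshold):
    """Count the days whose precipitation is strictly above the threshold."""
    return sum(1 for value in values if value > threshold)


# Threshold of 0: any trace of drizzle counts as a wet day.
wet_days_old = count_wet_days(daily_pr, 0.0)
# Threshold of 1 mm/day: only days with more than 1 mm count.
wet_days_new = count_wet_days(daily_pr, 1.0)

print('threshold 0:', wet_days_old)
print('threshold 1:', wet_days_new)
```

With these sample values the two thresholds give clearly different counts, which is the discrepancy #140 addresses.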

Improvements

  • [ ] S3 and rsync - need clarity on which one we are using in the standard version, plus a subsidiary one with other options #145
  • [x] Future dir should be explicitly rcp85
  • [x] Mean over season should be over time (this should remove redundant time dimension)
  • [x] Inconsistent on use of REMO2015 in filenames
  • [x] Needs to be made VERY clear when we expect something will fail (worksheet 1 - merge problem)
  • [ ] Follow approach of scitools/iris course for examples, e.g. use %load solutions/iris_exercise_1.2a #141

Simplify

  • [ ] Remove slicing section from 2.1b
  • [ ] Reduce the section on shell commands - wc -l not essential for example. Consider moving out of a notebook as well
  • [ ] Add plot of country X with coastlines and zoom to notebook 1
  • [x] Remove f strings and remaining .format (just use concatenation) - no need to learn this
  • [ ] Remove tuples - just an extra thing to learn, all can be done with lists

For adaptability

  • [x] set domain name as variable at start
  • [ ] Always call data directory just data (population of this dir is part of the customisation per course depending on logistics)
  • [ ] Variable for city name and location
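
A rough sketch of how the adaptability items could fit together with the "concatenation only" item: the domain is set once at the top, the data directory is always `data`, and paths are built without f-strings or `.format`. The variable names (`DOMAIN`, `DATADIR`, `SCENARIO`) and the `historical` sub-directory are assumptions for illustration, not the names used in the notebooks.

```python
import os

# Assumed names, set once at the start of a notebook.
DOMAIN = 'EAS-22'      # CORDEX domain
DATADIR = 'data'       # data directory is always called "data"
SCENARIO = 'rcp85'     # explicit scenario name rather than "future"

gcm = 'MOHC-HadGEM2-ES'
rcm = 'GERICS-REMO2015_v1'

# Build the filename by plain concatenation (no f-strings, no .format).
filename = ('pr_' + DOMAIN + '_' + gcm + '_historical_r1i1p1_'
            + rcm + '_mon_198101-199012.nc')

# Directory layout below DATADIR is an assumption for this sketch.
histdir = os.path.join(DATADIR, DOMAIN, 'historical')
futdir = os.path.join(DATADIR, DOMAIN, SCENARIO)

print(os.path.join(histdir, filename))
print(futdir)
```

Switching to another domain (e.g. AFR-22) would then only need a change to `DOMAIN`.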

nhsavage avatar May 09 '22 10:05 nhsavage

To make dealing with all of these issues easier, I will start adding a separate issue for each, which also means I can expand the explanation.

nhsavage avatar Jul 13 '22 08:07 nhsavage

Queries about improvements

  • Inconsistent on use of REMO2015 in filenames

    Please could you expand on this issue? Looking at the files in data_v2/EAS-22, REMO2015 files seem to follow the same naming format as non-REMO files to me, e.g.

    pr_EAS-22_MOHC-HadGEM2-ES_historical_r1i1p1_GERICS-REMO2015_v1_mon_198101-199012.nc
    pr_EAS-22_MOHC-HadGEM2-ES_historical_r1i1p1_ICTP-RegCM4-4_v0_mon_198101-199012.nc

    Diff: GERICS-REMO2015_v1 vs ICTP-RegCM4-4_v0

  • Future dir should be explicitly rcp85

    data_v2/EAS-22 contains two directories; future contains data, rcp85 is empty. Do you think files & notebook paths should point to rcp85 instead of future? Or that rcp85 should be included in the filenames? (Currently e.g. hadgem2-es.mon.2041_2060.GERICS-REMO2015.tm.C.nc)

rosannaamato avatar Aug 12 '22 12:08 rosannaamato

Mean over season should be over time (this should remove redundant time dimension)

Swapped out aggregated_by(['seasons']) for collapsed('time') to reduce to a lat-lon 2D cube. Applied at 2.3d, 2.3e, 2.4f (e.g. hadgem_cube[0] --> hadgem_cube), and 4.1a. Fixed in 36243c4.

Two possible issues:

  1. This method throws up a new UserWarning. Is there anything we can do to prevent this?

     UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'time'.
       warnings.warn(msg.format(self.name()))

  2. seasons becomes an ugly string of multiple seasons, rather than one (could affect plotting in future)

    Scalar coordinates:
           seasons: ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|ond|on...
           time: 1996-05-16 00:00:00, bound=(1986-10-01 00:00:00, 2006-01-01 00:00:00)
    

    Could be replaced with a single 'OND' using e.g. cube.coord('seasons').points[0] = cube.coord('seasons').points[0][:3]
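
On the first question, one possible option is to filter that specific warning by its message using the standard `warnings` module. The sketch below does not call Iris: `collapse_stand_in` is a hypothetical stand-in that emits the same warning text, so the filter can be demonstrated on its own. Whether silencing the warning is appropriate in the worksheets is a separate judgment.

```python
import warnings


def collapse_stand_in():
    """Stand-in for cube.collapsed('time', ...): emits the same warning text."""
    warnings.warn(
        "Collapsing a non-contiguous coordinate. "
        "Metadata may not be fully descriptive for 'time'.",
        UserWarning,
    )
    return 'collapsed'


with warnings.catch_warnings(record=True) as caught:
    # Record every warning by default...
    warnings.simplefilter('always')
    # ...except the one whose message starts with this text.
    warnings.filterwarnings(
        'ignore',
        message='Collapsing a non-contiguous coordinate',
        category=UserWarning,
    )
    result = collapse_stand_in()

print('warnings recorded:', len(caught))
```

In a notebook, the same `filterwarnings` call could be wrapped around just the `collapsed` line so that other warnings still appear.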

Tested difference between methods, returned zero cube.

ond_agg =  data_ond.aggregated_by(['seasons'], iris.analysis.MEAN)    # old
ond_mean = data_ond.collapsed('time', iris.analysis.MEAN)             # new
diff = ond_agg - ond_mean
print(f"diff cube: \n {diff.data.max()}")
>> 0.0
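
One caveat on this kind of check: a maximum of 0.0 only rules out positive differences, so the maximum absolute difference is a slightly stronger test. A minimal pure-Python sketch with made-up values (not the worksheet data) shows the distinction:

```python
# Hypothetical 2x2 fields standing in for ond_agg.data and ond_mean.data.
old_values = [[1.0, 2.0], [3.0, 4.0]]
new_values = [[1.0, 2.5], [3.0, 4.0]]

# Element-wise differences, flattened into one list.
diffs = [
    old - new
    for old_row, new_row in zip(old_values, new_values)
    for old, new in zip(old_row, new_row)
]

# The plain maximum misses a cell where the new value is larger.
max_diff = max(diffs)
# The maximum absolute difference catches differences of either sign.
max_abs_diff = max(abs(d) for d in diffs)

print('max diff:', max_diff)
print('max abs diff:', max_abs_diff)
```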

Test ran worksheets until 5.2. No further downstream impacts found.

rosannaamato avatar Aug 12 '22 16:08 rosannaamato

  • #138 Inconsistent HISTDIR - fixed in 3f592f9.
  • #139 4.2c solution for tm (hist & fut) - fixed in e0e96f4.
  • #140 Wet days >1mm - fixed in 52fc291.

rosannaamato avatar Aug 12 '22 16:08 rosannaamato

Inconsistent on use of REMO2015 in filenames

All filenames (in /project/ciid/projects/PRECIS/worksheets/data_v2) and references in worksheets 1-6 appear to consistently use the full name GERICS-REMO2015. Nick thinks he fixed this as he went along in Angola, so this item is now redundant.

rosannaamato avatar Sep 05 '22 10:09 rosannaamato

Future dir should be explicitly rcp85

Addressed in 2d2f7b7.

  • [ ] When reviewing & merging this change, someone who can log in as user: ciid will need to move data to retain functionality.

Please copy / move FUTURE data so worksheets 4-6 point to a directory containing data (FUTURE = 'data_v2/EAS/rcp85').

mv /project/ciid/projects/PRECIS/worksheets/data_v2/EAS-22/future/*.nc /project/ciid/projects/PRECIS/worksheets/data_v2/EAS-22/rcp85
rmdir /project/ciid/projects/PRECIS/worksheets/data_v2/EAS-22/future
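
If it is easier to do the move from Python, the same two steps could be sketched as below. This is an illustrative equivalent only: `move_netcdf_files` is a hypothetical helper, and the demonstration runs in a temporary directory rather than the real data_v2 tree.

```python
import shutil
import tempfile
from pathlib import Path


def move_netcdf_files(src_dir, dst_dir):
    """Move every *.nc file from src_dir to dst_dir, then remove src_dir."""
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for nc_file in sorted(src.glob('*.nc')):
        shutil.move(str(nc_file), str(dst / nc_file.name))
        moved.append(nc_file.name)
    # Like rmdir, this raises OSError if src still contains other files.
    src.rmdir()
    return moved


# Demonstration in a throwaway directory, not the real data location.
with tempfile.TemporaryDirectory() as tmp:
    future = Path(tmp) / 'EAS-22' / 'future'
    rcp85 = Path(tmp) / 'EAS-22' / 'rcp85'
    future.mkdir(parents=True)
    rcp85.mkdir()
    (future / 'hadgem2-es.mon.2041_2060.GERICS-REMO2015.tm.C.nc').touch()

    moved = move_netcdf_files(future, rcp85)
    future_exists = future.exists()
    rcp85_contents = sorted(p.name for p in rcp85.iterdir())

print(moved)
```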

Note: we may wish to explicitly include rcp85 in filenames (not done here).

rosannaamato avatar Sep 05 '22 11:09 rosannaamato

Remove f strings and remaining .format

Addressed in ac3b1b4 & 93b82f8.

rosannaamato avatar Sep 05 '22 14:09 rosannaamato

Make explicit that worksheet 1 merge is expected to fail

Addressed in 79e0d7c. Rephrased instruction & green 'task' box added.

rosannaamato avatar Sep 05 '22 15:09 rosannaamato

set domain name as variable at start

Applies to workbooks 1-6 + solutions. Addressed in e48b1f2..f39ff5c. Also updated time-periods key from 'future' to 'rcp85' following 2d2f7b7.

rosannaamato avatar Sep 06 '22 09:09 rosannaamato

As the above PR only fixes some of the issues, I am reopening this so we still have a record of the work to be done.

nhsavage avatar Sep 06 '22 14:09 nhsavage