[REVIEW]: The University of Toronto Climate Downscaling Workflow: Tools and Resources for Climate Change Impact Analysis
Submitting author: @mikemorris12 (Michael Morris) Repository: https://github.com/mikemorris12/UTCDW_Guidebook/ Branch with paper.md (empty if default branch): master Version: v1.1.1 Editor: @acocac Reviewers: @aulemahal, @brian-rose Archive: 10.5281/zenodo.12785645 Paper kind: learning module
Status
Status badge code:
HTML: <a href="https://jose.theoj.org/papers/2c5db4cc1b1668693b3aabb62501fe58"><img src="https://jose.theoj.org/papers/2c5db4cc1b1668693b3aabb62501fe58/status.svg"></a>
Markdown: [](https://jose.theoj.org/papers/2c5db4cc1b1668693b3aabb62501fe58)
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@aulemahal & @brian-rose, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:
@editorialbot generate my checklist
The reviewer guidelines are available here: https://openjournals.readthedocs.io/en/jose/reviewer_guidelines.html. Any questions/concerns please let @acocac know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@editorialbot commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@editorialbot generate pdf
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1017/9781108601269 is OK
- 10.21105/jose.00100 is OK
- 10.1017/9781107588783 is OK
- 10.5194/gmd-9-1937-2016 is OK
MISSING DOIs
- No DOI given, and none found for title: XClim Official Documentation
INVALID DOIs
- None
Software report:
github.com/AlDanial/cloc v 1.90 T=0.42 s (576.3 files/s, 421510.0 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
HTML 41 6668 288 27461
Jupyter Notebook 83 0 88840 17821
JavaScript 21 4573 4489 16561
CSS 12 811 115 2904
PO File 45 1007 0 2284
YAML 6 8 6 1203
TeX 2 32 0 494
Markdown 25 316 0 447
Python 2 119 281 250
SVG 5 0 1 29
-------------------------------------------------------------------------------
SUM: 242 13534 94020 69454
-------------------------------------------------------------------------------
Commit count by author:
156 mike-morris-codes
15 Michale Morris
Paper file info:
📄 Wordcount for paper.md is 1069
✅ The paper includes a Statement of need section
License info:
🔴 License found: Creative Commons Attribution Share Alike 4.0 International (Not OSI approved)
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
👋 @aulemahal @brian-rose we will conduct the review in this issue.
Please read through the above information and let me know if you have any questions about the review process.
Thank you 🙏
Review checklist for @aulemahal
Conflict of interest
- [x] As the reviewer I confirm that I have read the JOSE conflict of interest policy and that there are no conflicts of interest for me to review this work.
Code of Conduct
- [x] I confirm that I read and will adhere to the JOSE code of conduct.
General checks
- [x] Repository: Is the source for this learning module available at the https://github.com/mikemorris12/UTCDW_Guidebook/?
- [x] License: Does the repository contain a plain-text LICENSE file with the contents of a standard license? (OSI-approved for code, Creative Commons for content)
- [x] Version: Does the release version given match the repository release?
- [x] Authorship: Has the submitting author (@mikemorris12) made visible contributions to the module? Does the full list of authors seem appropriate and complete?
Documentation
- [x] A statement of need: Do the authors clearly state the need for this module and who the target audience is?
- [x] Installation instructions: Is there a clearly stated list of dependencies?
- [x] Usage: Does the documentation explain how someone would adopt the module, and include examples of how to use it?
- [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the module 2) Report issues or problems with the module 3) Seek support
Pedagogy / Instructional design (Work-in-progress: reviewers, please comment!)
- [x] Learning objectives: Does the module make the learning objectives plainly clear? (We don't require explicitly written learning objectives; only that they be evident from content and design.)
- [x] Content scope and length: Is the content substantial for learning a given topic? Is the length of the module appropriate?
- [x] Pedagogy: Does the module seem easy to follow? Does it observe guidance on cognitive load? (working memory limits of 7 +/- 2 chunks of information)
- [x] Content quality: Is the writing of good quality, concise, engaging? Are the code components well crafted? Does the module seem complete?
- [x] Instructional design: Is the instructional design deliberate and apparent? For example, exploit worked-example effects; effective multi-media use; low extraneous cognitive load.
JOSE paper
- [x] Authors: Does the
paper.mdfile include a list of authors with their affiliations? - [x] A statement of need: Does the paper clearly state the need for this module and who the target audience is?
- [x] Description: Does the paper describe the learning materials and sequence?
- [x] Does it describe how it has been used in the classroom or other settings, and how someone might adopt it?
- [x] Could someone else teach with this module, given the right expertise?
- [x] Does the paper tell the "story" of how the authors came to develop it, or what their expertise is?
- [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
@editorialbot commands
Hello @acocac, here are the things you can ask me to do:
# List all available commands
@editorialbot commands
# Add to this issue's reviewers list
@editorialbot add @username as reviewer
# Remove from this issue's reviewers list
@editorialbot remove @username from reviewers
# Get a list of all editors's GitHub handles
@editorialbot list editors
# Assign a user as the editor of this submission
@editorialbot assign @username as editor
# Remove the editor assigned to this submission
@editorialbot remove editor
# Remind an author, a reviewer or the editor to return to a review after a
# certain period of time (supported units days and weeks)
@editorialbot remind @reviewer in 2 weeks
# Adds a checklist for the reviewer using this command
@editorialbot generate my checklist
# Set a value for version
@editorialbot set v1.0.0 as version
# Set a value for branch
@editorialbot set jose-paper as branch
# Set a value for repository
@editorialbot set https://github.com/organization/repo as repository
# Set a value for the archive DOI
@editorialbot set 10.5281/zenodo.6861996 as archive
# Mention the EiCs for the correct track
@editorialbot ping track-eic
# Run checks and provide information on the repository and the paper file
@editorialbot check repository
# Check the references of the paper for missing DOIs
@editorialbot check references
# Generates the pdf paper
@editorialbot generate pdf
# Recommends the submission for acceptance
@editorialbot recommend-accept
# Generates a LaTeX preprint file
@editorialbot generate preprint
# Flag submission with questionable scope
@editorialbot query scope
# Get a link to the complete list of reviewers
@editorialbot list reviewers
# Creates a post-review checklist with editor and authors tasks
@editorialbot create post-review checklist
# Open the review issue
@editorialbot start review
@editorialbot remind @brian-rose in 1 day
Reminder set for @brian-rose in 1 day
Hi @aulemahal, thanks for going through the review check list. May I ask if you can expand further (add comments) of the unchecked items, particularly in the pedagogy section? This info might be helpful for the authors to implement changes if necessary. Thank you.
Hi! My review is still in progress. I have gone through the 3 first chapters of the document, I plan to finish the review this week. Sorry for the delay!
Hi! My review is still in progress. I have gone through the 3 first chapters of the document, I plan to finish the review this week. Sorry for the delay!
Thanks for the update 🙏
:wave: @brian-rose, please update us on how your review is going (this is an automated reminder).
Review checklist for @brian-rose
Conflict of interest
- [x] As the reviewer I confirm that I have read the JOSE conflict of interest policy and that there are no conflicts of interest for me to review this work.
Code of Conduct
- [x] I confirm that I read and will adhere to the JOSE code of conduct.
General checks
- [x] Repository: Is the source for this learning module available at the https://github.com/mikemorris12/UTCDW_Guidebook/?
- [x] License: Does the repository contain a plain-text LICENSE file with the contents of a standard license? (OSI-approved for code, Creative Commons for content)
- [x] Version: Does the release version given match the repository release?
- [x] Authorship: Has the submitting author (@mikemorris12) made visible contributions to the module? Does the full list of authors seem appropriate and complete?
Documentation
- [x] A statement of need: Do the authors clearly state the need for this module and who the target audience is?
- [x] Installation instructions: Is there a clearly stated list of dependencies?
- [x] Usage: Does the documentation explain how someone would adopt the module, and include examples of how to use it?
- [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the module 2) Report issues or problems with the module 3) Seek support
Pedagogy / Instructional design (Work-in-progress: reviewers, please comment!)
- [x] Learning objectives: Does the module make the learning objectives plainly clear? (We don't require explicitly written learning objectives; only that they be evident from content and design.)
- [x] Content scope and length: Is the content substantial for learning a given topic? Is the length of the module appropriate?
- [x] Pedagogy: Does the module seem easy to follow? Does it observe guidance on cognitive load? (working memory limits of 7 +/- 2 chunks of information)
- [x] Content quality: Is the writing of good quality, concise, engaging? Are the code components well crafted? Does the module seem complete?
- [x] Instructional design: Is the instructional design deliberate and apparent? For example, exploit worked-example effects; effective multi-media use; low extraneous cognitive load.
JOSE paper
- [x] Authors: Does the
paper.mdfile include a list of authors with their affiliations? - [x] A statement of need: Does the paper clearly state the need for this module and who the target audience is?
- [x] Description: Does the paper describe the learning materials and sequence?
- [x] Does it describe how it has been used in the classroom or other settings, and how someone might adopt it?
- [x] Could someone else teach with this module, given the right expertise?
- [x] Does the paper tell the "story" of how the authors came to develop it, or what their expertise is?
- [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
A couple of quick issues I noted in a first pass, before taking a deeper dive into the content:
- The repository does not have any releases, so it does not match the given v1.0 version number.
- One of the references in the JOSE paper is to the documentation for the xclim package. It's a valid link but not a DOI. The suggestions within the xclim docs is to cite their JOSS paper (https://doi.org/10.21105/joss.05415). I think that would be a better choice.
I have finished going through the guidebook!
From the review checklist, the only remaining "issue" is indeed that there seems to be no versioning on the guidebook repo. Otherwise, I find this document to be quite interesting. It is well written and complete. It is long, but (almost) no parts seemed superfluous. The topic is complex and to learn both the theory and the practical stuff, a user needs to take their time going though the book.
The following comments are not really publication-blocking to me.
The only section I found tedious was the data download. Station stuff is concise enough with ec3 and the usage of siphon on PAVICS is clear too. However, all those custom functions with request for the other dataset make the code look overly complex (too me). I would suggest the following ideas to make this part less cumbersome:
- Use
intake-esmwith Google's CMIP6 catalog. See this doc - Redirect the readers to the HTML interfaces of ESGF and quickly show how to download the netCDFs by hand or by generated
wgetscripts (or even globus).
All my other comments are on details of the python code. Should I put them elsewhere ? As issues in the original repo ?
- I think there is too much usage of
pandas. Xarray and Pandas have very similar but different APIs, as to not confuse the readers, I would suggest showing early in the document how to transformDataFramesintoDatasetscleanly and only computing and plotting with xarray from then on. - Similarly,
dask.array.apply_along_axisis used in many places. I would suggest usingxr.apply_ufuncinstead (withvectorize=True). Also, the first place it is used (section 3.2), the data isn't even in a dask array. - OpenDAP : It is mentionned that OpenDAP and PAVICS's THREDDS server are limited when it comes to large datasets. That's only true because no chunking is being given to
xr.open_dataset. PAVICS suggests some chunking for its principal datasets and for others,chunks='auto'might be ok for what this guidebook tries to show. - For the observations, you could explore using the AHCCD dataset on PAVICS instead of downloading raw station data.
- However, this previous suggestion would require some quick explanation on how dask works and how to configure it. Showing to do
dask.config.set(num_workers=XX)might save some headaches and machine jamming. A bonus section or links to tutorials about the distributedClient(and its dashboard!) might be useful. - PCIC's download URL is down. Their website has been reorganized and the data is now accessible through an interactive portal (which resonates with my previous point).
xclim.ensembles.create_ensemblecan merge datasets along dimensions liketime, the input arguments simply need to be given appropriately.- Please use
xclim.atmosinstead ofxclim.indices. - Some cells have long content. Consider making multiline calls, as when reading the generated HTML, it is sometimes nicer if the user doesn't have to scroll left and right.
Small grammatical fix in the JOSE paper text: in the Guidebook Content section, line 51 of the rendered PDF. Replace
Chapter 2 explains how climate change projections are made and what their limitations.
with
Chapter 2 explains how climate change projections are made and what their limitations are.
I have finished going through the entire guidebook.
Overall I found the material to be well-organized and readable. The scope seems reasonable relative to the stated needs and audience. It's not a short read! But seems like a valuable teaching resource. Nice work!
Although complete reproducibility of the code examples doesn't seem to be in the list of review requirements, I thought it would be useful to take this opportunity to run through all the content and flag anything that needs updating. I opened a few issues (linked above) describing the problems I encountered. At least one of these issues (with the PCIC data access) was already flagged by @aulemahal. The issue with unit errors in xlim is causing failures in several notebooks.
I tend to agree with @aulemahal about the rather cumbersome data downloads. There is an actively maintained tutorial on accessing CMIP6 data on Google cloud with intake-esm in the Project Pythia CMIP6 Cookbook, perhaps this is helpful.
In terms of checklist items, I left unchecked "Version", "Installation Instructions" and "References" for reasons described above.
Thanks @brian-rose and @aulemahal for your helpful feedback! Apologies for all of the Python bugs, we've had quite a few users run the code without error so I wonder if there are package version issues that need to be sorted - I'll look into it. @acocac how do we proceed from here? I'm not used to this sort of open review process, so do I start making the revisions right away or will you write an editor's decision like a traditional journal? I'll wait to hear from you before I proceed.
Thanks @brian-rose and @aulemahal for your helpful feedback! Apologies for all of the Python bugs, we've had quite a few users run the code without error so I wonder if there are package version issues that need to be sorted - I'll look into it. @acocac how do we proceed from here? I'm not used to this sort of open review process, so do I start making the revisions right away or will you write an editor's decision like a traditional journal? I'll wait to hear from you before I proceed.
Hi @mikemorris12, following the review process in the Submitting a paper to JOSE section, you should:
- Implement changes in the JOSE-submission branch.
- Respond to reviewer-raised issues (if any are raised) on the submission repository’s issue tracker. Reviewer and editor contributions, like any other contributions, should be acknowledged in the repository i.e. tag @[reviewer-github-handle] in commit messages.
- Upon successful completion of the review, merge the JOSE-submission branch into master and release a new version. I suggest Version 1.1.0 due to the minor nature of the comments (e.g. improve the data download using
intake-esm). Please refer to semantic versioning 2.0.0 for further info. - Remember to deposit a copy of your (updated) repository with a data-archiving service such as Zenodo or figshare, get a DOI for the archive, and update the review issue thread with the DOI.
- Tag me when you complete above steps. Then, I'll suggest the reviewers to revisit their checklists.
Hi @acocac, I think I've addressed each of the reviewer’s comments. Here is a line-by-line response:
For @aulemahal ‘s comments:
- Data download: Instead of replacing the
esgf-pyclientand Google Cloud data access tutorials, I’ve added a third one (Section 3.6) which shows users how to useintake-esmto access data. I’ve also mentioned in the preamble of Section 3.4 (theesgf-pyclientand model analysis tutorial) that these other options exist and are easier to use. - Section 3.4 links to the ESGF web interface for searching for data.
- Use of pandas: I agree that
xarrayis the standard (and superior) package for working with climate data. However, many new users are experienced working with tabular data but have no experience withxarray, which is why we started the tutorial using pandas to manipulate the station data. - All instances of
dask.array.apply_along_axishave been replaced withxr.apply_ufunc. - I’ve added mention in Section 3.2 that using
chunks=’auto’allows the user to load larger amounts of data from PAVICS via OpenDAP. - I’ve added instructions on accessing the AHCCD data to Section 3.1.
- Section 3.2 links to the
xarrayanddaskdocumentation on parallelization. - I’ve changed the PCIC data URL to the new one.
- I tried using
xclim.ensembles.create_ensembleto merge two datasets along thetimedimension, but I couldn’t get it to work properly. The resulting dataset had the wrong time values for the latter half of the record. For now, I’ve mentioned that it’s possible, but I can’t add instructions on how to do it since I couldn’t get it to work. - All uses of
xclim.indiceshave been switched toxclim.indicators.atmos. - I’ve shortened many lines which were previously too long and required scrolling.
For @brian-rose ‘s comments:
- I’ve fixed the grammatical error and the
xclimcitation in the manuscript. - I’ve responded to the issues opened on the UTCDW Guidebook repository; they’ve all been closed.
- I’ve added Section 3.6 which shows users how to access CMIP6 data using
intake-esm.
A question for @acocac is when exactly I should merge the changes into master, create the new release, and deposit the repo in Zenodo. You said, “tag me when you complete the above steps”, but you also said that branch merge and the Zenodo deposit should take place after successful completion of the review. I assume that the reviewers need to revisit their checklists before the review is considered complete. Please advise on how to proceed.
A question for @acocac is when exactly I should merge the changes into master, create the new release, and deposit the repo in Zenodo. You said, “tag me when you complete the above steps”, but you also said that branch merge and the Zenodo deposit should take place after successful completion of the review. I assume that the reviewers need to revisit their checklists before the review is considered complete. Please advise on how to proceed.
@mikemorris12 I'll ask the reviewers to confirm they are satisfied and recommend publication. Then, you should complete the final steps incl. branch merge.
@brian-rose @aulemahal - can you confirm if the above comments/implementations by the author are ok for you? Would you recommend the work for publication?
Thank you!
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@acocac I confirm that I am satisfied with the author's revisions, and I recommend this work for publication in JOSE!
@acocac I also confirm that I am satisfied. And I recommend this work for publication in JOSE.