cmip6-cmor-tables
Need for maintenance branches of CMIP6-cmor-tables (one per DR version)
In issue #139, we have an example of a change in CMIP6-CMOR-Tables that is not linked to a change in the Data Request version:
(D.Nadeau) .... You will need to add "int_missing_value": "-999", in the Header section of CMIP6_Ofx.json and that should solve the missing_value issue, but you will have a flag_value issue. I will update the tables to reflect this change soon.
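The manual edit described in that quote could be scripted roughly as follows. This is only a sketch: the `Header` key, the `int_missing_value` key, its value `"-999"`, and the file name `CMIP6_Ofx.json` come from the quote, while the function names and the assumption that the table is a plain JSON object are mine.

```python
import copy
import json
from pathlib import Path


def add_int_missing_value(table, value="-999"):
    """Return a copy of a table dict with int_missing_value set in its Header.

    Hypothetical helper; assumes the table is a JSON object that has
    (or may gain) a top-level "Header" section.
    """
    patched = copy.deepcopy(table)
    patched.setdefault("Header", {})["int_missing_value"] = value
    return patched


def patch_table_file(path, value="-999"):
    """Apply the same change to a table file on disk, e.g. CMIP6_Ofx.json."""
    path = Path(path)
    table = json.loads(path.read_text(encoding="utf-8"))
    patched = add_int_missing_value(table, value)
    path.write_text(json.dumps(patched, indent=4) + "\n", encoding="utf-8")
    return patched
```

As the quote says, this addresses only the missing_value issue; the flag_value issue is not touched here.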
Because CMIP6 participants will produce results that conform to various versions of the DR, and will want to publish them using up-to-date tables, this raises the need for maintenance branches of the tables, one per Data Request version, beginning with version 01.00.20. The maintenance will consist of applying to these branches all fixes and changes occurring in the tables, except those generated by DR version changes.
@taylor13 Should I recreate the 01.00.20 tables with int_missing_value and merge to master? The latest tables won't be 01.00.21 though, and people will have to check out the tags.
No, let's not do anything about the 01.00.20 tables right now. As I understand it, the new version labeling approach outlined at https://goo.gl/86C4ZB assigns versions/tags to the CMOR/PrePARE code that are separate from the table versions/tags/labels, so that users can retrieve tables separately from the code. Will that help here?
@senesis I would like to close this issue. Any objections?
Not sure that every aspect has been considered: while I understand that there is an agreement on managing a series of CMOR-tables maintenance branches, one per version of the DR, we, using version 01.00.21, face the dilemma of either (i) sticking to the raw version or (ii) using a version that takes into account some fixes to the DR that we were able to introduce in our production workflow. For instance, in DR 01.00.21, mlotst has a wrong cell_measures (areacella); we detected that, and we produce data files with the correct cell_measures (areacello). But not all centers necessarily introduced the same fixes.
Any idea on how to manage that issue? (It can occur for various centers and various DR versions.)
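To make the mlotst example concrete, a center-side fix of this kind might look like the sketch below. The `variable_entry` layout and the `"area: areacella"` string format are assumptions about the table files, and `fix_cell_measures` is a hypothetical helper, not something taken from any center's workflow.

```python
import copy


def fix_cell_measures(table, variable, wrong, right):
    """Return a copy of a table dict with one variable's cell_measures corrected.

    Assumes the table looks like
    {"variable_entry": {"<variable>": {"cell_measures": "..."}}}.
    """
    patched = copy.deepcopy(table)
    entry = patched["variable_entry"][variable]
    entry["cell_measures"] = entry["cell_measures"].replace(wrong, right)
    return patched


# Illustrative input only, not copied from a real table file.
table = {"variable_entry": {"mlotst": {"cell_measures": "area: areacella"}}}
fixed = fix_cell_measures(table, "mlotst", "areacella", "areacello")
```

Keeping such fixes as explicit, named patches on top of the raw tables would at least make it possible to compare which fixes each center applied.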
@senesis I really don't know how to manage this. I think IPSL and CNRM (and possibly GFDL) are in the unique position of relying on a quite old version of the data request.
For this particular issue, did you find a workaround, or are you waiting on us to do something?
As far as I know, GFDL now uses a version based on DR 01.00.27, so IPSL and CNRM-CERFACS are the only centers using the tables version based on DR 01.00.21.
In that case, and because IPSL and CNRM apply the same changes to those tables, I suggest that these jointly changed tables become the reference version in the repository used by the publisher (and possibly for other uses, e.g. by any party that would like to enforce QC on the data, such as DOI-related QC).
However, the number of changes depends on the version of PrePARE, because the version of PrePARE that is now distributed (with CMOR 3.3) has shortcomings that necessitated a number of table changes, which become unnecessary with some code changes already available from some developers.
It would be possible to back-tag versions using some of the code buried in https://github.com/WCRP-CMIP/CMIP6_CVs/blob/master/src/cleanupTags.py. Having said that, I believe this is a very old, and now very deprecated, issue, so I will close it. If I have missed something, please reopen.
I agree this is quite an old issue and that it should be closed.
However, it carries once again the lesson that having redundant forms of information (here the DR versions and their cmor-tables translation) is hardly tractable. Shared tools (such as PrePARE, and possibly CMOR) should rely on the upstream source, namely the DR and its API.
@senesis totally agree that we need to remove sources of duplicate information and streamline connections between sources. A challenge that you may not have considered is that many more projects than just the CMIP phases are now using the *mip-cmor-tables, which raises the question of how the DR and the *mip-cmor-tables are interrelated.
We have started work in the mip-cmor-tables to remedy this issue, with the intention of collapsing input4MIPs-cmor-tables and obs4MIPs-cmor-tables into this composite repo, and will keep working with others to reduce (and remove) duplication across the infrastructure stack. This would then logically lead to the DR becoming a downstream dependent of these sources, dramatically simplifying the role of the DR stack, which would then just be a tool that provides connections (lists) and linkages between MIP experiments and the variables contained in the mip-cmor-tables.
In my view a single repository of variables (clearly and unambiguously defined with all required metadata) can serve everyone, and this repository doesn't need to be "hosted" by the data request or necessarily in CMOR tables. Rather the data request can draw on the variables in the repository and the CMOR tables can organize the variables as makes sense into groups convenient to those using CMOR and PrePARE. Other parts of the infrastructure can access the same repository as needed and they don't have to use the data request API.
Of course the variable repository will not have any information about which variables in it should be reported for which experiments.
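The separation described above can be sketched as three independent structures. Everything in this sketch is hypothetical (the class, the dictionaries, and the experiment-to-variable lists are illustrative, not taken from the data request or from any existing repository); it only shows that groupings and requests can refer to variables by name without duplicating their metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """One unambiguous variable definition (a made-up subset of metadata)."""
    name: str
    units: str
    cell_measures: str


# Single repository of variables: the only place metadata lives.
REPOSITORY = {
    "mlotst": Variable("mlotst", "m", "area: areacello"),
    "tas": Variable("tas", "K", "area: areacella"),
}

# CMOR-style tables only group variable names for convenience.
CMOR_TABLES = {"Omon": ["mlotst"], "Amon": ["tas"]}

# The data request only links experiments to variable names
# (illustrative lists, not the real request).
DATA_REQUEST = {"historical": ["mlotst", "tas"], "piControl": ["tas"]}


def variables_for_experiment(experiment):
    """Resolve a data-request entry against the repository."""
    return [REPOSITORY[name] for name in DATA_REQUEST[experiment]]


def table_entries(table):
    """Resolve a table grouping against the same repository."""
    return {name: REPOSITORY[name] for name in CMOR_TABLES[table]}
```

With this layout, a correction such as the mlotst cell_measures fix would be made once in the repository and picked up by both the tables and the data request.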