
Duplicate the IWXXM code lists currently under 306, 49-2, common and bufr4/codeflag and include them under iwxxm

Open blchoy opened this issue 7 months ago • 9 comments

Details

IWXXM code lists were previously published under the following URIs:

  1. http://codes.wmo.int/306/4678... - The contents come from Table 4678 of the Manual on Codes, WMO No.306 Vol I.1, Part A - Alphanumeric Codes
  2. http://codes.wmo.int/49-2... - The contents come from the Technical Regulations, Basic Documents No. 2, WMO No.49, Vol II - Meteorological Service for International Air Navigation
  3. http://codes.wmo.int/common/nil... - The contents come from Code Table D-1, the Manual on Codes, WMO No.306 Vol I.2, Part B and Part C
  4. http://codes.wmo.int/bufr4/codeflag/... - The contents come from BUFR4 Code and Flag tables, the Manual on Codes, WMO No.306 Vol I.2

However, the documents in (1) and (2) above are frozen and should no longer be referenced. The entities in (3) and (4) are also expected to be updated and expanded to better support the use of meteorological information in air navigation.

It is therefore proposed to fork the code lists under 306, 49-2, common and bufr4/codeflag, put the copies under iwxxm, and have the latest version of IWXXM accept both sets until the code lists under 306, 49-2, common and bufr4/codeflag can be deprecated.

Requestor

BL Choy, @blchoy

blchoy avatar May 14 '25 07:05 blchoy

To facilitate discussion, I have copied the code lists under 306, 49-2 and common into the FT2025-2RC2 branch of my fork (see https://github.com/blchoy/iwxxm-codelists/tree/FT2025-2RC2/) and modified them for inclusion under iwxxm. The generated TTL and RDF files can also be found in the same branch.

Subject to the views of the team, I shall update the schematron rules to accept an entity from either the existing registers (306, 49-2 and common) or iwxxm. An announcement on the transition and the potential deprecation of 306, 49-2 and common should also be included.
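The "either namespace" check described above can be sketched outside schematron as a simple prefix test. This is only an illustration: the function names are hypothetical, and the exact legacy prefixes are assumptions based on the URIs listed in the issue description.

```python
# Hypothetical helper (not the project's schematron rule): accept a code URI
# from either the legacy registers or the consolidated iwxxm register.
# The prefixes below are assumptions derived from the URIs in the issue text.
LEGACY_PREFIXES = (
    "http://codes.wmo.int/306/4678",
    "http://codes.wmo.int/49-2/",
    "http://codes.wmo.int/common/",
)
IWXXM_PREFIX = "http://codes.wmo.int/iwxxm/"


def classify_code_uri(uri: str) -> str:
    """Return 'iwxxm', 'legacy' or 'unknown' for a code list entry URI."""
    if uri.startswith(IWXXM_PREFIX):
        return "iwxxm"
    if uri.startswith(LEGACY_PREFIXES):  # str.startswith accepts a tuple
        return "legacy"
    return "unknown"


def is_accepted(uri: str) -> bool:
    """During the transition, both legacy and iwxxm URIs are accepted."""
    return classify_code_uri(uri) in {"iwxxm", "legacy"}
```

Once the legacy registers are deprecated, a check of this kind would only need `LEGACY_PREFIXES` emptied to start rejecting the old URIs.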

blchoy avatar May 14 '25 07:05 blchoy

https://github.com/wmo-im/iwxxm/wiki/TT-AvData-Discussion-2025-May-14 notes: iwxxm-codelists repo, Choy updated the codelists and revised the scripts; team is invited to review the generated RDF files; to be added with FT2025-2 but not make them required for the next set of Amendments to Annex 3 (Amendment 83 in two years). Jan doesn't see the advantage of allowing iwxxm and old namespaces in the 2025-2 release; Choy mentioned that there are no requirements from Annex 3 related to codes and we need to consider the package versions (see compatibility table);

Choy will work on the development of the iwxxm codes and the team can consider the potential plan to release codelists under the iwxxm namespace/register, but not require it for the 2025-1 release

amilan17 avatar May 14 '25 12:05 amilan17

I have updated the iwxxm-codelists 2025-2RC2 branch with a revised script that generates all TTL and RDF files under 306, 49-2, common and bufr4/codeflag, and also consolidates them into iwxxm and generates the resulting files. There are several things worth considering:

  1. In the CSV files, and hence the TTL files, which columns should be included? Downloads from the WMO Codes Registry carry associated metadata which seems to be used internally and should not be uploaded back to the registry when updating.

  2. I have added the applicable IWXXM versions to each code list and to each individual entity, and the schematron rules now also check whether the code list entry used in an IWXXM report is applicable to the IWXXM version of that report. Having said that, the use of owl:versionInfo and the way it is represented in the RDF files need confirmation.

  3. When consolidating bufr4/codeflag into iwxxm, I noticed that we may need to rename the Code and Flag tables from the X-XX-XXX form to more descriptive names. Currently I am using the following mapping:

    • http://codes.wmo.int/bufr4/codeflag/0-11-030 -> http://codes.wmo.int/iwxxm/ExtendedDegreeOfTurbulence
    • http://codes.wmo.int/bufr4/codeflag/0-20-008 -> http://codes.wmo.int/iwxxm/CloudDistributionForAviation
    • http://codes.wmo.int/bufr4/codeflag/0-20-012 -> http://codes.wmo.int/iwxxm/CloudType
    • http://codes.wmo.int/bufr4/codeflag/0-20-041 -> http://codes.wmo.int/iwxxm/AirframeIcing
    • http://codes.wmo.int/bufr4/codeflag/0-22-061 -> http://codes.wmo.int/iwxxm/StateOfTheSea
  4. Having said that, I am still using numbers to represent individual entities, as it is difficult to find suitable abbreviations to replace them. The same goes for removing the last entry of each codelist: these entries all represent missing values, which IWXXM now expresses with nilReason. More discussion is needed on this.

Based on (4) above, I suspect the publication of 2025-2 may not be the right time to do the migration. It could instead enable the use of the new consolidated iwxxm codelists, giving people more time to make the transition. More discussion on this is anticipated.
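The table renaming in (3) amounts to a lookup from the BUFR4 table number to the new iwxxm name, with the entry number left unchanged as described in (4). A minimal sketch, assuming entry URIs take the form `<table URI>/<number>`; the function name is hypothetical:

```python
# Mapping taken from the list in (3) above.
BUFR4_TO_IWXXM = {
    "0-11-030": "ExtendedDegreeOfTurbulence",
    "0-20-008": "CloudDistributionForAviation",
    "0-20-012": "CloudType",
    "0-20-041": "AirframeIcing",
    "0-22-061": "StateOfTheSea",
}
BUFR4_BASE = "http://codes.wmo.int/bufr4/codeflag/"
IWXXM_BASE = "http://codes.wmo.int/iwxxm/"


def to_iwxxm_uri(bufr4_uri: str) -> str:
    """Translate a bufr4/codeflag table or entry URI to its iwxxm equivalent.

    Assumption: an entry URI is '<table URI>/<number>'; the number is kept.
    """
    if not bufr4_uri.startswith(BUFR4_BASE):
        raise ValueError(f"not a bufr4/codeflag URI: {bufr4_uri}")
    table, _, entry = bufr4_uri[len(BUFR4_BASE):].partition("/")
    if table not in BUFR4_TO_IWXXM:
        raise KeyError(f"no iwxxm mapping for table {table}")
    new_uri = IWXXM_BASE + BUFR4_TO_IWXXM[table]
    return f"{new_uri}/{entry}" if entry else new_uri
```

Raising on an unmapped table, rather than passing the URI through, keeps a table that has not yet been given a descriptive name from slipping into the consolidated register unnoticed.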

blchoy avatar Jun 04 '25 09:06 blchoy

https://github.com/wmo-im/iwxxm/wiki/TT-AvData-Discussion-2025-Jun-4 notes: some of the issues to work through

  1. the CSV files have different numbers of columns
  2. rename bufr code tables from numbers to names

@tt-avdata please review the CSVs

The iwxxm folder is a consolidation of the 306, bufr codeflags, 49-2, and common tables which are currently in use and it includes a couple new values

The team decided to defer the publication of the consolidated current code lists to a later time rather than pushing forward with FT2025-2 in order to have more time to work through the issues

@amilan17 review CSV columns

amilan17 avatar Jun 04 '25 12:06 amilan17

To help @amilan17 and all to figure out what should be included as columns in the CSV files, the following is a summary of what the columns are currently:

  1. For the container CSV (those file names ending with _container.csv), the columns include:

    | Column name in GitHub CSV file | Column name in WMO Codes Registry exported CSV file | Element name used in exported TTL and/or RDF/XML files | Exported in TTL | Exported in RDF/XML |
    |---|---|---|---|---|
    | id | @id | reg:Register/@rdf:about (RDF/XML only) | Y | Y |
    | notation | @notation | skos:notation | Y | Y |
    | status | @status | reg:status | Y | N |
    | description | dct:description | dct:description | Y | Y |
    | label | rdfs:label | rdfs:label | Y | Y |
    | altLabel | skos:altLabel | skos:altLabel | Y | Y |
    | modified | dct:modified | dct:modified | Y | Y |
    | source | dc:source | dc:source | Y | N |
    | seeAlso | rdfs:seeAlso | rdfs:seeAlso | Y | N |
    | publisher | dct:publisher | dct:publisher | Y | N |
    | manager | reg:manager | reg:manager | Y | N |
    | owner | reg:owner | reg:owner | Y | N |
    | note | skos:note | skos:note | Y | Y |
    | related | - | - | N | N |
    | iwxxmVersion | - | owl:versionInfo | Y | Y |
    | - | ldp:hasMemberRelation | - | N | N |
    | - | ldp:membershipPredicate | - | N | N |
    | - | owl:versionInfo | - | N | N |
    | - | rdf:type | - | N | N |
    | - | reg:containedItemClass | - | N | N |
    | - | reg:subregister | - | N | N |

  2. For the entity CSV (those file names ending with _entity.csv), the columns include:

    | Column name in GitHub CSV file | Column name in WMO Codes Registry exported CSV file | Element name used in exported TTL and/or RDF/XML files | Exported in TTL | Exported in RDF/XML |
    |---|---|---|---|---|
    | id | @id | skos:Concept/@rdf:about (RDF/XML only) | Y | Y |
    | notation | @notation | skos:notation | Y | Y |
    | status | @status | reg:status | Y | N |
    | description | dct:description | dct:description | Y | Y |
    | label | rdfs:label | rdfs:label | Y | Y |
    | altLabel | skos:altLabel | skos:altLabel | Y | Y |
    | source | dc:source | dc:source | Y | N |
    | seeAlso | rdfs:seeAlso | rdfs:seeAlso | Y | N |
    | related | - | - | N | N |
    | note | skos:note | skos:note | Y | Y |
    | iwxxmVersion | - | - | Y | Y |
    | - | rdf:type | - | N | N |
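To make the entity column mapping concrete, here is a minimal sketch of turning one row of an `_entity.csv` file into a Turtle fragment. It is not the repository's generator script: the function names are hypothetical, the `@en` language tags are an assumption, only literal-valued columns are covered, and the `id` column is assumed to hold the full URI. Prefix declarations are omitted, so the output is a fragment rather than a complete TTL file.

```python
import csv
import io
from typing import Dict, Optional

# Column -> (predicate, language tag or None). Predicates follow the entity
# table above; the "en" tags are an assumption made for illustration.
LITERAL_COLUMNS = {
    "notation": ("skos:notation", None),
    "label": ("rdfs:label", "en"),
    "description": ("dct:description", "en"),
    "altLabel": ("skos:altLabel", "en"),
    "note": ("skos:note", "en"),
}


def turtle_literal(value: str, lang: Optional[str]) -> str:
    """Quote a string as a Turtle literal, escaping \\, \" and newlines."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def entity_row_to_turtle(row: Dict[str, str]) -> str:
    """Render one entity CSV row as a Turtle statement block.

    rdf:type is fixed to skos:Concept rather than read from the CSV.
    Empty or missing columns are skipped.
    """
    lines = [f"<{row['id']}> a skos:Concept"]
    for column, (predicate, lang) in LITERAL_COLUMNS.items():
        value = (row.get(column) or "").strip()
        if value:
            lines.append(f"    {predicate} {turtle_literal(value, lang)}")
    return " ;\n".join(lines) + " ."


# Usage with a made-up row (the values are placeholders, not registry content):
sample = (
    "id,notation,label,description\n"
    "http://codes.wmo.int/iwxxm/CloudType/1,1,Example label,\n"
)
row = next(csv.DictReader(io.StringIO(sample)))
print(entity_row_to_turtle(row))
```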

Care should be taken that:

a. The columns in the CSV files may or may not be those required to update the WMO Codes Registry.

b. The WMO Codes Registry seems to keep its own record version and expose it as owl:versionInfo. Using owl:versionInfo as well to indicate the IWXXM versions applicable to a codelist and its entries may interfere with the registry's operation. We need to confirm quickly whether this is the case and decide whether to use another element if necessary.

c. As discussed in the teleconference, the codelists are essentially flat. There may be references to other codelists and their entries to indicate where a codelist and its entries originated, but this is a one-off and there should be no need to keep updating these relationships.

blchoy avatar Jun 05 '25 06:06 blchoy

> b. The WMO Codes Registry seems to keep its own record version and expose it as owl:versionInfo. Using owl:versionInfo as well to indicate the IWXXM versions applicable to a codelist and its entries may interfere with the registry's operation. We need to confirm quickly whether this is the case and decide whether to use another element if necessary.

It's possible to include owl:versionInfo as an attribute of a code, but not of a registry.

amilan17 avatar Jun 11 '25 10:06 amilan17

I think we only need notation, label, description and version. We may want to add related for linking the old codes.

| Column name in GitHub CSV file | Column name in WMO Codes Registry exported CSV file | Element name used in exported TTL and/or RDF/XML files | Required for upload of TTL? | Comments |
|---|---|---|---|---|
| id | @id | skos:Concept/@rdf:about (RDF/XML only) | N | built from notation and URL |
| notation | @notation | skos:notation | Y | |
| status | @status | reg:status | N | I don't think it's necessary to have this as an attribute of the code because it is set in the application |
| description | dct:description | dct:description | Y | |
| label | rdfs:label | rdfs:label | Y | |
| altLabel | skos:altLabel | skos:altLabel | N | |
| source | dc:source | dc:source | N | |
| seeAlso | rdfs:seeAlso | rdfs:seeAlso | N | |
| related | - | - | | |
| note | skos:note | skos:note | N | N |
| iwxxmVersion | - | - | Y | not required, but useful information |
| - | rdf:type | - | Y | not in CSV columns, but codes are "concepts" and code lists are "Container, Collection, and Register" |
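A header check along these lines could flag CSV files that lack the columns proposed above as required. This is a sketch only: the function name is hypothetical, and treating `iwxxmVersion` as "recommended" rather than required is one reading of the comment in the table.

```python
import csv
import io

# Columns marked "Y" above without a "not required" caveat.
REQUIRED = ("notation", "label", "description")
# Assumption: iwxxmVersion is useful but optional, per the table comment.
RECOMMENDED = ("iwxxmVersion",)


def check_header(csv_text: str) -> dict:
    """Report which required and recommended columns a CSV header lacks."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = {name.strip() for name in header}
    return {
        "missing_required": [c for c in REQUIRED if c not in present],
        "missing_recommended": [c for c in RECOMMENDED if c not in present],
    }
```

Because the CSV files currently have different numbers of columns, running a check like this over every file would show how far each one is from the minimal set before any columns are dropped.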

amilan17 avatar Jun 11 '25 10:06 amilan17

Thank you @amilan17. My views are as follows:

  1. id

    | Column name in GitHub CSV file | Column name in WMO Codes Registry exported CSV file | Element name used in exported TTL and/or RDF/XML files | Required for upload of TTL? | Comments |
    |---|---|---|---|---|
    | id | @id | skos:Concept/@rdf:about (RDF/XML only) | N | built from notation and URL |

    I think we should keep this column because:

    • This URL + notation is exactly what is downloaded from the WMO Codes Registry
    • We have nowhere else to store the URL in the CSV file. Storing it as part of the CSV file name? I am afraid this is not practical, as a URL contains slashes
  2. status

    | Column name in GitHub CSV file | Column name in WMO Codes Registry exported CSV file | Element name used in exported TTL and/or RDF/XML files | Required for upload of TTL? | Comments |
    |---|---|---|---|---|
    | status | @status | reg:status | N | I don't think it's necessary to have this as an attribute of the code because it is set in the application |

    I think we should keep this column because:

    • If the CSV files are the official place to store the codelists (c.f. BUFR tables in the Manual on Codes), then we need to annotate them properly. Any change should be made at the source (i.e. the CSV files) and uploaded to the registry again, rather than made through the application. Only this kind of workflow can ensure consistency between the official record and the operational registry
    • Can the following TTL file set 'status' and the other properties on the WMO Codes Registry?
    <http://codes.wmo.int/49-2/AerodromePresentOrForecastWeather> a reg:Register, skos:Collection, ldp:Container ;
        rdfs:label "Code Table D-7: Aerodrome present or forecast weather"@en ;
        dct:description "blablabla."@en ;
        dct:modified "2014-09-03T09:52:40.633000+00:00"^^xsd:dateTime ;
        dct:publisher <http://codes.wmo.int/system/organization/wmo> ;
        reg:manager <http://codes.wmo.int/system/organization/www-dm> ;
        reg:owner <http://codes.wmo.int/system/organization/wmo> ;
        reg:status "stable" ;
        owl:versionInfo <http://icao.int/iwxxm/2025-2> ;
        skos:notation "AerodromePresentOrForecastWeather" .
    
  3. Other information

    | Column name in GitHub CSV file | Column name in WMO Codes Registry exported CSV file | Element name used in exported TTL and/or RDF/XML files | Required for upload of TTL? | Comments |
    |---|---|---|---|---|
    | altLabel | skos:altLabel | skos:altLabel | N | |
    | source | dc:source | dc:source | N | |
    | seeAlso | rdfs:seeAlso | rdfs:seeAlso | N | |
    | related | - | - | | |
    | note | skos:note | skos:note | N | N |

    I am not sure whether to keep these columns, as they come from different codelists. Removing any of them will change the content of existing codelists on the WMO Codes Registry. A pragmatic way forward is to keep them until we have the time and effort to tidy up the CSV files.

  4. type

    | Column name in GitHub CSV file | Column name in WMO Codes Registry exported CSV file | Element name used in exported TTL and/or RDF/XML files | Required for upload of TTL? | Comments |
    |---|---|---|---|---|
    | - | rdf:type | - | Y | not in CSV columns, but codes are "concepts" and code lists are "Container, Collection, and Register" |

    rdf:type is hard-coded into the script that generates the TTL and RDF files, so it is not needed in the CSV file unless there are situations that require it to be specified there.
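Two of the points above lend themselves to a small sketch: if the `id` column is kept, a script can at least verify that it agrees with `notation`, and the hard-coded `rdf:type` values can live in constants instead of the CSV. The function names are hypothetical; the container types are the ones shown in the TTL example under point 2, and `skos:Concept` for entities follows the entity table.

```python
# Types hard-coded by the generator instead of being read from the CSV.
CONTAINER_TYPES = ("reg:Register", "skos:Collection", "ldp:Container")
ENTITY_TYPES = ("skos:Concept",)


def id_matches_notation(item_id: str, notation: str) -> bool:
    """Check that the last path segment of the id equals the notation."""
    return item_id.rstrip("/").rsplit("/", 1)[-1] == notation


def subject_line(item_id: str, is_container: bool) -> str:
    """Build the opening 'subject a type, ...' part of a Turtle statement."""
    types = CONTAINER_TYPES if is_container else ENTITY_TYPES
    return f"<{item_id}> a {', '.join(types)}"
```

A consistency check of this kind lets the `id` column stay in the CSV, as argued under point 1, while catching rows where the URL and the notation have drifted apart.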

blchoy avatar Jun 11 '25 14:06 blchoy

https://github.com/wmo-im/iwxxm/wiki/TT-AvData-Discussion-2025-Jun-13 notes: @amilan17 look at the altLabel and source attributes: what are they for and do we need them?

@wmo-im/tt-avdata review PR https://github.com/wmo-im/iwxxm-codelists/pull/20

amilan17 avatar Jun 13 '25 13:06 amilan17

Implemented in IWXXM 2025-2.

blchoy avatar Nov 21 '25 16:11 blchoy