IATI-Schemas
Why aren't codelists specified as XMLSchema enumeration restrictions?
I searched the repo, but haven't seen this discussed before.
I see there's a mapping file between fields and codelists, but – for complete codelists at least – wouldn't it be possible to restrict the possible values using plain XMLSchema?
For example, the EU has a currency codelist file t_currency_publicProcurement.xsd that starts as:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:cur="http://publications.europa.eu/resource/authority/currency" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://publications.europa.eu/resource/authority/currency" elementFormDefault="qualified" attributeFormDefault="unqualified" version="20171020-0">
  <xs:simpleType name="t_currency_tedschema_ted">
    <xs:restriction base="xs:string">
      <xs:enumeration value="AED">
        <xs:annotation>
          <xs:documentation>UAE dirham UAE dirhams / adm.status[current]</xs:documentation>
        </xs:annotation>
      </xs:enumeration>
      <xs:enumeration value="AFN">
        <xs:annotation>
          <xs:documentation>Afghani / adm.status[current]</xs:documentation>
        </xs:annotation>
      </xs:enumeration>
This is then referred to by the type attribute of xsd:attribute elements.
The XMLSchema codelist files can be auto-generated from the current IATI codelist files (to avoid breaking changes).
This would allow XMLSchema validators to work out of the box, without having to implement custom logic to understand how to validate fields that use codelists.
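To make the auto-generation idea concrete, here is a rough Python sketch using only the standard library. The codelist layout in `SAMPLE_CODELIST` (`codelist-item` / `code` / `name/narrative`) is my assumption about what an IATI codelist file looks like, and `codelist_to_xsd` and `CurrencyCodeType` are made-up names, so treat it as an illustration rather than a working converter:

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Assumed shape of an IATI codelist file (trimmed to two items).
SAMPLE_CODELIST = """\
<codelist name="Currency">
  <codelist-items>
    <codelist-item>
      <code>AED</code>
      <name><narrative>UAE Dirham</narrative></name>
    </codelist-item>
    <codelist-item>
      <code>AFN</code>
      <name><narrative>Afghani</narrative></name>
    </codelist-item>
  </codelist-items>
</codelist>
"""


def codelist_to_xsd(codelist_xml: str, type_name: str) -> str:
    """Return an XSD document that defines one enumerated simpleType."""
    # Serialize the XML Schema namespace with the conventional "xs" prefix,
    # so that base="xs:string" below resolves correctly.
    ET.register_namespace("xs", XS)
    codelist = ET.fromstring(codelist_xml)

    schema = ET.Element(f"{{{XS}}}schema")
    simple_type = ET.SubElement(schema, f"{{{XS}}}simpleType", name=type_name)
    restriction = ET.SubElement(
        simple_type, f"{{{XS}}}restriction", base="xs:string"
    )

    for item in codelist.iter("codelist-item"):
        code = item.findtext("code")
        if not code:
            continue  # skip items without a code
        enum = ET.SubElement(
            restriction, f"{{{XS}}}enumeration", value=code.strip()
        )
        label = item.findtext("name/narrative")
        if label:
            # Carry the human-readable name over as documentation.
            annotation = ET.SubElement(enum, f"{{{XS}}}annotation")
            doc = ET.SubElement(annotation, f"{{{XS}}}documentation")
            doc.text = label.strip()

    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
        ET.indent(schema)
    return ET.tostring(schema, encoding="unicode")


xsd_text = codelist_to_xsd(SAMPLE_CODELIST, "CurrencyCodeType")
print(xsd_text)
```

Incomplete codelists would need different handling (for example, leaving the type as a plain `xs:string`), which this sketch does not attempt.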
Seems worth tagging @davidmegginson and @bjwebb here, since I guess this was an early design decision.
Thinking back 9 years, I'm pretty sure it was because schemas and vocabularies are maintained separately. The schema can change without a change in the vocabularies, and vice-versa, so we didn't want to hard-code the vocabularies into the schemas. Also, initially many (most?) of the vocabularies were maintained by the OECD DAC people, not IATI, and a good number still have outside maintainers (for example, OCHA controls some of the humanitarian vocabularies).
Using XSD was a hard decision for other reasons. RelaxNG is a much more flexible schema language, which would have allowed the elements under iati-activity to appear in any order. It seemed stupid to force a spurious validation error on an XML dataset just because someone put sector and transaction in the wrong (arbitrarily chosen) order. But we knew that XSD had much wider tool support, especially in commercial products, and didn't want to throw up a different barrier for users.
Makes sense!
In terms of versioning schemas and codelists separately: the vocabularies can be xsd:import'ed into the schema (as they are in the EU example). To avoid pinning a given schema version to a specific version of a vocabulary, the vocabulary URL can redirect to the latest version allowed for that schema.
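As a sketch of what that wiring might look like, the snippet below builds a tiny host schema that imports a codelist schema by URL and references its type from an attribute. The namespace, the URL, and the `CurrencyCodeType` name are all hypothetical placeholders; the idea is only that the URL is stable per schema version and the server behind it decides which codelist version to serve:

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical values: neither this namespace nor this URL exists.
CODELIST_NS = "https://example.org/iati/codelists/currency"
# Stable per-schema URL; the server would redirect it to the latest
# codelist version that this schema version allows.
CODELIST_URL = "https://example.org/iati/schema/v2/codelists/currency.xsd"


def host_schema(codelist_ns: str, codelist_url: str, type_name: str) -> str:
    """Return a minimal schema that imports the codelist and uses its type."""
    return f"""\
<xs:schema xmlns:xs="{XS}" xmlns:cur="{codelist_ns}">
  <xs:import namespace="{codelist_ns}" schemaLocation="{codelist_url}"/>
  <xs:attribute name="currency" type="cur:{type_name}"/>
</xs:schema>
"""


schema_text = host_schema(CODELIST_NS, CODELIST_URL, "CurrencyCodeType")
ET.fromstring(schema_text)  # raises ParseError if the XML is not well-formed
print(schema_text)
```

Note that `schemaLocation` on `xs:import` is only a hint to the validator, so whether a given tool follows the redirect (or fetches remote schemas at all) would need checking.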
Anyhow, I don't know if this is something worth considering for future versions of IATI, or if the status quo is good enough and this is only an infrequent pain point.