
xsd:language to enumeration RFC 3066

Open skinkie opened this issue 2 years ago • 3 comments

This is probably a hygiene thing (low priority), but on the other hand it also seems quite trivial. Ideally, xsd:language would be an enumeration of RFC 3066 language tags. The devil is in the details, though, with IANA registrations and unregistered languages. I would like to be able to specify at generation time that the value is actually limited to RFC 3066 tags, rather than typed as an overly broad str.

https://github.com/tefra/xsdata/blob/master/xsdata/models/enums.py#L127

skinkie avatar Jun 16 '23 09:06 skinkie

The official XML Schema defines xs:language as shown below. If we were to generate an enumeration, I would have to ship a custom version of the schema in order to explicitly specify all the RFC 3066 enumeration members.

  <xs:simpleType name="language" id="language">
    <xs:annotation>
      <xs:documentation
        source="http://www.w3.org/TR/xmlschema-2/#language"/>
    </xs:annotation>
    <xs:restriction base="xs:token">
      <xs:pattern
        value="[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*"
                id="language.pattern">
        <xs:annotation>
          <xs:documentation
                source="http://www.ietf.org/rfc/rfc3066.txt">
            pattern specifies the content of section 2.12 of XML 1.0e2
            and RFC 3066 (Revised version of RFC 1766).
          </xs:documentation>
        </xs:annotation>
      </xs:pattern>
    </xs:restriction>
  </xs:simpleType>
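
For illustration, here is a minimal Python sketch of what that pattern facet accepts and rejects. The helper name `is_xsd_language` is hypothetical and is not part of xsdata. The check is purely syntactic: it does not consult the IANA registry, and it does not apply the whitespace collapsing that `xs:token` implies.

```python
import re

# Pattern copied verbatim from the xs:language definition above.
XSD_LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def is_xsd_language(value: str) -> bool:
    """Return True if value matches the xs:language pattern facet.

    Hypothetical helper for illustration only. XSD patterns are
    implicitly anchored, so fullmatch is used instead of match.
    """
    return XSD_LANGUAGE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["en", "en-US", "zh-Hant-TW", "en_US", "", "abcdefghi"]:
        print(repr(candidate), is_xsd_language(candidate))
```

Note that a string such as `xx-yy` passes this check even though it is not a registered tag, which is the gap between the pattern and a true enumeration.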

tefra avatar Jun 25 '23 06:06 tefra

@tefra would you then prefer to plug in a restriction/enumeration?

skinkie avatar Jun 25 '23 11:06 skinkie

Is there any official enumeration anywhere?

If we do this, I will most likely add a custom schema to auto-generate the enumeration, but then do we go with RFC 3066, 4646, or 5646?
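
As a rough sketch of what auto-generating such a custom schema type could look like, the snippet below builds an `xs:simpleType` that restricts `xs:language` to a fixed list of tags. Everything here is an assumption for illustration: the function name `build_language_enum`, the type name `LanguageTag`, and the four sample tags are hypothetical. A real version would need a complete tag list from whichever RFC or registry is chosen, and this is not how xsdata does anything today.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def build_language_enum(tags, type_name="LanguageTag"):
    """Build an xs:simpleType restricting xs:language to the given tags.

    Hypothetical helper: type_name and the tag source are assumptions.
    """
    simple_type = ET.Element(f"{{{XS}}}simpleType", {"name": type_name})
    restriction = ET.SubElement(
        simple_type, f"{{{XS}}}restriction", {"base": "xs:language"}
    )
    # Deduplicate and sort so the generated schema is stable between runs.
    for tag in sorted(set(tags)):
        ET.SubElement(restriction, f"{{{XS}}}enumeration", {"value": tag})
    return simple_type


# Sample tags only, NOT a complete or official list.
SAMPLE_TAGS = ["en", "en-US", "nl", "de"]

element = build_language_enum(SAMPLE_TAGS)
xml_text = ET.tostring(element, encoding="unicode")

if __name__ == "__main__":
    print(xml_text)
```

The open question above still applies: the output is only as authoritative as the tag list fed into it.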

tefra avatar Jul 16 '23 16:07 tefra

The official schema doesn't provide the enumeration, so I don't see why xsdata should.

tefra avatar Mar 09 '24 18:03 tefra