Dataref using @name is not displayed in HTML
Compare the HTML displayed for the attributes @ident and @usage on element <language> (http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-language.html). The datatype of the former contains a <dataRef> which specifies a TEI datatype. The latter however contains one that specifies an XSD datatype, using @name. The former gets a helpful link with the name of the TEI datatype. The latter gets NOTHING. This is not helpful.
Although this is definitely a bug that should be fixed, further investigation shows that we could circumvent it in the GL without much effort, and at the same time remove some needless complexity. There are seven cases in P5 where an attribute's datatype is supplied by a dataRef/@name rather than by a dataRef/@key, as follows:
- att.global.xml:id is defined as @name=ID
- language@usage is defined as @name=nonNegativeInteger
- att.metrical.met, .real, and .rhyme are all defined as @name=token
- dataFacet@value is defined as @name=string
- att.deprecated.validUntil is defined as @name=date
Taking these in turn...
- There is no TEI datatype equivalent to ID. We could add one.
- TEI datatype teidata.count is defined as xsd:nonNegativeInteger (without restriction) so there seems no reason to use @name="nonNegativeInteger".
- TEI datatype teidata.pattern is defined as xsd:token (without restriction) so there seems no reason to use @name="token" in the absence of a restriction.
- TEI datatypes teidata.key and teidata.text are both defined as xsd:string (without restriction), which suggests that we need to choose between them
- TEI datatypes teidata.temporal.iso and teidata.temporal.w3c are both defined as xsd:date amongst other things. We need to decide whether we really want to make @validUntil use only a subset of the values permitted for e.g. @when.
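The distinction drawn above between dataRef/@key and dataRef/@name can be checked mechanically. The following is only a minimal sketch in Python: the ODD fragment is illustrative (not copied from the P5 source), and the helper name `classify_datarefs` is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
TEI_NS = {"tei": TEI}

# Illustrative fragment only; an assumption, not copied from the P5 source.
ODD_FRAGMENT = """
<attList xmlns="http://www.tei-c.org/ns/1.0">
  <attDef ident="ident">
    <datatype><dataRef key="teidata.language"/></datatype>
  </attDef>
  <attDef ident="usage">
    <datatype><dataRef name="nonNegativeInteger"/></datatype>
  </attDef>
</attList>
"""


def classify_datarefs(odd_xml):
    """Map each attDef/@ident to ("key" or "name", value) of its dataRef."""
    root = ET.fromstring(odd_xml)
    result = {}
    for att_def in root.iter(f"{{{TEI}}}attDef"):
        data_ref = att_def.find("tei:datatype/tei:dataRef", TEI_NS)
        if data_ref is None:
            continue  # no datatype declared for this attribute
        for attr in ("key", "name"):
            if attr in data_ref.attrib:
                result[att_def.get("ident")] = (attr, data_ref.get(attr))
                break
    return result


if __name__ == "__main__":
    for ident, (how, value) in classify_datarefs(ODD_FRAGMENT).items():
        print(f"@{ident}: dataRef/@{how} = {value}")
```

Run over the real specification files, a check of this kind would list the attributes that use @name, which is how a count like the seven cases above could be verified.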
I like this approach; it's more Pure ODD-like. The next question then would be whether @name is needed at all, or whether we should force the use of TEI datatypes. That would require that we provide a TEI equivalent to all XSD datatypes, presumably; are we missing any at this point?
I've added what I think is a fix for this in commit 51bf917 and the preceding one.
@lb42, can you confirm that what you're seeing now is what you think we should see?
http://teijenkins.hcmc.uvic.ca/job/TEIP5-dev/lastSuccessfulBuild/artifact/P5/release/doc/tei-p5-doc/en/html/ref-att.global.html
http://teijenkins.hcmc.uvic.ca/job/TEIP5-dev/lastSuccessfulBuild/artifact/P5/release/doc/tei-p5-doc/en/html/ref-language.html
http://teijenkins.hcmc.uvic.ca/job/TEIP5-dev/lastSuccessfulBuild/artifact/P5/release/doc/tei-p5-doc/en/html/ref-att.metrical.html
If so, I think the next stage on this ticket would be to PURE-ify them all as you suggest above.
I wonder if it wouldn't be more helpful to put an XSD: in front of the name? But if we purify them all, it hardly matters. I suggest Council debates whether or not it wants to keep at least one of these declarations (ID seems the obvious candidate) using the @name option, if only to ensure that that continues to work properly. The others could all be changed as suggested above. But we don't want to shut the door completely to people using XSD declarations: indeed, if they need <dataFacet> we say they are required to do so.
In the PDF, I link them to the XML schema datatypes; we could do that in the Guidelines output too. But don't we also have the issue of not being entirely sure which version of the XS datatypes we mean?
I hadn't had a good look at dataFacet until you mentioned it above, but now I have, I wonder if there are any use-cases for it which couldn't be supported by regular expressions.
Can this just be closed?
@ebeshero and I confirm that whatever I thought I was fixing back then, it hasn't changed anything about the rendering of the spec page for the language element, so @lb42's original complaint is still valid.
Also the rendering of the att.deprecated tagdoc appropriately says the datatype is “date”, but it is a link to the <tei:date> element, not to something that explains the xsd:date datatype (or not a link, which is what I would have expected).
In part to play devil’s advocate, I am going to voice the argument against @lb42’s “more PureODD” approach, above.
Primarily the argument revolves around a principle of TEI P5: if there is already a standard that does what we want, we use it rather than re-inventing the wheel. W3C already has a reasonable, standardized, widely promulgated system that XSD, RELAX NG, and Schematron validators know how to use.
For the particular cases raised (quoted in combined form first, then addressed in following list):
- att.global.xml:id is defined as @name=ID: There is no TEI datatype equivalent to ID. We could add one.
- language@usage is defined as @name=nonNegativeInteger: teidata.count is defined as xsd:nonNegativeInteger (without restriction) so there seems no reason to use @name=nonNegativeInteger.
- att.metrical.met, .real, and .rhyme are all defined as @name=token: teidata.pattern is defined as xsd:token (without restriction) so there seems no reason to use @name=token in the absence of a restriction
- dataFacet@value is defined as @name=string: teidata.key and teidata.text are both defined as xsd:string (without restriction), which suggests that we need to choose between them
- att.deprecated.validUntil is defined as @name=date: teidata.temporal.iso and teidata.temporal.w3c are both defined as xsd:date amongst other things. We need to decide whether we really want to make @validUntil use only a subset of the values permitted for e.g. @when.
(Thoughts from previous post summarized above, my thoughts below.)
- Well, yes, but doing so would add an unnecessary (and perhaps unhelpful) indirection. TEI is, at least for now, XML.
- Agreed. This is, IMHO, a corrigible error that should just be fixed.
- But the value of att.metrical.(met|real|rhyme) is not a regular expression. Thus the use of teidata.pattern would be incorrect, and very confusing.
- Oooh. This is really interesting, and probably requires discussion, if not debate. How on earth (and, for that matter, why on earth) are we planning to represent the enumeration facet in a string?
- We absolutely want the @validUntil of att.deprecated to be an xsd:date. We do not have, and I daresay could not have, semantics for a @validUntil=--12-09. Whether to get there by using <dataRef name="date"/>, <dataRef name="xsd:date"/>, or by <dataRef key="teidata.temporal.w3c"/> and then use Schematron rules to say “no, you are not really allowed to use gYear, gMonth, gDay, gYearMonth, gMonthDay, time, or dateTime, we were just kidding” is an interesting question. (My instinct would be to vote for leaving it exactly as it is now, but I might be convinced otherwise.)
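To make the last point concrete: a value such as --12-09 is a gMonthDay, not a date, so a plain xsd:date check rejects it without any extra Schematron. The sketch below is Python, and the regular expression is a simplified approximation of the xsd:date lexical form (it does not check month or day ranges); the names `XSD_DATE_APPROX` and `looks_like_xsd_date` are hypothetical.

```python
import re

# Simplified approximation of the xsd:date lexical form (an assumption of this
# sketch): optional minus, year of four or more digits, month, day, and an
# optional timezone. Month and day ranges are NOT validated here.
XSD_DATE_APPROX = re.compile(r"-?\d{4,}-\d{2}-\d{2}(Z|[+-]\d{2}:\d{2})?")


def looks_like_xsd_date(value):
    """Return True if value has the overall shape of an xsd:date."""
    return XSD_DATE_APPROX.fullmatch(value) is not None


if __name__ == "__main__":
    # A full date, a gMonthDay, a gYear, and a dateTime
    for candidate in ("2025-12-09", "--12-09", "2025", "2025-12-09T10:00:00"):
        print(candidate, looks_like_xsd_date(candidate))
```

Only the first candidate has the shape of a full date. The others are the kinds of values that teidata.temporal.w3c also admits and that a deprecation deadline cannot sensibly use.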
P.S. Most of this discussion should probably be on the TEI repo, not here. Oh well. Just more evidence that @martindholmes is right, the repos should never have been split.
Just stumbled across this while looking at @validUntil and I'm concerned about the wrong links.
We have a few <dataRef> elements with a @name attribute (not @key!) pointing at XSD datatypes directly. When the name of this datatype matches a TEI identifier, a wrong link is created on the TEI spec page. See e.g. dataFacet/@value pointing to the <string> element, or att.deprecated/@validUntil pointing to the <date> element.
My proposal would be to simply suppress these links and only output the name of the datatype, possibly prefixed with "xsd"
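As a rough illustration of that proposal (in Python, not the XSLT of the actual Stylesheets), the rendering rule might look like the sketch below. The function name `render_dataref` is hypothetical, and the `ref-….html` link pattern is an assumption modelled on the spec-page URLs quoted earlier in this thread.

```python
from html import escape


def render_dataref(attrs):
    """Render a dataRef's datatype for a spec page.

    attrs is a dict of the dataRef's attributes. A @key (TEI datatype) gets a
    link to its spec page; a @name (XSD datatype) is printed as plain text with
    an "xsd:" prefix and never linked, so it cannot collide with a TEI element
    of the same name such as <date> or <string>.
    """
    if "key" in attrs:
        key = escape(attrs["key"])
        # Link pattern is an assumption based on URLs like ref-language.html
        return f'<a href="ref-{key}.html">{key}</a>'
    if "name" in attrs:
        return f'xsd:{escape(attrs["name"])}'
    raise ValueError("dataRef has neither @key nor @name")


if __name__ == "__main__":
    print(render_dataref({"key": "teidata.count"}))
    print(render_dataref({"name": "date"}))
```

The point of the design is that the @name branch never consults the list of TEI identifiers at all, which is what rules out the wrong links to <date> and <string>.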
On my search for existing issues concerning datatype/dataFacet I stumbled across this one and I’d like to second @sydb:
Primarily the argument revolves around a principle of TEI P5: if there is already a standard that does what we want, we use it rather than re-inventing the wheel. W3C already has a reasonable, standardized, widely promulgated system that XSD, RELAX NG, and Schematron validators know how to use.
in keeping the xs:datatype connection definite.
I’m working a lot with ODD to help me build other custom schemas and I like that easy way of referencing XML Schema datatypes without having to replicate them in my ODD. So +1 for not purifying them!