ODD validation
work in progress: this still needs to consider integration into GH Actions
fixes #789 (as a side effect), fixes #790
If it's helpful for reference, we did have schematron validation available previously. There were a number of XSLTs that would extract the rules from the compiled RNG and then run the validation. It was really clunky, but it seemed to work.
Here's a link to the shell script setup:
https://github.com/music-encoding/music-encoding/blob/bdfb695be193ca8619949817db18b6a0a2c0cf3b/build.sh#L37-L97
The XSLTs were copied from teh intarnet:
https://github.com/music-encoding/music-encoding/tree/bdfb695be193ca8619949817db18b6a0a2c0cf3b/utils/schematron
It would be easy to run the shell script from GH Actions. The question is whether it needs to be updated for any reason (e.g. to include the changes made by @kepper).
At ODD Friday, we agreed that we would pull in the schema right now (after the meeting) and try to set up a GH Action independently of that. The GH Action could fail when the ODD is not valid.
Ok, I will set up a test repo to see if we can get the build.sh script by @ahankinson running there from an action.
@ahankinson Could you please indicate which input files (rngfile, meifile) would be best for the test?
Maybe I misunderstood the question: The MEI schematron rules are extracted from the compiled RNG schema. So the process is:
- Take the ODD, make an RNG
- Take the RNG, extract the schematron
- Do a bunch of hocus pocusy XML shenanigans to run the schematron rules through a bunch of stylesheets and satellites and underwater cables and who knows what else to eventually arrive at an XSLT that can generate 👍 or 👎 based on the rules
- Apply that XSLT to an input MEI file, which will validate the MEI as following the schematron rules.
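For illustration only, the second step above (pulling the embedded Schematron out of the compiled RNG) can be sketched with the Python standard library. This is not what build.sh does, which relies on XSLTs; the sample grammar and its rule are made up for the example, and compiling the result into a validating XSLT (steps three and four) is not covered here.

```python
import xml.etree.ElementTree as ET

SCH_NS = "http://purl.oclc.org/dsdl/schematron"

# Invented miniature RNG with one embedded Schematron pattern (not real MEI).
SAMPLE_RNG = """<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:ns prefix="mei" uri="http://www.music-encoding.org/ns/mei"/>
  <define name="example">
    <element name="note">
      <sch:pattern>
        <sch:rule context="mei:note">
          <sch:assert test="@pname">A note needs a pitch name.</sch:assert>
        </sch:rule>
      </sch:pattern>
      <empty/>
    </element>
  </define>
</grammar>"""


def extract_schematron(rng_text: str) -> ET.Element:
    """Collect all embedded sch:ns and sch:pattern elements into one sch:schema."""
    rng_root = ET.fromstring(rng_text)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    # Namespace declarations first, then the patterns, in document order.
    for tag in ("ns", "pattern"):
        for node in rng_root.iter(f"{{{SCH_NS}}}{tag}"):
            schema.append(node)
    return schema


schema = extract_schematron(SAMPLE_RNG)
print(ET.tostring(schema, encoding="unicode"))
```

The resulting `sch:schema` element would still have to be run through a Schematron implementation before it can validate anything.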
But maybe you're doing something different here? Are you doing the same thing, but for the TEI rules for validating the schematron rules in the MEI ODD?
Hi @ahankinson, if I got you right, the job here is actually slightly different. This is not about extracting the Schematron rules that are part of MEI, but about checking that the ODD used to write MEI itself conforms to some expectations. This is not a Schematron a user of MEI would ever see or have to use. So the idea would be to have a flag that says whether a change to the sources conforms to our expectations, or whether it introduces problems. For instance, this checks whether the guidelines have a link to a non-existing element (because it's misspelled or some such). Does this help?
Yeah, that's what I thought. I expect, though, that you would need to run through the same process with the Schematron rules you're writing: run the Schematron rules through an XSLT to generate an XSLT that is then used to verify the ODD that contains the schematron rules that you want to verify by extracting them into an XSLT to generate an XSLT that is then used to verify the MEI XML that someone would write.
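To make the kind of consistency check described here concrete, below is a minimal Python sketch, not the Schematron from this PR. It assumes that elements are declared with `elementSpec/@ident` and mentioned in prose with `gi`, as is usual in TEI ODD; the sample document and the misspelled name `rset` are invented.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented miniature ODD: two declared elements, one misspelled mention.
SAMPLE_ODD = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <elementSpec ident="note"/>
  <elementSpec ident="rest"/>
  <p>Use <gi>note</gi> or <gi>rset</gi> inside a layer.</p>
</TEI>"""


def find_dangling_gi(odd_text: str) -> list[str]:
    """Return element names mentioned via <gi> that no <elementSpec> declares."""
    root = ET.fromstring(odd_text)
    declared = {spec.get("ident") for spec in root.iter(f"{TEI}elementSpec")}
    mentioned = {(gi.text or "").strip() for gi in root.iter(f"{TEI}gi")}
    return sorted(name for name in mentioned if name and name not in declared)


print(find_dangling_gi(SAMPLE_ODD))  # ['rset']
```

A non-empty result would be the signal for the "flag" mentioned above.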
Confusing? :-D
What about using the TEI schemata for a first implementation, e.g.:
- https://tei-c.org/Vault/P5/4.2.2/xml/tei/custom/schema/relaxng/tei_odds.rng
- https://tei-c.org/Vault/P5/4.2.2/xml/tei/custom/schema/relaxng/tei_customization.rng
Wouldn't that be used to validate the schematron rules in a TEI file? Not validate the rules in the ODD file? I'm seeing rules like this in the tei_odds.rng:
<pattern xmlns="http://purl.oclc.org/dsdl/schematron"
         id="tei_odds-att.datable.w3c-att-datable-w3c-from-constraint-rule-2">
  <sch:rule xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:rng="http://relaxng.org/ns/structure/1.0"
            xmlns:xi="http://www.w3.org/2001/XInclude"
            xmlns="http://www.tei-c.org/ns/1.0"
            context="tei:*[@from]">
    <sch:report test="@notBefore" role="nonfatal">The @from and @notBefore attributes cannot be used together.</sch:report>
  </sch:rule>
</pattern>
That looks like it's used to validate a TEI file.
2021-06-24 ODD Thursday:
The PR and its discussion are confusing to some extent, especially as there is no summary description of the changes. If we understand correctly, the general idea is to have validation and tests run on schemata changed by PRs. This is generally a very good idea in order to avoid simple errors like the one resulting in #789! The roadmap decided in #790 is:
- introduce schematron-based validation for MEI source files (#790)
- introduce validation of the ODD files against the TEI schemata. The question arising here is which version of TEI are we talking about?
- re-enable running schema-tests against sample-files
- implement all of this in a GitHub Action that builds, validates, and tests. This requires building every pull request with the build part of the deploy.yml action; the build artefacts can be attached to the action run, and the action should post validation results to the PR as a comment. All of this should probably be designed as separate jobs, although the build is a prerequisite for all of them.
N.B. As there is a general requirement to run all of this locally as well, we should probably move to a Docker-based build scenario.
Concerning item 2 of my post above:
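As a rough sketch of the reporting part only: the results of the separate jobs could be collapsed into one comment body and one exit status, so that the run fails when any job reports problems. The function name, job names, and message below are hypothetical, and actually posting the comment to the PR is left out.

```python
def summarise(results: dict[str, list[str]]) -> tuple[str, int]:
    """Turn {job name: [problem messages]} into (comment text, exit code)."""
    lines = []
    for job, problems in results.items():
        status = "passed" if not problems else f"failed ({len(problems)} problem(s))"
        lines.append(f"- {job}: {status}")
        lines.extend(f"  - {msg}" for msg in problems)
    # Any reported problem makes the whole run fail.
    exit_code = 1 if any(results.values()) else 0
    return "\n".join(lines), exit_code


# Hypothetical job names and message, for illustration only.
body, code = summarise({
    "build": [],
    "odd-validation": ["reference to undeclared element 'rset'"],
})
print(body)
print("exit code:", code)
```

Returning the exit code instead of calling `sys.exit` directly keeps the function usable both locally and inside an action step.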
- introduce validation of the ODD files against the TEI schemata. The question arising here is which version of TEI are we talking about?
The music-encoding repo holds its own version of the TEI schemata, namely source/validation/mei_odds.rng, which dates back to TEI v3.2.0 from 2017. While we're at it, shouldn't we upgrade the MEI source files to the latest version of TEI, v4.2.2?
Yes, at some point, we should. That seems to be a rather significant change, though, that requires proper testing. So I'm inclined to schedule that for a later, dedicated meeting, like a dev workshop ;-)
OK, we should try pure ODD at the same time then.
@kepper What is the status of this?
Running this locally results in 57 consistency problems in the specs and guidelines. I would like to have this merged before starting to fix those errors.
Thanks for this. Now a question to our experts @bwbohl, @musicEnfanthen, and maybe @riedde: What would be necessary to have this as an automatic check when committing, and then automatically adding a flag to the repo?
The validation of the ODD against an RNG schema can, e.g., be done by a Jing task. This task could also be integrated into the Ant build file, if needed. Concerning flags: I'm not familiar with that topic.
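A minimal sketch of calling Jing from a script, assuming Java and a local `jing.jar` are available. The schema path is the one mentioned earlier in this thread; the ODD file name in the usage comment is a placeholder. A non-zero exit status from Jing is treated as a failed validation.

```python
import subprocess


def jing_command(schema_path: str, xml_path: str, jing_jar: str = "jing.jar") -> list[str]:
    """Build the Jing command line: java -jar jing.jar <schema.rng> <file.xml>."""
    return ["java", "-jar", jing_jar, schema_path, xml_path]


def validate(schema_path: str, xml_path: str, jing_jar: str = "jing.jar") -> tuple[bool, str]:
    """Run Jing and return (is_valid, combined output)."""
    result = subprocess.run(
        jing_command(schema_path, xml_path, jing_jar),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr


# Usage (placeholder ODD file name):
#   ok, report = validate("source/validation/mei_odds.rng", "mei-source.xml")
#   print(report)
```

The same command could be wrapped in an Ant target instead; the script form is only meant to show how little is needed.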
Ideally, the task would check on any incoming PRs, and then flag the PR if it fails.