
build: add schematron validation target

Open musicEnfanthen opened this issue 1 year ago • 21 comments

This PR adds an Ant target to build.xml that triggers Schematron validation of the canonicalized source file. The target runs automatically after the canonicalized source is created. It uses the great Ant version of SchXslt developed by @dmj (https://github.com/schxslt/schxslt).

musicEnfanthen avatar Feb 04 '24 20:02 musicEnfanthen

Converting to draft for several reasons:

  • Schematron validation runs perfectly fine when triggered from inside the Docker image; running it locally, however, results in a Java error message, at least for me:
validate-source:
[schematron] Generating validation stylesheet for Schematron 'D:\Repositories\MusicEncoding\fork_music-encoding\source\validation\mei-source.sch'
[schematron] SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
[schematron] SLF4J: Defaulting to no-operation (NOP) logger implementation
[schematron] SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
  • should still be tested on different operating systems and architectures
  • does not work with the mei-customizations.sch rules at the moment (at least here) due to a path issue in the line: <sch:let name="mei.source.path" value=" resolve-uri('../mei-source.xml')"/> (but I propose to fix this in a separate step).

musicEnfanthen avatar Feb 04 '24 20:02 musicEnfanthen

So the logger issue is solved (by adding the SLF4J jar to mei.classpath), but the validation stylesheet still cannot be generated locally on Windows.

In the Docker environment (Ubuntu), everything runs fine:

validate-source:
[schematron] Generating validation stylesheet for Schematron '/opt/docker-mei/music-encoding/source/validation/mei-source.sch'
[schematron] Validating '/opt/docker-mei/music-encoding/build/mei-source_canonicalized_v5.1-dev.xml'
[schematron] The file '/opt/docker-mei/music-encoding/build/mei-source_canonicalized_v5.1-dev.xml' is valid

Locally on Windows:

validate-source:
[schematron] Generating validation stylesheet for Schematron 'D:\Repositories\MusicEncoding\fork_music-encoding\source\validation\mei-source.sch'

BUILD FAILED
D:\Repositories\MusicEncoding\fork_music-encoding\build.xml:267: Unable to compile validation stylesheet
        at name.dmaus.schxslt.ant.Task.execute(Task.java:96)
        at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:299)
        at name.dmaus.schxslt.Schematron.<init>(Schematron.java:59)
        at name.dmaus.schxslt.Schematron.<init>(Schematron.java:55)
        at name.dmaus.schxslt.ant.Task.execute(Task.java:94)
        ... 15 more
Caused by: javax.xml.transform.TransformerConfigurationException: jar:file:/D:/Repositories/MusicEncoding/fork_music-encoding/lib/schematron/ant-schxslt-1.9.5.jar!/xslt/2.0/include.xsl: line 40: Erforderliches Attribut "test" fehlt.
        at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTemplates(Unknown Source)
        at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTransformer(Unknown Source)
        at name.dmaus.schxslt.Compiler.createPipeline(Compiler.java:125)
        at name.dmaus.schxslt.Compiler.compile(Compiler.java:83)
        ... 20 more

Seems to be triggered by these lines in SchXslt: https://github.com/schxslt/schxslt/blob/dc2a30bc8b166fee16571f85804907afad239ba1/ant/src/main/java/name/dmaus/schxslt/ant/Task.java#L92-L96

Maybe @dmj has an idea what's happening?

musicEnfanthen avatar Feb 10 '24 18:02 musicEnfanthen

This does not look right:

Caused by: javax.xml.transform.TransformerConfigurationException: jar:file:/D:/Repositories/MusicEncoding/fork_music-encoding/lib/schematron/ant-schxslt-1.9.5.jar!/xslt/2.0/include.xsl: line 40: Erforderliches Attribut "test" fehlt.
        at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTemplates(Unknown Source)
        at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTransformer(Unknown Source)
        at name.dmaus.schxslt.Compiler.createPipeline(Compiler.java:125)
        at name.dmaus.schxslt.Compiler.compile(Compiler.java:83)
        ... 20 more

The German message means: Required attribute "test" is missing. The build uses the XSLT 2.0 stylesheets but with an XSLT 1.0 processor.

You may need to set the property javax.xml.transform.TransformerFactory to net.sf.saxon.TransformerFactoryImpl.
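
To see which processor JAXP actually resolves, a small probe can help. This is only a sketch and not part of the build: the class name FactoryProbe is hypothetical, and the second lookup only resolves to Saxon if a Saxon JAR is on the classpath (otherwise the probe reports the lookup as unresolved).

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Hypothetical helper, for illustration only.
class FactoryProbe {
    static final String FACTORY_PROPERTY = "javax.xml.transform.TransformerFactory";
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    /** Returns the name of the factory class JAXP resolves with the current settings. */
    static String resolvedFactory() {
        try {
            return TransformerFactory.newInstance().getClass().getName();
        } catch (TransformerFactoryConfigurationError e) {
            // Thrown when the configured factory class cannot be loaded.
            return "unresolved: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Without any configuration, the JDK's built-in XSLT 1.0 processor is used.
        System.out.println("before: " + resolvedFactory());

        // Point JAXP at Saxon (assumes Saxon is on the classpath).
        System.setProperty(FACTORY_PROPERTY, SAXON_FACTORY);
        System.out.println("after:  " + resolvedFactory());
    }
}
```

The same property can also be passed to the JVM on startup as -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl; for Ant, the ANT_OPTS environment variable is the usual place for such an option.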

dmj avatar Feb 10 '24 18:02 dmj

Ah, I see, it uses Java's built-in Xalan factory instead of Saxon, right? Thanks for the pointer, @dmj.

It is still odd, since Saxon is on the classpath and is used in every other transformation scenario in the Ant build script. But there we seem to trigger it directly with a call to:

<java classname="${saxon.transform.class}" classpathref="mei.classpath" ...

In the taskdef for the schematron task (see here), we also reference the mei.classpath, but not the saxon transform class directly, since it needs to point to the schxslt class:

<taskdef name="schematron" classname="name.dmaus.schxslt.ant.Task" classpathref="mei.classpath"/>

From what I read here:

  • https://www.saxonica.com/html/documentation10/using-xsl/embedding/jaxp-transformation.html and here:
  • https://stackoverflow.com/a/11379291,

the best way to set the correct transformer factory would be to do it directly in the Java code. Am I understanding that correctly?
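
For illustration, choosing the factory directly in Java could look roughly like the following. This is a minimal sketch, not SchXslt's actual code: the class ExplicitFactory and the method require are hypothetical names, and the main method assumes a Saxon JAR is on the classpath. It relies on the standard JAXP overload TransformerFactory.newInstance(String, ClassLoader), which bypasses the property and service lookup.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Hypothetical helper, for illustration only.
class ExplicitFactory {
    /**
     * Loads exactly the named factory through the given class loader and
     * fails loudly instead of silently falling back to the JDK default.
     */
    static TransformerFactory require(String className, ClassLoader loader) {
        try {
            return TransformerFactory.newInstance(className, loader);
        } catch (TransformerFactoryConfigurationError e) {
            throw new IllegalStateException(
                "Cannot load " + className + "; is its JAR on the classpath?", e);
        }
    }

    public static void main(String[] args) {
        // Assumes Saxon is available to the loader that loaded this class.
        TransformerFactory factory = require(
            "net.sf.saxon.TransformerFactoryImpl",
            ExplicitFactory.class.getClassLoader());
        System.out.println(factory.getClass().getName());
    }
}
```

Failing with a clear message seems preferable here, because a silent fallback to the built-in processor leads to the confusing compile error shown above.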

musicEnfanthen avatar Feb 12 '24 10:02 musicEnfanthen

This looks like a classpath-related error. If I put the Saxon JAR on the classpath and run the validate-source target, everything seems okay:

export CLASSPATH=lib/saxon/saxon-he-11.5.jar
ant validate-source

Buildfile: /tmp/music-encoding/build.xml

init:
     [echo] xerces available: true
     [echo] saxon available: true
     [echo] schematron available: true
     [echo] prince available: false
     [echo] initialized

init-mei-classpath:
     [echo] mei.classpath set

validate-source:
     [echo] lib/schematron/ant-schxslt-1.9.5.jar
[schematron] Generating validation stylesheet for Schematron '/tmp/music-encoding/source/validation/mei-source.sch'
[schematron] Validating '/tmp/music-encoding/build/mei-source_canonicalized_v5.1-dev.xml'
[schematron] The file '/tmp/music-encoding/build/mei-source_canonicalized_v5.1-dev.xml' is valid

Thus, I think we should look for a way to set the classpath.

dmj avatar Feb 12 '24 20:02 dmj

With the TEI stylesheets, we had the same problem. We then added classpathrefs to several tasks and can now pass a classpath when calling an external task, cf. https://github.com/TEIC/Stylesheets/issues/544

Maybe this helps.

bwbohl avatar Feb 13 '24 07:02 bwbohl

@dmj Yes, you're right. When I run export CLASSPATH=lib/saxon/saxon-he-11.5.jar manually before the validation step, it works fine.

@bwbohl Any idea why it is not working locally without explicitly adding Saxon to the classpath? Saxon is already on the mei.classpath, which is also referenced in the task definition:

<taskdef name="schematron" classname="name.dmaus.schxslt.ant.Task" classpathref="mei.classpath"/>
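
One possible explanation, stated here as an assumption rather than a confirmed diagnosis: classpathref on a taskdef determines the loader for the task class itself, while a plain TransformerFactory.newInstance() searches through the thread context class loader, which need not include mei.classpath. A common workaround in such cases is to swap the context class loader temporarily. The sketch below shows the pattern; ContextLoaderScope and withContextLoader are hypothetical names, not part of Ant or SchXslt.

```java
import java.util.function.Supplier;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper, for illustration only.
class ContextLoaderScope {
    /**
     * Runs the action with the given loader installed as the thread context
     * class loader and restores the previous loader afterwards.
     */
    static <T> T withContextLoader(ClassLoader loader, Supplier<T> action) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            // Always restore, even if the action throws.
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // In a task, this would be the loader that can see the Saxon JAR.
        ClassLoader taskLoader = ContextLoaderScope.class.getClassLoader();
        String name = withContextLoader(taskLoader,
            () -> TransformerFactory.newInstance().getClass().getName());
        System.out.println(name);
    }
}
```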

musicEnfanthen avatar Feb 15 '24 11:02 musicEnfanthen

I think this bug report gives an impression of the problem…

https://bz.apache.org/bugzilla/show_bug.cgi?id=6606

bwbohl avatar Jun 24 '24 23:06 bwbohl

Similar issue here: https://github.com/phax/ph-schematron/issues/78

bwbohl avatar Jun 25 '24 06:06 bwbohl

Thank you, @bwbohl, for investigating and for your PR with an alternative approach in musicEnfanthen/music-encoding#4. I just tested it locally and it works fine. It would have detected #1480.

musicEnfanthen avatar Jun 26 '24 11:06 musicEnfanthen

However, when validating from within the Docker image, I get the following:

validate-source:
[schematron] Successfully parsed Schematron file '/opt/docker-mei/music-encoding/source/validation/mei-source.sch'
[schematron] Validating XML file '/opt/docker-mei/music-encoding/build/mei-source_canonicalized_v5.1-dev.xml' against Schematron rules from 'mei-source.sch' expecting success
[schematron] JAXP: using thread context class loader (java.net.URLClassLoader@7f31245a) for search
[schematron] JAXP: Looking up system property 'javax.xml.validation.SchemaFactory:http://www.w3.org/2001/XMLSchema'
[schematron] JAXP: The property is undefined.
[schematron] JAXP: found null in $java.home/conf/jaxp.properties

BUILD FAILED
/opt/docker-mei/music-encoding/build.xml:247: The following error occurred while executing this line:
/opt/docker-mei/music-encoding/build.xml:252: javax.xml.validation.SchemaFactoryConfigurationError: Provider for class javax.xml.validation.SchemaFactory cannot be created

musicEnfanthen avatar Jun 26 '24 11:06 musicEnfanthen

Maybe connected to this: https://bugs.openjdk.org/browse/JDK-8303531 ?

musicEnfanthen avatar Jun 26 '24 11:06 musicEnfanthen

OK, I see. When adding the schematron jar to the docker-mei build, everything runs fine with Docker, too. 🎉

musicEnfanthen avatar Jun 26 '24 14:06 musicEnfanthen

Strange side note: Adding the schematron jar file only to the lib folder (by invoking schematron-download) from within Docker does not give the same result.

musicEnfanthen avatar Jun 26 '24 14:06 musicEnfanthen

Also: When I delete my local lib folder with all the jars, the Docker image does not recognize any Java class (Saxon, etc.).

@bwbohl Can you double-check?

musicEnfanthen avatar Jun 26 '24 14:06 musicEnfanthen

Did it work out?

bwbohl avatar Jun 26 '24 19:06 bwbohl

Yes, it did 🎉 (cf. https://github.com/music-encoding/music-encoding/pull/1428#discussion_r1655279419)

musicEnfanthen avatar Jun 27 '24 08:06 musicEnfanthen

Great, so we can move forward to implementing #790.

bwbohl avatar Jun 27 '24 08:06 bwbohl

@musicEnfanthen Conflicts have to be resolved.

rettinghaus avatar Jul 01 '24 12:07 rettinghaus

@rettinghaus Thanks, yes, will do after #1460 is merged, since it changes the same file and will introduce more merge conflicts for this PR.

musicEnfanthen avatar Jul 01 '24 12:07 musicEnfanthen

@bwbohl @rettinghaus

I resolved the merge conflicts and updated the target dependencies, i.e., validation (including canonicalization) now runs by default before every build target, but only once per ant call (previously, validation was repeated on every build step).

Can you check again, especially whether the replacement of the double dependency in the dist target works for you locally (also with ant reset beforehand)? It works fine here without calling init-mei-classpath directly.

musicEnfanthen avatar Jul 26 '24 17:07 musicEnfanthen