
XML validation ignores schema files configured in jhove.conf

Open marhop opened this issue 6 years ago • 3 comments

Dev Effort

2D

Description

According to the documentation of the XML module, JHOVE can be configured to use local XML schema files when validating elements from a given XML namespace. This is done with a config file entry like the following (example taken from the default config file):

<module>
    <class>edu.harvard.hul.ois.jhove.module.XmlModule</class>
    <param>schema=http://www.example.com/schema;/home/schemas/exampleschema.xsd</param>
</module>
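To make the format of that `<param>` value concrete, here is a minimal sketch of how such an entry could be split into a namespace and a local schema path. It is not JHOVE's actual parsing code: the class name `SchemaParam` and the method `parse` are hypothetical, and splitting at the first semicolon is an assumption that only holds if the namespace URI itself contains no semicolon.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of JHOVE: splits a "schema=<namespace>;<path>" parameter.
final class SchemaParam {
    private static final String PREFIX = "schema=";

    /**
     * Returns (namespace, local schema path) for a well-formed parameter,
     * or an empty Optional if the prefix, the separator, or either part is missing.
     */
    static Optional<Map.Entry<String, String>> parse(String param) {
        if (param == null || !param.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String value = param.substring(PREFIX.length());
        // Assumption: the first ';' separates namespace from path.
        int sep = value.indexOf(';');
        if (sep <= 0 || sep == value.length() - 1) {
            return Optional.empty();
        }
        String namespace = value.substring(0, sep).trim();
        String path = value.substring(sep + 1).trim();
        return Optional.of(new SimpleEntry<>(namespace, path));
    }
}
```

Called with the example value above, `SchemaParam.parse(...)` would yield the namespace `http://www.example.com/schema` as the key and `/home/schemas/exampleschema.xsd` as the value.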

However, JHOVE 1.18 seems to ignore these entries. Consider the following scenarios (assume the XML file is indeed valid):

  1. XML file references a (local) schema file in a schemaLocation attribute; no config file entry → JHOVE uses the referenced file from the schemaLocation attribute, "well-formed and valid". This is correct.

  2. XML file references a schema file in a schemaLocation attribute; config file contains entry that maps another local schema file to the same namespace → JHOVE uses the referenced file from the schemaLocation attribute, "well-formed and valid". This is arguable: I would prefer JHOVE to interpret the config file entry as overriding the schemaLocation attribute.

  3. XML file references a nonexistent schema file in a schemaLocation attribute; config file contains entry that maps another, existing local schema file to the same namespace → JHOVE tries to use the referenced file from the schemaLocation attribute, "well-formed, but not valid" because the schema file is not found. This is arguable: I would prefer JHOVE to interpret the config file entry as overriding the schemaLocation attribute.

  4. XML file does not reference a schema file; config file contains entry that maps an existing local schema file to the namespace → JHOVE does not use any schema file at all, "well-formed, but not valid". This is unnecessary: there is no conflict between two schema files (as in scenarios 2 and 3), so why does JHOVE not use the configured schema file?
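The behaviour preferred in scenarios 2 to 4 can be summarised as a small precedence rule: a configured schema for the namespace wins, and the schemaLocation hint is used only as a fallback. The sketch below illustrates that rule in isolation. It is not taken from JHOVE; the class `SchemaChooser`, its method `choose`, and the use of a plain `Map` for the configured entries are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of the proposed precedence, not JHOVE code.
final class SchemaChooser {
    /**
     * Picks the schema to validate against for one namespace.
     *
     * @param namespace          namespace URI of the elements being validated
     * @param schemaLocationHint schema referenced in the document, or null if none
     * @param configured         namespace -> local schema path from the config file
     * @return the configured schema if present, else the document's hint, else empty
     */
    static Optional<String> choose(String namespace,
                                   String schemaLocationHint,
                                   Map<String, String> configured) {
        String local = configured.get(namespace);
        if (local != null) {
            // Scenarios 2, 3 and 4: the config entry overrides or supplies the schema.
            return Optional.of(local);
        }
        // Scenario 1: no config entry, so fall back to the schemaLocation hint (may be absent).
        return Optional.ofNullable(schemaLocationHint);
    }
}
```

Under this rule, scenario 4 would resolve to the configured file instead of ending up with no schema at all, and scenario 3 would never attempt to load the nonexistent referenced file.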

Thanks for any clarification, Martin

marhop avatar Mar 12 '18 14:03 marhop

XML validation ignores schema files configured in jhove.conf #314 - Assigned to TBA

Nothing to see there, but this issue is interesting for me as well.

DinoAGW avatar Jul 08 '21 15:07 DinoAGW

I implemented this here: https://github.com/UW-Madison-Library/jhove/compare/435f23381ff78357de3fbbc424952e1b7a3c31af..48d083b2394fc8b9426d9567744eff519284b830

pwinckles avatar May 02 '22 15:05 pwinckles