jhove
XML validation ignores schema files configured in jhove.conf
Dev Effort
2D
Description
According to the documentation of the XML module, JHOVE can be configured to use local XML schema files when validating elements from a given XML namespace, using a config file entry like this (example taken from the default config file):
<module>
<class>edu.harvard.hul.ois.jhove.module.XmlModule</class>
<param>schema=http://www.example.com/schema;/home/schemas/exampleschema.xsd</param>
</module>
However, JHOVE 1.18 seems to ignore these entries. Consider the following scenarios (assume the XML file is indeed valid):
1. XML file references a (local) schema file in a schemaLocation attribute; no config file entry → JHOVE uses the referenced file from the schemaLocation attribute, "well-formed and valid". This is correct.
2. XML file references a schema file in a schemaLocation attribute; config file contains an entry that maps another local schema file to the same namespace → JHOVE uses the referenced file from the schemaLocation attribute, "well-formed and valid". This is arguable: I would prefer JHOVE to interpret the config file entry as overriding the schemaLocation attribute.
3. XML file references a nonexistent schema file in a schemaLocation attribute; config file contains an entry that maps another, existing local schema file to the same namespace → JHOVE tries to use the referenced file from the schemaLocation attribute, "well-formed, but not valid" because the schema file is not found. This is arguable: I would prefer JHOVE to interpret the config file entry as overriding the schemaLocation attribute.
4. XML file does not reference a schema file; config file contains an entry that maps an existing local schema file to the namespace → JHOVE does not use any schema file at all, "well-formed, but not valid". This is unnecessary: there is no conflict between two schema files (as in scenarios 2 and 3), so why does JHOVE not use the configured schema file?
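To make the expected precedence concrete, here is a minimal Java sketch of the behaviour I would prefer in the four scenarios above. It is not JHOVE's actual code: the class SchemaPrecedenceSketch and the methods addSchemaParam and chooseSchema are hypothetical names invented for this illustration. The only thing taken from JHOVE is the format of the param value (schema=namespace;path) shown in the config example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; none of these names exist in JHOVE.
class SchemaPrecedenceSketch {

    // Parses a param value of the form "schema=<namespace>;<local path>"
    // and stores the namespace-to-path mapping in localSchemas.
    public static void addSchemaParam(Map<String, String> localSchemas, String param) {
        String prefix = "schema=";
        if (!param.startsWith(prefix)) {
            throw new IllegalArgumentException("Not a schema param: " + param);
        }
        String value = param.substring(prefix.length());
        int sep = value.indexOf(';');
        if (sep <= 0 || sep == value.length() - 1) {
            throw new IllegalArgumentException("Expected <namespace>;<path>: " + param);
        }
        localSchemas.put(value.substring(0, sep), value.substring(sep + 1));
    }

    // Preferred precedence: a configured local schema wins over the
    // schemaLocation hint from the document; the hint is only a fallback.
    // schemaLocationHint may be null if the document has no such attribute.
    public static Optional<String> chooseSchema(String namespace,
                                                String schemaLocationHint,
                                                Map<String, String> localSchemas) {
        String configured = localSchemas.get(namespace);
        if (configured != null) {
            return Optional.of(configured);
        }
        return Optional.ofNullable(schemaLocationHint);
    }

    public static void main(String[] args) {
        String ns = "http://www.example.com/schema";
        Map<String, String> noConfig = new HashMap<>();
        Map<String, String> withConfig = new HashMap<>();
        addSchemaParam(withConfig,
                "schema=http://www.example.com/schema;/home/schemas/exampleschema.xsd");

        // Scenario 1: hint only, no config entry.
        System.out.println(chooseSchema(ns, "local.xsd", noConfig).orElse("(none)"));
        // Scenarios 2 and 3: hint present, but the config entry overrides it.
        System.out.println(chooseSchema(ns, "missing.xsd", withConfig).orElse("(none)"));
        // Scenario 4: no hint, config entry is used.
        System.out.println(chooseSchema(ns, null, withConfig).orElse("(none)"));
    }
}
```

With this ordering, scenario 1 keeps its current behaviour, while scenarios 2 to 4 would all resolve to the configured local file.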
Thanks for any clarification, Martin
XML validation ignores schema files configured in jhove.conf #314 - Assigned to TBA
Nothing to see there, but this issue is interesting for me as well.
I implemented this here: https://github.com/UW-Madison-Library/jhove/compare/435f23381ff78357de3fbbc424952e1b7a3c31af..48d083b2394fc8b9426d9567744eff519284b830