XML Schema validation with lxml
We are trying to set up an XML file validation process using the official MTConnect schema. For this we use the Python library lxml, like so:
from lxml import etree
# Load xml file to validate
tree = etree.parse('mtc_file.xml')
# Load the MTConnect XML schema
with open('MTConnectDevices_2.3_1.0.xsd') as f:
    xmlschema_doc = etree.parse(f)
xmlschema = etree.XMLSchema(xmlschema_doc)
# Validate the MTConnect XML device model file
if not xmlschema.validate(tree):
    print('The XML device model file does not follow the MTConnect standard.')
We tried the MTConnect schemas MTConnectDevices_2.2.xsd, MTConnectDevices_2.3.xsd and MTConnectDevices_2.3_1.0.xsd, and each of them failed with an error when we attempted to load it with xmlschema = etree.XMLSchema(xmlschema_doc).
For example, for MTConnectDevices_2.2.xsd, lxml reports this error in the schema:
lxml.etree.XMLSchemaParseError: Element '{http://www.w3.org/2001/XMLSchema}any': The attribute 'notNamespace' is not allowed., line 7581
I was wondering whether this is an issue with lxml or whether something is indeed not 100% correct in MTConnectDevices_2.2.xsd.
Does anyone have experience with XML validation using tools other than lxml?
The xsd files without the 1.0 suffix use XML Schema 1.1 and enable features that are new in 1.1. By new I mean 12 years old. Most XSD validators don't support 1.1 yet, which is why we also generate the 1.0 schemas.
W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures (w3.org)
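To check which variant a given xsd file is before handing it to lxml (which relies on libxml2, an XSD 1.0 implementation), something like the following rough sketch might help. It only uses the standard library and looks for a few constructs that were introduced in XSD 1.1, such as the notNamespace attribute from the error above. The function name find_xsd11_constructs and the list of constructs are my own assumptions for illustration; the list is not exhaustive, so an empty result does not prove a schema is valid XSD 1.0.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# A few elements and wildcard attributes introduced by XSD 1.1.
# Assumption: this list is illustrative, not a complete inventory.
XSD11_ELEMENTS = {"assert", "assertion", "alternative",
                  "openContent", "defaultOpenContent", "override"}
XSD11_WILDCARD_ATTRS = {"notNamespace", "notQName"}


def find_xsd11_constructs(xsd_source):
    """Return a sorted list of XSD 1.1-only constructs found in a schema.

    xsd_source is a file path or a binary file-like object.
    """
    found = set()
    prefix = "{" + XS + "}"
    for elem in ET.parse(xsd_source).iter():
        # Only look at elements in the XML Schema namespace.
        if not elem.tag.startswith(prefix):
            continue
        local = elem.tag[len(prefix):]
        if local in XSD11_ELEMENTS:
            found.add("xs:" + local)
        # notNamespace / notQName only matter on wildcards.
        if local in ("any", "anyAttribute"):
            for attr in XSD11_WILDCARD_ATTRS & set(elem.attrib):
                found.add("@" + attr)
    return sorted(found)
```

Calling find_xsd11_constructs('MTConnectDevices_2.2.xsd') and getting a non-empty list would suggest picking the matching _1.0 file for etree.XMLSchema instead.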