active-directory-b2c-custom-policy-starterpack
Fixed "regexp error" when using libxml2 to load the xsd file
I'm currently working on a template system that generates XML files, and I wanted to validate them. I indirectly use libxml2 from Python via lxml to validate the generated XML files against the TrustFrameworkPolicy_0.3.0.0.xsd schema file, but I get errors saying that line 3689 of the XSD file contains an invalid regular expression pattern.
From xmllint:
regexp error : failed to compile: Wrong escape sequence, misuse of character '\'
regexp error : failed to compile: xmlFAParseCharClass: ']' expected
regexp error : failed to compile: xmlFAParseRegExp: extra characters
../policies/TrustFrameworkPolicy_0.3.0.0.xsd:3689: element pattern: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}pattern': The value '^urn:[a-z0-9][a-z0-9-]{0,31}:[a-z0-9()+,\/\-.:=@;$_!*'%\/?#]+$' of the facet 'pattern' is not a valid regular expression.
WXS schema ../policies/TrustFrameworkPolicy_0.3.0.0.xsd failed to compile
From Python (in WSL/Ubuntu):
Validating files...
Traceback (most recent call last):
  File "/mnt/c/Users/pushrbx/PycharmProjects/aad-b2c-extensions/pman.py", line 169, in <module>
    main()
  File "/mnt/c/Users/pushrbx/PycharmProjects/aad-b2c-extensions/pman.py", line 161, in main
    build(config)
  File "/mnt/c/Users/pushrbx/PycharmProjects/aad-b2c-extensions/pman.py", line 99, in build
    validate_built_xml_files()
  File "/mnt/c/Users/pushrbx/PycharmProjects/aad-b2c-extensions/pman.py", line 45, in validate_built_xml_files
    xmlschema = etree.XMLSchema(xmlschema_doc)
  File "src/lxml/xmlschema.pxi", line 89, in lxml.etree.XMLSchema.__init__
lxml.etree.XMLSchemaParseError: Element '{http://www.w3.org/2001/XMLSchema}pattern': The value '^urn:[a-z0-9][a-z0-9-]{0,31}:[a-z0-9()+,\/\-.:=@;$_!*'%\/?#]+$' of the facet 'pattern' is not a valid regular expression., line 3689
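For context, my understanding is that XML Schema regular expressions only allow a fixed set of characters after a backslash, and / is not one of them, which would explain the "Wrong escape sequence" message for \/. The sketch below is a rough, standard-library-only check for such escapes in pattern facets. It is not how libxml2 parses patterns, and the helper names (find_invalid_escapes, scan_schema) and the UrnType type name are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

XS_PATTERN = "{http://www.w3.org/2001/XMLSchema}pattern"

# Characters that may follow a backslash in an XSD 1.0 regular expression:
# single-character escapes, multi-character escapes, and \p / \P categories.
ALLOWED_AFTER_BACKSLASH = set("nrt\\|.?*+(){}-[]^" + "sSiIcCdDwW" + "pP")


def find_invalid_escapes(pattern):
    """Return the escape sequences in `pattern` that XSD regex syntax rejects."""
    return [
        m.group(0)
        for m in re.finditer(r"\\(.)", pattern, re.DOTALL)
        if m.group(1) not in ALLOWED_AFTER_BACKSLASH
    ]


def scan_schema(xsd_text):
    """Map each xs:pattern value in the schema text to its invalid escapes."""
    report = {}
    for facet in ET.fromstring(xsd_text).iter(XS_PATTERN):
        value = facet.get("value", "")
        bad = find_invalid_escapes(value)
        if bad:
            report[value] = bad
    return report


# Minimal stand-in schema; only the pattern value is taken from the error above.
SAMPLE_XSD = r"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="UrnType">
    <xs:restriction base="xs:string">
      <xs:pattern value="^urn:[a-z0-9][a-z0-9-]{0,31}:[a-z0-9()+,\/\-.:=@;$_!*'%\/?#]+$"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

if __name__ == "__main__":
    for value, bad in scan_schema(SAMPLE_XSD).items():
        print(f"{value}: invalid escapes {bad}")
```

On the pattern from line 3689 this flags the two \/ sequences and leaves \- alone, which matches the first libxml2 message.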
You can also reproduce the issue with the libxml2 command-line tools:
- On Ubuntu: sudo apt install libxml2-utils
- xmllint --schema TrustFrameworkPolicy_0.3.0.0.xsd TrustFrameworkBase.xml --noout
With Python you can reproduce it the following way:
- Python 3.8+ is required.
- pip install lxml==4.8.0 cython==0.29.28
- Create a Python file named repro.py
- Write the following in the repro.py file:
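from lxml import etree

# Load and compile the schema; the error is raised by etree.XMLSchema
with open("TrustFrameworkPolicy_0.3.0.0.xsd") as f:
    xmlschema_doc = etree.parse(f)
xmlschema = etree.XMLSchema(xmlschema_doc)

# Validate a policy file against the compiled schema
with open("TrustFrameworkBase.xml") as xml_file:
    doc = etree.parse(xml_file)
xmlschema.assertValid(doc)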
This PR addresses the issue. I still need to test this with VSCode, but I don't use it on a day-to-day basis, so it would be great if somebody could test it or point me in the right direction so I can set it up myself.
P.S.: Sorry about the whitespace changes.