
QName validation from 2.1.1 fails for namespaced attributes

Open wjcarpenter opened this issue 4 years ago • 3 comments

The QName validation added for issue #48 seems to have introduced a regression when an attribute has a namespace prefix. Parsing this XML fails:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <sites xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <site>
            <id>1</id>
            <name>Default</name>
            <url_namespace></url_namespace>
            <user_quota xsi:nil="true"/>
            <content_admin_mode>2</content_admin_mode>
            <storage_quota xsi:nil="true"/>
            <sheet_image_enabled>true</sheet_image_enabled>
            <extract_encryption_mode>disabled</extract_encryption_mode>
            <materialized_views_mode>enable_selective</materialized_views_mode>
            <use_default_time_zone>true</use_default_time_zone>
        </site>
        <site>
            <id>4</id>
            <name>testsite_4432</name>
            <url_namespace>testsite_4432_url</url_namespace>
            <user_quota xsi:nil="true"/>
            <content_admin_mode>2</content_admin_mode>
            <storage_quota xsi:nil="true"/>
            <sheet_image_enabled>true</sheet_image_enabled>
            <extract_encryption_mode>disabled</extract_encryption_mode>
            <materialized_views_mode>enable_selective</materialized_views_mode>
            <use_default_time_zone>true</use_default_time_zone>
        </site>
    </sites>

with this exception:

Caused by: java.lang.IllegalArgumentException: Illegal character in local name: 'xsi:nil'.
        at org.dom4j.QName.validateNCName(QName.java:346)
        at org.dom4j.QName.<init>(QName.java:153)
        at org.dom4j.tree.QNameCache.createQName(QNameCache.java:245)
        at org.dom4j.tree.QNameCache.get(QNameCache.java:115)
        at org.dom4j.DocumentFactory.createQName(DocumentFactory.java:191)
        at org.dom4j.tree.NamespaceStack.createQName(NamespaceStack.java:392)
        at org.dom4j.tree.NamespaceStack.pushQName(NamespaceStack.java:374)
        at org.dom4j.tree.NamespaceStack.getAttributeQName(NamespaceStack.java:257)
        at org.dom4j.tree.AbstractElement.setAttributes(AbstractElement.java:454)
        at org.dom4j.io.SAXContentHandler.addAttributes(SAXContentHandler.java:899)
        at org.dom4j.io.SAXContentHandler.startElement(SAXContentHandler.java:241)
        at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:510)
        at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:183)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1377)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2710)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:534)
        at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
        at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
        at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
        at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1216)
        at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
        at org.dom4j.io.SAXReader.read(SAXReader.java:494)
        ... 17 more

The failure did not occur with 2.1.0.

(sorry for the multiple edits ... for some reason I am unable to get version numbers correct on the first couple of tries :-) )

wjcarpenter, May 29 '20 02:05

I don't think it matters, but we set these feature options:

saxParserFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
saxParserFactory.setFeature("http://xml.org/sax/features/external-general-entities", false);
saxParserFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);

wjcarpenter, May 29 '20 02:05

I believe we have fixed this by setting one additional parser option: saxParserFactory.setNamespaceAware(true);
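
To illustrate why that option matters, here is a minimal sketch using only the JDK's JAXP/SAX classes (no dom4j). The class name NamespaceAwareDemo and the method describeNilAttribute are made up for this example, and the XML is a trimmed version of the document above. It reports what the parser hands to a content handler for the xsi:nil attribute in each mode. With namespace awareness off, the parser does not split the prefix from the name, so the only usable name is the raw "xsi:nil". That is presumably the string that ends up being validated as a local name in the stack trace above, though this sketch does not exercise dom4j itself.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of dom4j or of the reporter's code.
public class NamespaceAwareDemo {

    // Trimmed version of the XML from the report.
    static final String XML =
        "<sites xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
        + "<site><user_quota xsi:nil=\"true\"/></site>"
        + "</sites>";

    // Returns {namespace URI, local name, qualified name} as reported by SAX
    // for the single attribute on <user_quota>.
    static String[] describeNilAttribute(boolean namespaceAware) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        SAXParser parser = factory.newSAXParser();

        final String[] result = new String[3];
        parser.parse(new InputSource(new StringReader(XML)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                boolean isUserQuota =
                    "user_quota".equals(qName) || "user_quota".equals(localName);
                if (isUserQuota && atts.getLength() > 0 && result[2] == null) {
                    result[0] = atts.getURI(0);
                    result[1] = atts.getLocalName(0);
                    result[2] = atts.getQName(0);
                }
            }
        });
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (boolean aware : new boolean[] {false, true}) {
            String[] a = describeNilAttribute(aware);
            System.out.println("namespaceAware=" + aware
                + " uri='" + a[0] + "' localName='" + a[1] + "' qName='" + a[2] + "'");
        }
    }
}
```

In namespace-aware mode the attribute arrives with local name "nil" and the XMLSchema-instance URI, so a QName can be built without a colon in the local part.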

jacalata, Jul 28 '20 20:07

The question is whether dom4j should change nothing, produce a more meaningful error, or implement a feature that allows building DOMs without proper namespace information when the underlying reader is not namespace-aware.

ecki, Nov 13 '20 19:11