woodstox
GenericMsvValidator.getAttributeType(int) always returns null, causing a NullPointerException in com.sun.org.apache.xalan.internal.xsltc.trax.SAX2DOM
When reading a DOM document from a StAXSource backed by a validating XMLStreamReader2, com.sun.org.apache.xalan.internal.xsltc.trax.SAX2DOM throws a NullPointerException when it tries to process attributes. This seems to be caused by GenericMsvValidator.getAttributeType(int) always returning a null reference for the attribute type, which SAX2DOM is unprepared to handle.
The exception stack trace:
java.lang.NullPointerException
at com.sun.org.apache.xalan.internal.xsltc.trax.SAX2DOM.startElement(SAX2DOM.java:204)
at com.sun.org.apache.xml.internal.serializer.ToXMLSAXHandler.closeStartTag(ToXMLSAXHandler.java:208)
at com.sun.org.apache.xml.internal.serializer.ToSAXHandler.flushPending(ToSAXHandler.java:281)
at com.sun.org.apache.xml.internal.serializer.ToXMLSAXHandler.startElement(ToXMLSAXHandler.java:650)
at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.handleStartElement(StAXStream2SAX.java:319)
at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.bridge(StAXStream2SAX.java:145)
at com.sun.org.apache.xalan.internal.xsltc.trax.StAXStream2SAX.parse(StAXStream2SAX.java:101)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:688)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:737)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:351)
Tested with:
JRE 1.8.0_141-b15 (x64)
com.fasterxml.woodstox:woodstox-core:5.0.3
net.java.dev.msv:msv-core:2013.6.1
Code to reproduce the error:
File xmlFile = new File("Test.xml");
File schemaFile = new File("Test.xsd");
validatAgainst(new File(xmlFile.toURI()), new File(schemaFile.toURI()));
XMLInputFactory2 xmlInputFactory = (XMLInputFactory2) XMLInputFactory2.newFactory();
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true);
xmlInputFactory.setProperty(XMLInputFactory.IS_VALIDATING, true);
XMLValidationSchema xmlValidationSchema = XMLValidationSchemaFactory
        .newInstance(XMLValidationSchema.SCHEMA_ID_W3C_SCHEMA).createSchema(schemaFile);
XMLStreamReader2 xmlStreamReader = (XMLStreamReader2) xmlInputFactory.createXMLStreamReader(xmlFile);
xmlStreamReader.validateAgainst(xmlValidationSchema);
Transformer transformer = TransformerFactory.newInstance().newTransformer();
while (xmlStreamReader.hasNext()) {
    xmlStreamReader.next();
    if (xmlStreamReader.getEventType() == XMLStreamConstants.START_ELEMENT) {
        transformer.reset();
        DOMResult result = new DOMResult();
        transformer.transform(new StAXSource(xmlStreamReader), result);
    }
}
Test.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.org/Test"
        xmlns:tns="http://www.example.org/Test" elementFormDefault="qualified">
    <element name="test">
        <complexType>
            <attribute name="attr" type="string" />
        </complexType>
    </element>
</schema>
Test.xml:
<?xml version="1.0" encoding="UTF-8"?>
<t:test xmlns:t="http://www.example.org/Test" attr="value" />
edit: Fixed broken link.
Ok, that is possible, but as per the XMLValidator interface's Javadoc:
/**
* Method for getting schema-specified type of an attribute, if
* information is available. If not, validators can return
* null to explicitly indicate no information was available.
*/
public abstract String getAttributeType(int index);
so null is a valid value to return, and it would seem that the caller needs to handle it properly.
So I am not sure what Woodstox could do here.
SAX seems to be operating under the assumption that "CDATA" will be reported as type for attributes where the parser provides none.
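To make the two positions concrete, here is a minimal sketch of what handling a null type on the caller's side could look like. It uses only the standard org.xml.sax classes; AttributeTypes, orCdata and isIdAttribute are hypothetical names made up for this example and are not the actual SAX2DOM code, which I have not inspected beyond the stack trace.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper for illustration; not part of the JDK or Woodstox.
final class AttributeTypes {
    private AttributeTypes() {}

    // Map a missing (null) type to "CDATA", the value SAX expects
    // when no declared type is known for an attribute.
    static String orCdata(String type) {
        return type != null ? type : "CDATA";
    }

    // Constant-first equals() does not throw when getType() returns null.
    static boolean isIdAttribute(Attributes attrs, int index) {
        return "ID".equals(attrs.getType(index));
    }

    public static void main(String[] args) {
        AttributesImpl attrs = new AttributesImpl();
        // Simulate a bridge that passed a null type through from the validator.
        attrs.addAttribute("", "attr", "attr", null, "value");
        System.out.println(isIdAttribute(attrs, 0));
        System.out.println(orCdata(attrs.getType(0)));
    }
}
```

Either guard would avoid the exception: the first normalizes the value before it reaches SAX consumers, the second makes the consumer tolerant of null.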
@ndru83 I can accept that with respect to the SAX parser implementation. But the sample code above specifically refers to XMLValidator, for which null is specified as a value to return.
So: I would be happy to change the return value for the Woodstox SAX reader implementation, but the reproduction as-is unfortunately does not exercise that code path.
I'm closing this issue as XMLStreamReader.getAttributeType(int) also doesn't seem to explicitly prohibit implementations from returning a null value for unknown attribute types.
I've filed a bug report on the JDK side instead to address the missing null check there. (JDK-8202426)
Thank you for filing the JDK bug!
Hi @ndru83. I have no way to post anything on bugs.openjdk.java.net. I see that the fix is included only in Java 11. I tested it on a new Java 11 build and it works.
Are you able to ask whether the fix could be backported to older Java versions?
Hi @cowtowncoder
How bad would it be to just return "CDATA".intern() in getAttributeType(int index) of com.ctc.wstx.msv.GenericMsvValidator?
I'm also having problems because of that bug in Java. I see it was already fixed in https://bugs.openjdk.java.net/browse/JDK-8202426, but there is no backport to Java < 11. It was fixed by treating nulls as CDATA.
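Until a backport or a Woodstox-side option exists, one possible workaround is to wrap the reader before handing it to StAXSource. The sketch below relies only on the standard javax.xml.stream.util.StreamReaderDelegate; CdataDefaultingReader, nullTypeReader and firstAttributeType are hypothetical names for this example, and the null-returning reader merely stands in for the validating Woodstox reader, so this has not been verified against Woodstox itself.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

// Hypothetical wrapper: reports "CDATA" whenever the wrapped reader reports null.
final class CdataDefaultingReader extends StreamReaderDelegate {
    CdataDefaultingReader(XMLStreamReader delegate) {
        super(delegate);
    }

    @Override
    public String getAttributeType(int index) {
        String type = super.getAttributeType(index);
        return type != null ? type : "CDATA";
    }

    // Demo stand-in (assumption): a reader that always reports null types,
    // mimicking the behaviour described in this issue.
    static XMLStreamReader nullTypeReader(String xml) throws XMLStreamException {
        XMLStreamReader base = XMLInputFactory.newFactory()
                .createXMLStreamReader(new StringReader(xml));
        return new StreamReaderDelegate(base) {
            @Override
            public String getAttributeType(int index) {
                return null;
            }
        };
    }

    // Advances to the first start element and returns the type of its first attribute.
    static String firstAttributeType(String xml) {
        try {
            XMLStreamReader reader = new CdataDefaultingReader(nullTypeReader(xml));
            while (reader.next() != XMLStreamConstants.START_ELEMENT) {
                // skip events that precede the root element
            }
            return reader.getAttributeType(0);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstAttributeType("<test attr=\"value\"/>"));
    }
}
```

In the reproduction code above, this would presumably mean passing new StAXSource(new CdataDefaultingReader(xmlStreamReader)) to the transformer instead of the raw reader.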
@mkozioro I filed a request for enhancement to have the fixes for JDK-8202426 and JDK-8201138 backported to Java 8. I wouldn't hold my breath though: public updates for Java 8 will end some time this September with the release of Java 11 LTS, so I wouldn't be surprised if they decided against it. :(
@ndru83 Thanks for all your help. We will see. Maybe we will be lucky :)
@ndru83 In theory, Java 8 will be still updated. https://blogs.oracle.com/java-platform-group/extension-of-oracle-java-se-8-public-updates-and-java-web-start-support
@mkozioro I would not be against this as long as it would be a new property to set, so as not to change existing behavior. Filing a PR would be great as I am swamped with other work right now, but hoping to get new Woodstox release out relatively soon, for other fixes.