woodstox
xml:lang attribute not handled correctly by XMLStreamReader2 using DTD validation
I'm using DTD schema validation with XMLStreamReader2:
XMLInputFactory2 inFactory = (XMLInputFactory2)XMLInputFactory2.newInstance();
inFactory.setProperty(XMLInputFactory2.IS_VALIDATING, Boolean.TRUE);
inFactory.setProperty(XMLInputFactory2.SUPPORT_DTD, Boolean.TRUE);
inFactory.setProperty(XMLInputFactory2.IS_NAMESPACE_AWARE, Boolean.FALSE);
String dtdPath = path + ident;
InputStream is = Utility.findResource(dtdPath);
if (is == null) {
    throw new XMLStreamException("Unable to access DTD at path " + dtdPath);
}
final byte[] dtddata = Utility.readFully(is);
XMLResolver resolver = new XMLResolver() {
    @Override
    public Object resolveEntity(String publicID, String systemID, String baseURI, String namespace) {
        return new ByteArrayInputStream(dtddata);
    }
};
inFactory.setXMLResolver(resolver);
XMLStreamReader2 reader = (XMLStreamReader2)inFactory.createXMLStreamReader(in);
// TODO: reuse schemas, since they're guaranteed threadsafe and immutable - use context to store?
XMLValidationSchemaFactory schemaFactory =
    XMLValidationSchemaFactory.newInstance(XMLValidationSchema.SCHEMA_ID_DTD);
XMLValidationSchema schema = schemaFactory.createSchema(new ByteArrayInputStream(dtddata));
reader.validateAgainst(schema);
This seems to work well in general, but when I try a document which includes an xml:lang attribute I get:
com.ctc.wstx.exc.WstxValidationException: Element <FreeFormText> has no attribute "xml:lang"
at [row,col {unknown-source}]: [8,11]
    at com.ctc.wstx.exc.WstxValidationException.create(WstxValidationException.java:50)
    at com.ctc.wstx.sr.StreamScanner.reportValidationProblem(StreamScanner.java:580)
    at com.ctc.wstx.sr.ValidatingStreamReader.reportValidationProblem(ValidatingStreamReader.java:383)
    at com.ctc.wstx.sr.InputElementStack.reportProblem(InputElementStack.java:849)
    at com.ctc.wstx.dtd.DTDValidatorBase.doReportValidationProblem(DTDValidatorBase.java:497)
    at com.ctc.wstx.dtd.DTDValidatorBase.reportValidationProblem(DTDValidatorBase.java:479)
    at com.ctc.wstx.dtd.DTDValidator.validateAttribute(DTDValidator.java:251)
    at org.codehaus.stax2.validation.ValidatorPair.validateAttribute(ValidatorPair.java:78)
,,,
This occurs even though the DTD has the attribute specifically defined for that element:
<!ELEMENT FreeFormText
( #PCDATA ) >
<!ATTLIST FreeFormText
xml:lang CDATA #IMPLIED >
The strangest part is that it appears to work correctly when using XMLStreamReader and XMLResolver (with a DOCTYPE in the document that references the DTD) - it only fails when using the reader.validateAgainst(schema) approach to set the validation DTD directly.
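As an independent sanity check of the DTD declaration itself, the sketch below validates the same element and attribute declarations with the JDK's built-in DOM parser instead of Woodstox. The class name DtdSanityCheck and its validate method are hypothetical, and the DTD is embedded as an internal subset for convenience. It does not go through reader.validateAgainst(schema), so it cannot reproduce the reported failure; it only shows whether another validating parser accepts an xml:lang attribute declared this way, with namespace awareness turned off as in the report.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of Woodstox or of the reporter's code.
class DtdSanityCheck {

    // Same declarations as in the report, embedded as an internal DTD subset.
    static final String XML =
        "<!DOCTYPE FreeFormText [\n"
        + "<!ELEMENT FreeFormText (#PCDATA) >\n"
        + "<!ATTLIST FreeFormText xml:lang CDATA #IMPLIED >\n"
        + "]>\n"
        + "<FreeFormText xml:lang='en-US'>foobar</FreeFormText>";

    /** Parses the document with DTD validation enabled and returns any validation messages. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);       // enable DTD validation
        factory.setNamespaceAware(false);  // mirrors IS_NAMESPACE_AWARE = FALSE in the report

        final List<String> problems = new ArrayList<>();
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { problems.add(e.getMessage()); }
            @Override public void error(SAXParseException e) { problems.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Prints the collected validation messages, if any.
        System.out.println("validation problems: " + validate(XML));
    }
}
```

If this check reports no problems, the DTD itself is unlikely to be the culprit, which would point back at how the schema is applied through validateAgainst.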
@dsosnoski For some reason I seem unable to reproduce the issue with a simplified version:
public void testFullValidationIssue23() throws XMLStreamException
{
    final String DTD = "<!ELEMENT FreeFormText (#PCDATA) >\n"
        +"<!ATTLIST FreeFormText xml:lang CDATA #IMPLIED >\n";
    String XML = "<FreeFormText xml:lang='en-US'>foobar</FreeFormText>";
    XMLInputFactory f = getInputFactory();
    XMLValidationSchemaFactory schemaFactory =
        XMLValidationSchemaFactory.newInstance(XMLValidationSchema.SCHEMA_ID_DTD);
    XMLValidationSchema schema = schemaFactory.createSchema(new StringReader(DTD));
    XMLStreamReader2 sr = (XMLStreamReader2)f.createXMLStreamReader(
        new StringReader(XML));
    sr.validateAgainst(schema);
    while (sr.next() != END_DOCUMENT) {
    }
    sr.close();
}
This does not seem to be due to the simplification; I also tried registering an XMLResolver, but that made no difference.
This is with master, which is the same as 5.0.3.
So I may need an actual reproduction here...