xml:lang attribute not handled correctly by XMLStreamReader2 using DTD validation

Open dsosnoski opened this issue 7 years ago • 1 comments

I'm using DTD schema validation with XMLStreamReader2:

        XMLInputFactory2 inFactory = (XMLInputFactory2)XMLInputFactory2.newInstance();
        inFactory.setProperty(XMLInputFactory2.IS_VALIDATING, Boolean.TRUE);
        inFactory.setProperty(XMLInputFactory2.SUPPORT_DTD, Boolean.TRUE);
        inFactory.setProperty(XMLInputFactory2.IS_NAMESPACE_AWARE, Boolean.FALSE);
        String dtdPath = path + ident;
        InputStream is = Utility.findResource(dtdPath);
        if (is == null) {
            throw new XMLStreamException("Unable to access DTD at path " + dtdPath);
        }
        final byte[] dtddata = Utility.readFully(is);
        XMLResolver resolver = new XMLResolver() {
            @Override
            public Object resolveEntity(String publicID, String systemID, String baseURI, String namespace) {
                return new ByteArrayInputStream(dtddata);
            }
        };
        inFactory.setXMLResolver(resolver);
        XMLStreamReader2 reader = (XMLStreamReader2)inFactory.createXMLStreamReader(in);
        // TODO: reuse schemas, since they're guaranteed threadsafe and immutable - use context to store?
        XMLValidationSchemaFactory schemaFactory =
            XMLValidationSchemaFactory.newInstance(XMLValidationSchema.SCHEMA_ID_DTD);
        XMLValidationSchema schema = schemaFactory.createSchema(new ByteArrayInputStream(dtddata));
        reader.validateAgainst(schema);

This seems to work well in general, but when I try a document which includes an xml:lang attribute I get:

com.ctc.wstx.exc.WstxValidationException: Element <FreeFormText> has no attribute "xml:lang"
  at [row,col {unknown-source}]: [8,11]
    at com.ctc.wstx.exc.WstxValidationException.create(WstxValidationException.java:50)
    at com.ctc.wstx.sr.StreamScanner.reportValidationProblem(StreamScanner.java:580)
    at com.ctc.wstx.sr.ValidatingStreamReader.reportValidationProblem(ValidatingStreamReader.java:383)
    at com.ctc.wstx.sr.InputElementStack.reportProblem(InputElementStack.java:849)
    at com.ctc.wstx.dtd.DTDValidatorBase.doReportValidationProblem(DTDValidatorBase.java:497)
    at com.ctc.wstx.dtd.DTDValidatorBase.reportValidationProblem(DTDValidatorBase.java:479)
    at com.ctc.wstx.dtd.DTDValidator.validateAttribute(DTDValidator.java:251)
    at org.codehaus.stax2.validation.ValidatorPair.validateAttribute(ValidatorPair.java:78)
    ...

This occurs even though the DTD has the attribute specifically defined for that element:

<!ELEMENT FreeFormText
           ( #PCDATA ) >
<!ATTLIST FreeFormText
           xml:lang CDATA #IMPLIED >

The strangest part is that it appears to work correctly when using XMLStreamReader and XMLResolver (with a DOCTYPE in the document that references the DTD) - it only fails when using the reader.validateAgainst(schema) approach to set the validation DTD directly.
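As a cross-check that does not involve Woodstox at all, the DTD fragment above can be validated with the JDK's built-in SAX parser, using an internal DTD subset in the DOCTYPE. This is only a sketch under stated assumptions: the class name `XmlLangDtdCheck` and the sample document are made up for illustration, and it exercises the JDK parser rather than `XMLStreamReader2.validateAgainst(schema)`. It only confirms that the `xml:lang` declaration itself is valid DTD and says nothing about the Woodstox code path.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of Woodstox or the original report.
class XmlLangDtdCheck {
    // The DTD declarations from the issue, embedded as an internal subset.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE FreeFormText [\n"
        + "<!ELEMENT FreeFormText (#PCDATA)>\n"
        + "<!ATTLIST FreeFormText xml:lang CDATA #IMPLIED>\n"
        + "]>\n"
        + "<FreeFormText xml:lang=\"en-US\">foobar</FreeFormText>";

    // Parses the document with DTD validation enabled and returns the
    // validity errors reported by the parser (empty if the document is valid).
    static List<String> validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);       // DTD validation
        factory.setNamespaceAware(false);  // matches the setting in the report
        final List<String> errors = new ArrayList<>();
        SAXParser parser = factory.newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validity errors arrive here; collect them instead of ignoring them.
                errors.add(e.getMessage());
            }
        });
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("validation errors: " + validate(XML));
    }
}
```

If the ATTLIST line is removed from the internal subset, the same call should report a validity error for the undeclared attribute, which is the behaviour the Woodstox exception above would be expected to show only when the declaration is genuinely missing.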

dsosnoski, Mar 10 '17 19:03

@dsosnoski For some reason I am unable to reproduce the issue with a simplified version:

    public void testFullValidationIssue23() throws XMLStreamException
    {
        final String DTD = "<!ELEMENT FreeFormText (#PCDATA) >\n"
                +"<!ATTLIST FreeFormText  xml:lang CDATA #IMPLIED >\n";
        String XML = "<FreeFormText xml:lang='en-US'>foobar</FreeFormText>";
        XMLInputFactory f = getInputFactory();

        XMLValidationSchemaFactory schemaFactory =
                XMLValidationSchemaFactory.newInstance(XMLValidationSchema.SCHEMA_ID_DTD);
        XMLValidationSchema schema = schemaFactory.createSchema(new StringReader(DTD));
        XMLStreamReader2 sr = (XMLStreamReader2)f.createXMLStreamReader(
                new StringReader(XML));

        sr.validateAgainst(schema);
        while (sr.next() != END_DOCUMENT) {
        }
        sr.close();
    }

This does not seem to be due to the simplification; I also tried registering an XMLResolver, but that made no difference. This is with master, which is the same as 5.0.3.

So I may need an actual reproduction here.

cowtowncoder, Mar 24 '17 20:03