xmlutil

Error decoding XML with entity references on Android when using ElementSerializer

Open micwallace opened this issue 3 years ago • 3 comments

I am not able to decode XML that contains entity references on Android.

val xml = XML {
    repairNamespaces = true
    recommended()
}

val string = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<StringWithMarkup xmlns=\"https://pubchem.ncbi.nlm.nih.gov/pug_view\">\n" +
                "    <String>Chloroacetic acid, &gt;=99% &lt; 100%</String>\n" +
                "</StringWithMarkup>"

val doc: Element = xml.decodeFromString(ElementSerializer, string)

Note that the example contains both &lt; and &gt; entities. &gt; can be replaced with a literal > before parsing, as the xmlutil tests currently do, but &lt; cannot, because a literal < in text content would make the XML invalid.
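The asymmetry between the two entities can be checked with the JDK's built-in DOM parser. This is a minimal sketch, assuming a JVM or Android runtime where javax.xml.parsers is available; the helper name parseText is made up for illustration and is not part of xmlutil.

```kotlin
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource
import org.xml.sax.SAXException

// Hypothetical helper: returns the root element's text content,
// or null if the input is not well-formed XML.
fun parseText(xml: String): String? = try {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    builder.parse(InputSource(StringReader(xml))).documentElement.textContent
} catch (e: SAXException) {
    null
}

fun main() {
    val original = "<String>Chloroacetic acid, &gt;=99% &lt; 100%</String>"

    // Replacing &gt; with a literal '>' keeps the document well-formed.
    val gtReplaced = original.replace("&gt;", ">")
    // Replacing &lt; with a literal '<' starts what looks like a tag and breaks it.
    val ltReplaced = original.replace("&lt;", "<")

    println(parseText(original))   // Chloroacetic acid, >=99% < 100%
    println(parseText(gtReplaced)) // Chloroacetic acid, >=99% < 100%
    println(parseText(ltReplaced)) // null
}
```

The JDK parser may also print a fatal-error line to stderr for the third input. That comes from its default error handler and is separate from the null result.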

Stacktrace as follows:

Creating entity references is not supported (or incorrect) in most browsers
java.lang.UnsupportedOperationException: Creating entity references is not supported (or incorrect) in most browsers
	at nl.adaptivity.xmlutil.DomWriter.entityRef(DomWriter.kt:208)
	at nl.adaptivity.xmlutil.EventType$ENTITY_REF.writeEvent(EventType.kt:162)
	at nl.adaptivity.xmlutil.XmlReaderUtil__XmlReaderKt.writeCurrent(XmlReader.kt:466)
	at nl.adaptivity.xmlutil.XmlReaderUtil.writeCurrent(Unknown Source)
	at nl.adaptivity.xmlutil.XmlWriterUtil__XmlWriterKt.writeElementContent(XmlWriter.kt:456)
	at nl.adaptivity.xmlutil.XmlWriterUtil.writeElementContent(Unknown Source)
	at nl.adaptivity.xmlutil.XmlWriterUtil__XmlWriterKt.writeElementContent(XmlWriter.kt:461)
	at nl.adaptivity.xmlutil.XmlWriterUtil.writeElementContent(Unknown Source)
	at nl.adaptivity.xmlutil.XmlWriterUtil__XmlWriterKt.writeElement(XmlWriter.kt:441)
	at nl.adaptivity.xmlutil.XmlWriterUtil.writeElement(Unknown Source)
	at nl.adaptivity.xmlutil.serialization.ElementSerializer.deserializeInput(ElementSerializer.kt:61)
	at nl.adaptivity.xmlutil.serialization.ElementSerializer.deserialize(ElementSerializer.kt:51)
	at nl.adaptivity.xmlutil.serialization.ElementSerializer.deserialize(ElementSerializer.kt:40)
	at nl.adaptivity.xmlutil.serialization.XmlDecoderBase$XmlDecoder.decodeSerializableValue(XMLDecoder.kt:195)
	at nl.adaptivity.xmlutil.serialization.XML.decodeFromReader(XML.kt:405)
	at nl.adaptivity.xmlutil.serialization.XML.decodeFromReader$default(XML.kt:380)
	at nl.adaptivity.xmlutil.serialization.XML.decodeFromString(XML.kt:346)
	at library.XmlParsingTest.deserializeXml(XmlParsingTest.kt:26)

micwallace, May 16 '22 04:05

The tests normalize mainly to handle differences in spacing, and the fact that > does not have to be written as an entity. The problem here is in the implementation of EventType.writeEvent for entity references. I've fixed it in dev.

pdvrieze, May 17 '22 14:05

Legend! I ended up finding and applying a similar fix today but wasn't sure if it would mess other things up, as I removed the exception and replaced it with the text implementation. Thanks for the clarification.

micwallace, May 17 '22 15:05

The old code was always incorrect: the parameter for an entity reference is the entity's name, not its text content, but the code was reading the text content and treating it as if it were the name. Writing entities has the problem that you need to define them somehow (except for the handful of standard ones, which are already handled by the regular text writing). I first thought the problem was in the serialization or the parser, but it was actually in the event type.
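The name/text distinction can be illustrated with a small self-contained model. This is only a sketch of the idea described above, not the actual change made in xmlutil: TextEvent, EntityRefEvent, escapeText, writeBuggy and writeFixed are all hypothetical names, and the "fixed" variant simply assumes that writing the resolved text through the normal escaping path is acceptable for the standard entities.

```kotlin
// Hypothetical, simplified event model; these are NOT xmlutil's real types.
sealed interface Event
data class TextEvent(val text: String) : Event
// name: what appears between '&' and ';' in the source, e.g. "lt".
// text: what the entity resolves to, e.g. "<".
data class EntityRefEvent(val name: String, val text: String) : Event

// Regular text writing: escapes the characters that must not appear raw.
fun escapeText(s: String): String =
    s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

// Buggy variant: uses the resolved text where the entity name is expected.
fun writeBuggy(events: List<Event>): String = events.joinToString("") { e ->
    when (e) {
        is TextEvent -> escapeText(e.text)
        is EntityRefEvent -> "&${e.text};"
    }
}

// Assumed fix: treat the resolved text as ordinary text, so the standard
// entities are re-created by the normal escaping step.
fun writeFixed(events: List<Event>): String = events.joinToString("") { e ->
    when (e) {
        is TextEvent -> escapeText(e.text)
        is EntityRefEvent -> escapeText(e.text)
    }
}

fun main() {
    val events: List<Event> = listOf(
        TextEvent("Chloroacetic acid, "),
        EntityRefEvent("gt", ">"),
        TextEvent("=99% "),
        EntityRefEvent("lt", "<"),
        TextEvent(" 100%"),
    )
    println(writeBuggy(events)) // Chloroacetic acid, &>;=99% &<; 100%
    println(writeFixed(events)) // Chloroacetic acid, &gt;=99% &lt; 100%
}
```

In this model a custom entity (one that is not among the predefined ones) would be expanded to its text on output. Preserving it as a reference would require the writer to know how the entity is defined, which is the difficulty mentioned above.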

pdvrieze, May 18 '22 07:05