How to parse a tag which can have multiple names in a single property

Open sdipendra opened this issue 2 years ago • 4 comments

How can I parse a tag that can appear under more than one name?

For example, take a tag named "link:Stat".

Some of my XML documents use the fully qualified name, <link:Stat></link:Stat>, while others use just <Stat></Stat>, without the namespace.

No document has both formats.

I want them to be mapped to the same single property: val stat: Stat

How can I achieve this? Thanks!

sdipendra, Jun 23 '23 02:06

There are three approaches. The first is to add a custom handler for unknown content in the policy; this is currently broken, but I've fixed it. The second is to put a filter on the parser that just remaps tags. The third, which suits your case, is to override the mechanism by which the policy maps Kotlin types to tag names. This is global, but it lets you use the same serializer with a different policy to parse either format.
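
As a rough illustration of the parser-filter idea only, here is a minimal sketch that uses the JDK's StAX `StreamReaderDelegate` instead of xmlutil's own reader types, so nothing in it is xmlutil API. The class name `StatRemappingReader` and the helper `elementNames` are made up for this example; the namespace and the "Stat" tag come from the question above. The filter reports a namespace-less <Stat> element as if it had been written as link:Stat, so whatever consumes the reader sees one name for both document styles.

```kotlin
import java.io.StringReader
import javax.xml.namespace.QName
import javax.xml.stream.XMLInputFactory
import javax.xml.stream.XMLStreamConstants
import javax.xml.stream.XMLStreamReader
import javax.xml.stream.util.StreamReaderDelegate

const val LINK_NS = "http://www.kodepad.com/xml/link"

// Hypothetical filter (not xmlutil API): wraps a StAX reader and reports
// namespace-less <Stat> elements as link:Stat.
class StatRemappingReader(delegate: XMLStreamReader) : StreamReaderDelegate(delegate) {
    // True only on the start or end tag of a "Stat" element without a namespace.
    private fun isBareStat(): Boolean =
        (isStartElement() || isEndElement()) &&
            super.getLocalName() == "Stat" &&
            super.getNamespaceURI().isNullOrEmpty()

    override fun getNamespaceURI(): String? =
        if (isBareStat()) LINK_NS else super.getNamespaceURI()

    override fun getPrefix(): String? =
        if (isBareStat()) "link" else super.getPrefix()

    override fun getName(): QName =
        if (isBareStat()) QName(LINK_NS, "Stat", "link") else super.getName()
}

// Collects the element names that a consumer behind the filter would see.
fun elementNames(doc: String): List<QName> {
    val raw = XMLInputFactory.newInstance().createXMLStreamReader(StringReader(doc))
    val reader = StatRemappingReader(raw)
    val names = mutableListOf<QName>()
    while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT) names += reader.getName()
    }
    return names
}

fun main() {
    val doc = """<equipment:device xmlns:equipment="http://www.kodepad.com/xml/equipment" xmlns:link="http://www.kodepad.com/xml/link"><Stat>WORKING</Stat></equipment:device>"""
    // The bare <Stat> element is listed under the link namespace.
    println(elementNames(doc))
}
```

The same shape should carry over to any pull parser that can be wrapped: the filter changes only what is reported for the element name and leaves every other event untouched.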

pdvrieze, Jun 23 '23 10:06

For the third approach, I'm trying to override the policy behaviour, but I can't identify the method I should override.

I've created a failing test setup for this; it would be great if you could point me to the policy method that I should override.

In the current setup, the first test case (with prefix) passes and the second test case (without prefix) fails.

package com.kodepad.xml

import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import nl.adaptivity.xmlutil.ExperimentalXmlUtilApi
import nl.adaptivity.xmlutil.serialization.DefaultXmlSerializationPolicy
import nl.adaptivity.xmlutil.serialization.XML
import nl.adaptivity.xmlutil.serialization.XmlElement
import nl.adaptivity.xmlutil.serialization.XmlSerialName
import nl.adaptivity.xmlutil.serialization.XmlSerializationPolicy
import nl.adaptivity.xmlutil.serialization.XmlValue
import org.junit.jupiter.api.Test
import org.slf4j.LoggerFactory
import kotlin.test.assertEquals

@OptIn(ExperimentalXmlUtilApi::class)
internal class XMLUtilFailingTest {
    @Serializable
    @XmlSerialName(
        namespace = "http://www.kodepad.com/xml/equipment",
        prefix = "equipment",
        value = "device",
    )
    data class Device(
        @XmlElement(value = true) val stat: Stat?,
    )

    @Serializable
    @XmlSerialName(
        namespace = "http://www.kodepad.com/xml/link",
        prefix = "link",
        value = "Stat",
    )
    data class Stat(
        @XmlValue val value: String,
    )

    class XmlSerializationPolicyProxy(xmlSerializationPolicy: XmlSerializationPolicy) :
        XmlSerializationPolicy by xmlSerializationPolicy {
        // todo: Override method to map "Stat" to "link:Stat"
    }

    companion object {
        private val log = LoggerFactory.getLogger(this::class.java.declaringClass.name)

        private val expectedValue = Device(Stat("WORKING"))
    }

    private val xml = XML {
        this.policy = XmlSerializationPolicyProxy(
            DefaultXmlSerializationPolicy(
                false, encodeDefault = XmlSerializationPolicy.XmlEncodeDefault.NEVER
            )
        )
    }

    @Test
    fun `parse xml with prefix`() {
        val xmlString =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + "<equipment:device xmlns:equipment=\"http://www.kodepad.com/xml/equipment\"\n" + "                  xmlns:link=\"http://www.kodepad.com/xml/link\">\n" + "    <link:Stat>WORKING</link:Stat>\n" + "</equipment:device>\n"

        val device = xml.decodeFromString<Device>(xmlString)
        log.info("device: $device")

        assertEquals(expectedValue, device)
    }

    @Test
    fun `parse xml without prefix`() {
        val xmlString =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + "<equipment:device xmlns:equipment=\"http://www.kodepad.com/xml/equipment\"\n" + "                  xmlns:link=\"http://www.kodepad.com/xml/link\">\n" + "    <Stat>WORKING</Stat>\n" + "</equipment:device>\n"

        val device = xml.decodeFromString<Device>(xmlString)
        log.info("device: $device")

        assertEquals(expectedValue, device)
    }
}

Included dependencies:

plugins {
    kotlin("jvm") version "1.8.20"
    kotlin("plugin.serialization") version "1.8.20"
}

dependencies {
    // Serialization
    implementation("org.jetbrains.kotlinx:kotlinx-serialization-json:1.5.0")
    implementation("io.github.pdvrieze.xmlutil:core:0.86.0")
    implementation("io.github.pdvrieze.xmlutil:serialization:0.86.0")
}

sdipendra, Jun 23 '23 19:06

Unfortunately, there is a bug in the handling (now fixed in dev). The method to override is handleUnknownContentRecovering. To see how this works, look at: https://github.com/pdvrieze/xmlutil/blob/638c85bb89b93de33c9c6a3282a08cd6cee815c7/serialization/src/commonTest/kotlin/nl/adaptivity/xml/serialization/RecoveryTest.kt#L63-L82

and:

https://github.com/pdvrieze/xmlutil/blob/638c85bb89b93de33c9c6a3282a08cd6cee815c7/serialization/src/commonMain/kotlin/nl/adaptivity/xmlutil/serialization/XmlSerializationPolicy.kt#L233-L260

But please note that this is broken in master: the helper function is new and, more significantly, recovery for elements is broken (it fails to read the end tag).

pdvrieze, Jun 24 '23 12:06

Checked on dev. This works for my use case. Thank you.

One suggestion, though: instead of having a specific method for handling the null namespace, wouldn't it be better to have a method that can map a parsed QName to some other QName? That would cover the null-namespace case and many other use cases as well.
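
To make the suggestion concrete, here is a minimal sketch of what such a hook could look like. The `QNameMapper` interface is purely hypothetical and does not exist in xmlutil; the sketch uses the JDK's `javax.xml.namespace.QName` so that it stands alone. The mapper for my case rewrites a namespace-less "Stat" to the link namespace and passes every other name through unchanged.

```kotlin
import javax.xml.namespace.QName

// Hypothetical hook (not part of xmlutil): maps the name read from the
// document to the name the deserializer should match against.
fun interface QNameMapper {
    fun map(parsed: QName): QName
}

const val LINK_NS = "http://www.kodepad.com/xml/link"

// Treats a "Stat" element without a namespace as link:Stat; leaves all other names alone.
val statMapper = QNameMapper { parsed ->
    if (parsed.namespaceURI.isEmpty() && parsed.localPart == "Stat") {
        QName(LINK_NS, parsed.localPart, "link")
    } else {
        parsed
    }
}

fun main() {
    // Both spellings resolve to the same qualified name.
    println(statMapper.map(QName("Stat")))
    println(statMapper.map(QName(LINK_NS, "Stat", "link")))
}
```

A general mapper like this would also cover renamed tags or documents that use an older namespace URI, not just the missing-namespace case.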

sdipendra, Jun 25 '23 22:06