Optionally expose indexes for interleaved elements

Open liambutler-lawrence opened this issue 2 years ago • 4 comments

This PR adds a new public type, XMLPositionIndexed. This type can be used within a decoding tree to retain the indexing information of the private type KeyedStorage.

Why is this necessary?

This library already provides a way to retrieve the text value of an XML element, by specifying a coding key of an empty string.

However, in many XML documents, sub-nodes are nested at meaningful positions within the text value. For example, ARM's machine-readable XML documentation contains the following node:

<pstext>
    constant bits(16) <anchor>ASID_NONE</anchor> = <a>Zeros</a>();
</pstext>

In the current version of this library, this node's elements can be parsed into 3 arrays:

  • the anchor sub-nodes in order,
  • the a sub-nodes in order,
  • the value text segments in order

However, the relative positioning of these elements to each other is irretrievably lost.
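To make the loss concrete, here is a minimal sketch in plain Swift that does not use XMLCoder at all. The `Node` enum and the `splitByKind` helper are hypothetical stand-ins for what per-key decoding effectively does; they are not library API. Two differently ordered inputs collapse into identical arrays:

```swift
// Hypothetical stand-in for the children of a mixed-content element.
enum Node: Equatable {
    case text(String)
    case anchor(String)
    case a(String)
}

// Mimics decoding each key into its own array: order is kept within a kind,
// but the position of one kind relative to another is discarded.
func splitByKind(_ nodes: [Node]) -> (texts: [String], anchors: [String], links: [String]) {
    var texts: [String] = []
    var anchors: [String] = []
    var links: [String] = []
    for node in nodes {
        switch node {
        case .text(let s): texts.append(s)
        case .anchor(let s): anchors.append(s)
        case .a(let s): links.append(s)
        }
    }
    return (texts, anchors, links)
}

let original: [Node] = [
    .text("constant bits(16) "), .anchor("ASID_NONE"), .text(" = "), .a("Zeros"), .text("();"),
]
let shuffled: [Node] = [
    .a("Zeros"), .text("constant bits(16) "), .text(" = "), .text("();"), .anchor("ASID_NONE"),
]

let first = splitByKind(original)
let second = splitByKind(shuffled)

// Both inputs yield the same three arrays, so the interleaving cannot be recovered.
let indistinguishable = first.texts == second.texts
    && first.anchors == second.anchors
    && first.links == second.links
print(indistinguishable) // prints "true"
```

Carrying a per-element position alongside each value is what lets the two inputs be told apart.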

With the new XMLPositionIndexed type, this relative positioning information is retained. Our Decodable model for the pstext node above can now look like this:

struct Text: Decodable {

    let valueSegments: [XMLPositionIndexed<String>]
    let links: [XMLPositionIndexed<Link>]
    let anchors: [XMLPositionIndexed<Link>]

    enum CodingKeys: String, CodingKey {
        case valueSegments = ""
        case links = "a"
        case anchors = "anchor"
    }

    struct Link: Decodable {
        let value: String

        enum CodingKeys: String, CodingKey {
            case value = ""
        }
    }
}

Each XMLPositionIndexed object contains the original value as well as an integer index that can be used in post-processing. As an example, we can easily merge all 3 arrays back together:

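// Pair each value with its index, sort by index to restore document order,
// then join the pieces back into a single string.
let mergedSegments = (
    text.valueSegments.map { ($0.index, $0.value) }
        + text.links.map { ($0.index, $0.value.value) }
        + text.anchors.map { ($0.index, $0.value.value) }
).sorted { $0.0 < $1.0 }.map { $0.1 }.joined()

// mergedSegments = "constant bits(16) ASID_NONE = Zeros();"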
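The merge can also be tried without the library. The following is a self-contained sketch under stated assumptions: `PositionIndexed` is a hypothetical stand-in for the proposed `XMLPositionIndexed`, the index values are illustrative rather than taken from a real decode, and whitespace inside the text segments is assumed to be preserved.

```swift
// Hypothetical stand-in for the XMLPositionIndexed type described in this PR.
struct PositionIndexed<Value> {
    let index: Int
    let value: Value
}

// Assumed decoding result for the <pstext> node; the indexes are illustrative.
let valueSegments = [
    PositionIndexed(index: 0, value: "constant bits(16) "),
    PositionIndexed(index: 2, value: " = "),
    PositionIndexed(index: 4, value: "();"),
]
let anchors = [PositionIndexed(index: 1, value: "ASID_NONE")]
let links = [PositionIndexed(index: 3, value: "Zeros")]

// Concatenate all three arrays, sort by index to restore document order, then join.
let merged = (valueSegments + anchors + links)
    .sorted { $0.index < $1.index }
    .map { $0.value }
    .joined()

print(merged) // prints "constant bits(16) ASID_NONE = Zeros();"
```

Sorting on the shared index is enough here because every element, whatever its kind, is assumed to draw its index from the same per-parent sequence.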

liambutler-lawrence avatar Sep 09 '21 20:09 liambutler-lawrence

The new files need to be added to the existing .xcodeproj for CI to pass. I know this is a chore, but we're still keeping compatibility with Carthage (for now), which does require this Xcode project cruft to exist in the repository.

MaxDesiatov avatar Sep 14 '21 20:09 MaxDesiatov

Hey everyone! I'm using this awesome library to parse the Vulkan API Registry XML, and this feature would be very much appreciated :) Consider this example:

<member optional="true">const <type>void</type>*            <name>pNext</name></member>

This line defines a member of a structure and its type. Having indices for interleaved elements would allow my code to correctly parse the underlying type of the member. Right now I'm storing those in an array of strings (which has worked correctly), since I'm not looking into any member other than pNext for now, but this would still be nice to have in the future.
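As a rough sketch of how position indexes could help with this case, the snippet below rebuilds the C declaration from fragments that arrive out of order. The `Fragment` type and the index values are hypothetical and only illustrate the idea; they are not XMLCoder API.

```swift
import Foundation

// Hypothetical model of one child of <member>, tagged with its position.
struct Fragment {
    enum Kind { case text, type, name }
    let index: Int
    let kind: Kind
    let value: String
}

// Assumed decoding result for the <member> line; arrival order is arbitrary.
let fragments = [
    Fragment(index: 3, kind: .name, value: "pNext"),
    Fragment(index: 1, kind: .type, value: "void"),
    Fragment(index: 0, kind: .text, value: "const "),
    Fragment(index: 2, kind: .text, value: "*            "),
]

// Restore document order using the position index.
let ordered = fragments.sorted { $0.index < $1.index }

// Everything except the <name> element forms the C type of the member.
let typeString = ordered
    .filter { $0.kind != .name }
    .map { $0.value }
    .joined()
    .trimmingCharacters(in: .whitespaces)
let memberName = ordered.first { $0.kind == .name }?.value ?? ""

print("\(typeString) \(memberName)") // prints "const void* pNext"
```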

smumriak avatar Jan 19 '22 09:01 smumriak

I'd be glad to merge this as soon as it has unit-test coverage. I'm not sure whether the OP has abandoned this PR, and I haven't had time to add coverage myself. Feel free to create a new PR if you're interested in pushing this forward.

MaxDesiatov avatar Jan 19 '22 11:01 MaxDesiatov

Hi @liambutler-lawrence, would you be interested in moving this PR forward? I'm cleaning up the repository, and if this PR is abandoned and outdated, I'm inclined to close it. Thanks!

MaxDesiatov avatar Jul 17 '22 16:07 MaxDesiatov

I'm closing this PR as abandoned. Please feel free to reopen it otherwise.

MaxDesiatov avatar Aug 15 '22 15:08 MaxDesiatov