$convert between XML and JSON is inconsistent

Open gabriel0316 opened this issue 3 years ago • 2 comments

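Describe the bug

When using the $convert operation, converting a resource between XML and JSON is inconsistent: a newline in a string value survives XML to JSON, but is not written as a character reference when going from JSON to XML.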

Environment

4.7.1

To Reproduce

Steps to reproduce the behavior - XML to JSON:

  1. Take the following resource as XML and pass it to the $convert operation:
    <CodeSystem xmlns="http://hl7.org/fhir">
        <name value="test-codesystem"/>
        <title value="Test Code System"/>
        <status value="active"/>
        <date value="2016-11-24"/>
        <description value="XXX&#10;&#10;XXX"/>
        <content value="complete"/>
        <concept>
            <code value="0"/>
            <display value="TEST CODE"/>
        </concept>
    </CodeSystem>
    
  2. The resulting JSON representation of the resource will be:
    {
        "resourceType": "CodeSystem",
        "name": "test-codesystem",
        "title": "Test Code System",
        "status": "active",
        "date": "2016-11-24",
        "description": "XXX\n\nXXX",
        "content": "complete",
        "concept": [
            {
                "code": "0",
                "display": "TEST CODE"
            }
        ]
    }
    

Steps to reproduce the behavior - JSON to XML:

  1. Take the following resource as JSON and pass it to the $convert operation:
    {
        "resourceType": "CodeSystem",
        "name": "test-codesystem",
        "title": "Test Code System",
        "status": "active",
        "date": "2016-11-24",
        "description": "XXX\n\nXXX",
        "content": "complete",
        "concept": [
            {
                "code": "0",
                "display": "TEST CODE"
            }
        ]
    }
    
  2. The resulting XML representation of the resource will be:
    <CodeSystem xmlns="http://hl7.org/fhir">
        <name value="test-codesystem"/>
        <title value="Test Code System"/>
        <status value="active"/>
        <date value="2016-11-24"/>
        <description value="XXX
    
    XXX"/>
        <content value="complete"/>
        <concept>
            <code value="0"/>
            <display value="TEST CODE"/>
        </concept>
    </CodeSystem>
    

Expected behavior

When converting from JSON to XML, each \n in a string value should be written as &#10; in the XML attribute. It would then be possible to convert between XML and JSON consistently.
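
To make the expected output concrete, here is a minimal sketch of the escaping this issue asks for. `AttributeEscaper` and `escapeAttributeValue` are hypothetical names used only for illustration; they are not part of the FHIR server or of any StAX API, and the sketch says nothing about how the server actually serializes XML.

```java
// Hypothetical helper, for illustration only; not part of the FHIR server.
final class AttributeEscaper {

    /**
     * Escapes a string for use inside a double-quoted XML attribute.
     * Besides the usual markup characters, tab, line feed and carriage return
     * are written as character references so that an XML parser does not
     * normalize them to spaces when reading the attribute back.
     */
    static String escapeAttributeValue(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\t': out.append("&#9;");   break;
                case '\n': out.append("&#10;");  break;
                case '\r': out.append("&#13;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String description = "XXX\n\nXXX"; // the value from the JSON input
        // prints: <description value="XXX&#10;&#10;XXX"/>
        System.out.println("<description value=\"" + escapeAttributeValue(description) + "\"/>");
    }
}
```

Note that the result of such a helper cannot simply be passed to `XMLStreamWriter.writeAttribute`, because the writer would escape the `&` again; it only shows what the serialized attribute should look like.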

gabriel0316, Apr 15 '22 09:04

https://www.w3.org/TR/REC-xml/#AVNormalize has me agreeing this could be an issue. Based on my reading of that, an XML parser would parse

<description value="XXX&#10;&#10;XXX"/>

to an attribute value of

XXX

XXX

whereas it would parse

<description value="XXX

XXX"/>

to an attribute value of

XXX XXX
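
The two parses described above can be checked with a small StAX program. This is a sketch using only the JDK's `javax.xml.stream` API; the class name `AttributeNormalizationDemo` and its helpers are made up for this example. Per the attribute-value normalization rules, each literal newline in the attribute is replaced by a space, while `&#10;` is kept as a newline.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the FHIR server code base.
final class AttributeNormalizationDemo {

    /** Parses a one-element document and returns its "value" attribute as the parser reports it. */
    static String readValueAttribute(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        // A null namespace URI means the namespace is not checked.
                        return reader.getAttributeValue(null, "value");
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    /** Makes newlines visible when printing a parsed value. */
    static String visible(String s) {
        return "[" + s.replace("\n", "\\n") + "]";
    }

    public static void main(String[] args) {
        // Newlines written as character references survive parsing.
        String escaped = "<description value=\"XXX&#10;&#10;XXX\"/>";
        // Literal newlines in the attribute are normalized to spaces.
        String literal = "<description value=\"XXX\n\nXXX\"/>";

        System.out.println("escaped: " + visible(readValueAttribute(escaped)));
        System.out.println("literal: " + visible(readValueAttribute(literal)));
    }
}
```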

We don't do any special $convert logic; we just read the resource and then write it back out using XMLStreamWriter.writeAttribute.

Unfortunately, I don't see any supported configuration for preserving the newlines. It's basically the same issue that is reported here: https://stackoverflow.com/questions/8331364/how-to-preserve-whitespace-in-attributes-when-using-xmlstreamwriter
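
For anyone who wants to see the write side in isolation, the following sketch writes an attribute containing newlines with `XMLStreamWriter.writeAttribute` and then parses the result again. The class `WriteAttributeRoundTrip` and its method names are invented for this example, and the exact output depends on which StAX implementation is on the classpath, so no expected output is shown. If the writer emits the newlines literally, as reported in this issue, the value read back contains spaces instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of the FHIR server code base.
final class WriteAttributeRoundTrip {

    /** Serializes <description value="..."> using the default StAX writer. */
    static String write(String value) {
        try {
            StringWriter buffer = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
            writer.writeStartElement("description");
            writer.writeAttribute("value", value); // escaping is left to the implementation
            writer.writeEndElement();
            writer.flush();
            writer.close();
            return buffer.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write attribute", e);
        }
    }

    /** Parses the document and returns the "value" attribute of the first element. */
    static String readValueAttribute(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getAttributeValue(null, "value");
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String original = "XXX\n\nXXX";
        String xml = write(original);
        String readBack = readValueAttribute(xml);

        System.out.println("serialized: " + xml.replace("\n", "\\n"));
        System.out.println("read back:  " + readBack.replace("\n", "\\n"));
        System.out.println("round trip preserved newlines: " + original.equals(readBack));
    }
}
```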

I'd lean toward moving this to the icebox and reporting a related issue upstream. How important is this one to you?

lmsurpre, Jul 18 '22 21:07

I added a test at https://github.com/LinuxForHealth/FHIR/pull/3816/files#diff-8ec2e5b4061497aaf09998b194bb8e294ab785f38c512b5d64119ed4940e7463R23-R38 to demonstrate the root of the issue, which has nothing to do with JSON and everything to do with XML.

lmsurpre, Jul 28 '22 14:07