Loop if XSD is recursive

Open svanteschubert opened this issue 2 years ago • 3 comments

The European e-Invoice standard uses OASIS UBL XML and UN/CEFACT CII XML. The latter is used as of its second 2016 release (D16B); you can find the XSD here: https://unece.org/DAM/cefact/xml_schemas/D16B_SCRDM__Subset__CII.zip

It contains something nasty, an element that can contain itself:

<xsd:complexType name="GroupedWorkItemType">
	<xsd:sequence>
		<xsd:element name="ID" type="udt:IDType"/>
		<xsd:element name="PrimaryClassificationCode" type="udt:CodeType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="AlternativeClassificationCode" type="udt:CodeType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="TypeCode" type="udt:CodeType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="Comment" type="udt:TextType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="TotalQuantity" type="udt:QuantityType" minOccurs="0"/>
		<xsd:element name="Index" type="udt:TextType" minOccurs="0"/>
		<xsd:element name="RequestedActionCode" type="udt:CodeType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="PriceListItemID" type="udt:IDType" minOccurs="0"/>
		<xsd:element name="ContractualLanguageCode" type="udt:CodeType" minOccurs="0"/>
		<xsd:element name="TotalCalculatedPrice" type="ram:CalculatedPriceType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ItemGroupedWorkItem" type="ram:GroupedWorkItemType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ItemBasicWorkItem" type="ram:BasicWorkItemType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ChangedRecordedStatus" type="ram:RecordedStatusType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ActualWorkItemComplexDescription" type="ram:WorkItemComplexDescriptionType" minOccurs="0" maxOccurs="unbounded"/>
		<xsd:element name="ReferencedSpecifiedBinaryFile" type="ram:SpecifiedBinaryFileType" minOccurs="0" maxOccurs="unbounded"/>
	</xsd:sequence>
</xsd:complexType>

The trick was to call the constructor only when the element does not already exist. For this reason, I added a Map<String, XsdElement> at the document root

public Map<String, XsdElement> allElements = new HashMap<>();

and replaced the two constructors with two static factory methods. A final tweak was to move the recursive init() out of the constructor (otherwise the case above would still loop) :-)

    public static XsdElement newXsdElement(XSElementDeclaration element, XsdDocument parent) {
        // Build the lookup key: "{namespace}name", or the bare name if there is no namespace.
        String ns = element.getNamespace();
        String name = element.getName();
        if (ns != null && !ns.isEmpty()) {
            name = "{" + ns + "}" + name;
        }
        // Reuse the element if one was already created for this key.
        if (parent.allElements.containsKey(name)) {
            return parent.allElements.get(name);
        }
        // Register before init(), so that a self-reference finds the element in the map.
        XsdElement xsdElement = new XsdElement(element, parent);
        parent.allElements.put(name, xsdElement);
        xsdElement.init();
        return xsdElement;
    }

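    public static XsdElement newXsdElement(XSParticle elementDefinition, XsdElement parent) {
        // Cast to the public interface rather than the Xerces implementation class.
        XSElementDeclaration element = (XSElementDeclaration) elementDefinition.getTerm();
        // Build the lookup key: "{namespace}name", or the bare name if there is no namespace.
        String ns = element.getNamespace();
        String name = element.getName();
        if (ns != null && !ns.isEmpty()) {
            name = "{" + ns + "}" + name;
        }
        // Reuse the element if one was already created for this key.
        if (parent.document.allElements.containsKey(name)) {
            return parent.document.allElements.get(name);
        }
        // Register before init(), so that a self-reference finds the element in the map.
        XsdElement xsdElement = new XsdElement(elementDefinition, parent);
        parent.document.allElements.put(name, xsdElement);
        xsdElement.init();
        return xsdElement;
    }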
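
To make the ordering (create, register, then init) easier to see in isolation, here is a minimal self-contained sketch. It does not use the Xerces API or the real xsd-compare classes: `TypeDef`, `Node` and `RecursiveSchemaDemo` are hypothetical stand-ins, and the cache is keyed by type name purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RecursiveSchemaDemo {

    // Hypothetical stand-in for an XSD type: a name plus the types of its child elements.
    static final class TypeDef {
        final String name;
        final List<TypeDef> childTypes = new ArrayList<>();

        TypeDef(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in for XsdElement: the constructor itself never recurses.
    static final class Node {
        final TypeDef type;
        final List<Node> children = new ArrayList<>();

        private Node(TypeDef type) {
            this.type = type;
        }

        // Factory: return the cached node, or create, register and only then initialise it.
        static Node of(TypeDef type, Map<String, Node> cache) {
            Node existing = cache.get(type.name);
            if (existing != null) {
                return existing;
            }
            Node node = new Node(type);
            cache.put(type.name, node); // registered first, so a self-reference hits the cache
            node.init(cache);
            return node;
        }

        private void init(Map<String, Node> cache) {
            for (TypeDef child : type.childTypes) {
                children.add(Node.of(child, cache));
            }
        }
    }

    public static void main(String[] args) {
        TypeDef grouped = new TypeDef("GroupedWorkItemType");
        TypeDef basic = new TypeDef("BasicWorkItemType");
        grouped.childTypes.add(grouped); // the type contains itself
        grouped.childTypes.add(basic);

        Map<String, Node> cache = new HashMap<>();
        Node root = Node.of(grouped, cache);

        System.out.println(root.children.get(0) == root); // prints "true"
        System.out.println(cache.size());                 // prints "2"
    }
}
```

If `init` were called from inside the constructor, the node would not be in the cache yet when the self-reference is resolved, and the recursion would not terminate. One thing that may be worth checking in the real code: the factory methods above key the map by the element's qualified name, so two local elements that share a name but have different types would map to the same entry.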

The compare method is still looping, likely for a similar reason; I will check tomorrow. I plan to provide a patch if you would like one (just answer); otherwise I might save the time.
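
One common way to stop a recursive comparison from looping on a cyclic graph is to remember which pairs of nodes have already been visited. The following is only a sketch of that general technique, not the actual compare method of xsd-compare and not necessarily how a patch would solve it; `Node`, `Pair` and `sameStructure` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RecursiveCompareDemo {

    // Hypothetical node type; children may point back to an ancestor.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // A pair of nodes compared by identity, used as the "already visited" key.
    static final class Pair {
        final Node oldNode;
        final Node newNode;

        Pair(Node oldNode, Node newNode) {
            this.oldNode = oldNode;
            this.newNode = newNode;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Pair)) {
                return false;
            }
            Pair p = (Pair) o;
            return oldNode == p.oldNode && newNode == p.newNode;
        }

        @Override
        public int hashCode() {
            return 31 * System.identityHashCode(oldNode) + System.identityHashCode(newNode);
        }
    }

    static boolean sameStructure(Node oldNode, Node newNode) {
        return sameStructure(oldNode, newNode, new HashSet<>());
    }

    private static boolean sameStructure(Node oldNode, Node newNode, Set<Pair> visited) {
        // This pair was seen before: do not descend again. Any difference is
        // reported by the call that visited the pair first.
        if (!visited.add(new Pair(oldNode, newNode))) {
            return true;
        }
        if (!oldNode.name.equals(newNode.name)
                || oldNode.children.size() != newNode.children.size()) {
            return false;
        }
        for (int i = 0; i < oldNode.children.size(); i++) {
            if (!sameStructure(oldNode.children.get(i), newNode.children.get(i), visited)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Node a = new Node("ItemGroupedWorkItem");
        a.children.add(a); // contains itself
        Node b = new Node("ItemGroupedWorkItem");
        b.children.add(b);
        Node c = new Node("ItemGroupedWorkItem");
        c.children.add(c);
        c.children.add(new Node("ItemBasicWorkItem"));

        System.out.println(sameStructure(a, b)); // prints "true"
        System.out.println(sameStructure(a, c)); // prints "false"
    }
}
```

The visited set is keyed by object identity on purpose, so that two distinct nodes with the same name are still compared separately.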

svanteschubert, Mar 27 '23 18:03

Instead of the ZIP, you can also obtain the XSD from https://github.com/ConnectingEurope/eInvoicing-EN16931/tree/master/cii/schema/D16B%20SCRDM%20(Subset)

The recursion can be found here: https://github.com/ConnectingEurope/eInvoicing-EN16931/blob/master/cii/schema/D16B%20SCRDM%20(Subset)/uncoupled%20clm/CII/uncefact/data/standard/CrossIndustryInvoice_ReusableAggregateBusinessInformationEntity_100pD16B.xsd#L304

svanteschubert, Mar 27 '23 18:03

Hi

If you want to and can create a PR for it, it would be much appreciated. Otherwise, I'll see if I can find the time to patch it later this week.

yoep, Mar 27 '23 19:03

Hi yoep,

I have done so now: https://github.com/yoep/xsd-compare/pull/10 It was a pleasure to work on this, and I have learned a lot about XSD (reading the spec) and about the Xerces API! I had never heard of Lombok before, very useful! I learned a lot from you, thank you!

PS: Please be gentle if I renamed a few things. For instance, I now use oldNode and newNode (dropping the sometimes-used prefix "original" in favor of "old"). In the end these are just names and perhaps personal taste. You could change things back, but perhaps we can have a quick chat beforehand to understand the intentions ;-)

svanteschubert, Apr 06 '23 11:04