Range Index error prevents storing Element containing CDATA that is preceded by Text node, and corrupts db

Open adamretter opened this issue 2 years ago • 0 comments

It is not possible to store a document into eXist-db if both of the following conditions hold:

  1. The document contains an Element with two children:
    1. a Text node,
    2. followed by a CDATA Section.
  2. There is a Range Index configured on the Element for the Collection in which the document is to be stored.

Attempting to store such a document causes an error like this:

Caused by: java.lang.NullPointerException
	at java.base/java.lang.System.arraycopy(Native Method)
	at org.exist.util.XMLString.append(XMLString.java:94)
	at org.exist.Indexer.endCDATA(Indexer.java:299)
	at org.exist.collections.triggers.SAXTrigger.endCDATA(SAXTrigger.java:188)
	at org.exist.collections.triggers.DocumentTriggers.endCDATA(DocumentTriggers.java:226)
	at org.apache.xerces.parsers.AbstractSAXParser.endCDATA(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
	at org.exist.collections.MutableCollection.lambda$8(MutableCollection.java:1142)
	at org.exist.collections.MutableCollection.lambda$12(MutableCollection.java:1229)
	at org.exist.collections.MutableCollection.storeXMLInternal(MutableCollection.java:1376)
	at org.exist.collections.MutableCollection.storeXmlDocument(MutableCollection.java:1229)
	at org.exist.collections.MutableCollection.storeDocument(MutableCollection.java:1148)
	at org.exist.collections.LockedCollection.storeDocument(LockedCollection.java:367)
	at org.exist.storage.NativeBroker.storeDocument(NativeBroker.java:2300)
	at org.exist.xmldb.LocalCollection.lambda$27(LocalCollection.java:642)
	at org.exist.xmldb.function.LocalXmldbCollectionFunction.apply(LocalXmldbCollectionFunction.java:50)
	at org.exist.xmldb.function.LocalXmldbCollectionFunction.apply(LocalXmldbCollectionFunction.java:50)
	at org.exist.xmldb.AbstractLocal.lambda$6(AbstractLocal.java:218)
	at org.exist.xmldb.AbstractLocal.lambda$5(AbstractLocal.java:152)
	at org.exist.xmldb.function.LocalXmldbFunction.apply(LocalXmldbFunction.java:48)
	at org.exist.xmldb.txn.bridge.InTxnLocalCollection.withDb(InTxnLocalCollection.java:58)
	at org.exist.xmldb.txn.bridge.InTxnLocalCollection.withDb(InTxnLocalCollection.java:52)
	at org.exist.xmldb.AbstractLocal.lambda$4(AbstractLocal.java:152)
	at org.exist.xmldb.LocalCollection.lambda$37(LocalCollection.java:808)
	at org.exist.xmldb.LocalCollection.storeXMLResource(LocalCollection.java:607)
	at org.exist.xmldb.LocalCollection.storeResource(LocalCollection.java:550)
	at org.exist.xmldb.LocalCollection.storeResource(LocalCollection.java:539)
	at org.exist.xquery.functions.xmldb.XMLDBLoadFromPattern.evalWithCollection(XMLDBLoadFromPattern.java:202)
...
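
As a reading aid for the top two frames: `System.arraycopy` throws a `NullPointerException` whenever its source or destination array reference is null, even for a zero-length copy. The sketch below illustrates only that JDK behaviour. The class `NullBufferDemo` and its `appendTo` helper are hypothetical and are not eXist-db's `XMLString`; the trace alone does not show which array was null inside `XMLString.append`.

```java
import java.util.Arrays;

public class NullBufferDemo {

    /**
     * Hypothetical append helper (NOT eXist-db code): grows the target
     * and copies len chars from src into the newly added space.
     */
    static char[] appendTo(char[] target, int used, char[] src, int len) {
        char[] grown = Arrays.copyOf(target, used + len);
        // Throws NullPointerException if src (or grown) is null.
        System.arraycopy(src, 0, grown, used, len);
        return grown;
    }

    /** Returns true if appending the given source raises an NPE. */
    static boolean throwsNpe(char[] src) {
        try {
            appendTo(new char[0], 0, src, 0);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("null source     -> NPE? " + throwsNpe(null));
        System.out.println("non-null source -> NPE? " + throwsNpe(new char[0]));
    }
}
```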

Because eXist-db lacks ACID transaction semantics, the NullPointerException above, like any error that occurs whilst storing an XML document in eXist-db, will corrupt the database!


The reproducible Test Case is simple:

declare variable $collection-conf := document {
	<collection xmlns="http://exist-db.org/collection-config/1.0">
	    <index xmlns:int="http://services.parallelgraphics.com/vm/mmr/vm-interactivity-xml/all">
	        <!-- Range index -->
	        <create qname="entry" type="xs:string"/>
	    </index>
	</collection>
};

(: Create Collection, and store the Index config :)
xmldb:create-collection("/db", "test"),
xmldb:create-collection("/db/system/config/db", "test"),
xmldb:store("/db/system/config/db/test", "collection.xconf", $collection-conf),

(: Store the Document :)
xmldb:store-files-from-pattern(
	"/db/test",
	"/tmp",
	"test1.xml",
	(),
	fn:true()
)

The test1.xml document needed by the above query has the content:

<entry>something<![CDATA[Item]]></entry>
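
To see the order of parser callbacks that leads up to the `endCDATA` frame in the stack trace, the same document can be fed to a plain SAX parser outside eXist-db. The following is a minimal sketch using only the JDK's built-in parser. The class names `CDataEvents` and `Recorder` are made up for illustration, and nothing here exercises eXist-db's `Indexer` or its Range Index; it only records which SAX events arrive and in what order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

public class CDataEvents {

    /** Records each SAX callback as a short string, in the order received. */
    static class Recorder extends DefaultHandler2 {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("startElement(" + qName + ")");
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            events.add("characters(" + new String(ch, start, length) + ")");
        }

        @Override
        public void startCDATA() {
            events.add("startCDATA");
        }

        @Override
        public void endCDATA() {
            events.add("endCDATA");
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("endElement(" + qName + ")");
        }
    }

    /** Parses the given XML and returns the recorded event sequence. */
    static List<String> eventsFor(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        Recorder recorder = new Recorder();
        // CDATA boundaries are only reported to a registered LexicalHandler.
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", recorder);
        parser.parse(new InputSource(new StringReader(xml)), recorder);
        return recorder.events;
    }

    public static void main(String[] args) throws Exception {
        eventsFor("<entry>something<![CDATA[Item]]></entry>")
            .forEach(System.out::println);
    }
}
```

The Text node is reported through `characters` before `startCDATA` is called, and the CDATA content arrives through `characters` again before `endCDATA`. Swapping in `<entry><![CDATA[Item]]></entry>` shows the sequence without the preceding Text node, for comparison.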

I have reproduced this error with eXist-db 7.0.0-SNAPSHOT. I have not yet checked older versions of eXist-db, but from a quick look at the Git history, I believe this bug has been hiding for a very long time, and so is likely present from at least 4.0.0 onwards.

adamretter, Mar 23 '23 19:03