Range Index error prevents storing Element containing CDATA that is preceded by Text node, and corrupts the database
It is not possible to store a document into eXist-db when both of the following conditions hold:
- The document contains an Element with two children:
  - a Text node,
  - followed by a CDATA Section.
- There is a Range Index configured on that Element for the Collection in which the document is to be stored.
Attempting to store such a document fails with an error like the following:
Caused by: java.lang.NullPointerException
at java.base/java.lang.System.arraycopy(Native Method)
at org.exist.util.XMLString.append(XMLString.java:94)
at org.exist.Indexer.endCDATA(Indexer.java:299)
at org.exist.collections.triggers.SAXTrigger.endCDATA(SAXTrigger.java:188)
at org.exist.collections.triggers.DocumentTriggers.endCDATA(DocumentTriggers.java:226)
at org.apache.xerces.parsers.AbstractSAXParser.endCDATA(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.exist.collections.MutableCollection.lambda$8(MutableCollection.java:1142)
at org.exist.collections.MutableCollection.lambda$12(MutableCollection.java:1229)
at org.exist.collections.MutableCollection.storeXMLInternal(MutableCollection.java:1376)
at org.exist.collections.MutableCollection.storeXmlDocument(MutableCollection.java:1229)
at org.exist.collections.MutableCollection.storeDocument(MutableCollection.java:1148)
at org.exist.collections.LockedCollection.storeDocument(LockedCollection.java:367)
at org.exist.storage.NativeBroker.storeDocument(NativeBroker.java:2300)
at org.exist.xmldb.LocalCollection.lambda$27(LocalCollection.java:642)
at org.exist.xmldb.function.LocalXmldbCollectionFunction.apply(LocalXmldbCollectionFunction.java:50)
at org.exist.xmldb.function.LocalXmldbCollectionFunction.apply(LocalXmldbCollectionFunction.java:50)
at org.exist.xmldb.AbstractLocal.lambda$6(AbstractLocal.java:218)
at org.exist.xmldb.AbstractLocal.lambda$5(AbstractLocal.java:152)
at org.exist.xmldb.function.LocalXmldbFunction.apply(LocalXmldbFunction.java:48)
at org.exist.xmldb.txn.bridge.InTxnLocalCollection.withDb(InTxnLocalCollection.java:58)
at org.exist.xmldb.txn.bridge.InTxnLocalCollection.withDb(InTxnLocalCollection.java:52)
at org.exist.xmldb.AbstractLocal.lambda$4(AbstractLocal.java:152)
at org.exist.xmldb.LocalCollection.lambda$37(LocalCollection.java:808)
at org.exist.xmldb.LocalCollection.storeXMLResource(LocalCollection.java:607)
at org.exist.xmldb.LocalCollection.storeResource(LocalCollection.java:550)
at org.exist.xmldb.LocalCollection.storeResource(LocalCollection.java:539)
at org.exist.xquery.functions.xmldb.XMLDBLoadFromPattern.evalWithCollection(XMLDBLoadFromPattern.java:202)
...
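The top frames show Xerces delivering `endCDATA` to `org.exist.Indexer`. As a rough illustration of the event order involved, the following sketch logs the SAX callbacks for an element whose text is followed by a CDATA section. It uses only the JDK's own SAX parser and is not eXist-db code; the class name `CdataEventOrder` and the method `events` are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper, not part of eXist-db: logs SAX events in arrival order.
public class CdataEventOrder {

    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();

        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("startElement " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("endElement " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length);
                int last = log.size() - 1;
                // A parser may split text into several chunks; merge adjacent ones.
                if (last >= 0 && log.get(last).startsWith("characters ")) {
                    log.set(last, log.get(last) + text);
                } else {
                    log.add("characters " + text);
                }
            }

            @Override
            public void startCDATA() {
                log.add("startCDATA");
            }

            @Override
            public void endCDATA() {
                log.add("endCDATA");
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        XMLReader reader = parser.getXMLReader();
        reader.setContentHandler(handler);
        // CDATA boundaries are only reported to a registered LexicalHandler.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<entry>something<![CDATA[Item]]></entry>")) {
            System.out.println(event);
        }
    }
}
```

The handler receives the text `something` first, then `startCDATA`, the CDATA text, and `endCDATA`. In other words, `endCDATA` arrives while the element already has a preceding text child, which is the situation described above. The sketch only shows the order of events; it does not reproduce or explain the `NullPointerException` itself.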
Because eXist-db lacks ACID transaction semantics, the above error, like any error that occurs whilst storing an XML document in eXist-db, will corrupt the database!
The reproducible Test Case is simple:
declare variable $collection-conf := document {
    <collection xmlns="http://exist-db.org/collection-config/1.0">
        <index xmlns:int="http://services.parallelgraphics.com/vm/mmr/vm-interactivity-xml/all">
            <!-- Range index -->
            <create qname="entry" type="xs:string"/>
        </index>
    </collection>
};

(: Create Collection, and store the Index config :)
xmldb:create-collection("/db", "test"),
xmldb:create-collection("/db/system/config/db", "test"),
xmldb:store("/db/system/config/db/test", "collection.xconf", $collection-conf),

(: Store the Document :)
xmldb:store-files-from-pattern(
    "/db/test",
    "/tmp",
    "test1.xml",
    (),
    fn:true()
)
The test1.xml document needed by the above query has the content:
<entry>something<![CDATA[Item]]></entry>
I have tested this with eXist-db 7.0.0-SNAPSHOT. I have not yet checked older versions of eXist-db, but having taken a quick look at the Git history, I believe this bug has been hiding for a very long time, and so will likely be present in at least 4.0.0 onwards.
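To find documents that have the shape described at the top of this report before attempting to store them, a small DOM scan may be enough. The following is a minimal sketch that uses only the JDK's DOM parser; the class name `TextThenCdataScan` and its methods are hypothetical and not part of eXist-db. It checks only the document shape (a Text node immediately followed by a CDATA Section). It does not check whether a Range Index is configured for the element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of eXist-db.
public class TextThenCdataScan {

    /** Returns true if any element has a Text child immediately followed by a CDATA Section. */
    static boolean hasTextFollowedByCdata(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Keep CDATA Sections as distinct nodes instead of merging them into text.
        factory.setCoalescing(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return scan(doc.getDocumentElement());
    }

    private static boolean scan(Node node) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && next != null
                    && next.getNodeType() == Node.CDATA_SECTION_NODE) {
                return true;
            }
            // Recurse into child elements to cover the whole tree.
            if (child.getNodeType() == Node.ELEMENT_NODE && scan(child)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hasTextFollowedByCdata("<entry>something<![CDATA[Item]]></entry>"));
    }
}
```

Whitespace-only text before a CDATA Section also counts as a Text node in this scan, so it may flag more documents than actually trigger the error.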