[BUG] collection config not always applied as expected

Open line-o opened this issue 2 years ago • 7 comments

Describe the bug

From the feedback I gathered from long-standing core developer colleagues:

Whenever a collection configuration changes, whether by storing, copying, or moving the configuration to a configuration collection, both of the following must be true:

  1. xmldb:reindex($collection) has to be called explicitly to apply new indexes to existing data
  2. new indexes are applied to new data immediately

The first statement is not satisfied. Tested in eXist-db versions 6.2.0 and 7.0.0-SNAPSHOT.

  • For copied or moved collection configuration files, reindexing does not apply the new indexes to existing data.
  • After storing a collection configuration, the new indexes are applied immediately to existing data.

NOTE: I have not yet tested the second statement.
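
A check for the second statement could look roughly like the suite below. It is an untested sketch that reuses the XQSuite annotations and the profiling assertion from the suites in this issue. The module namespace, the `ainc` prefix, and the collection name are made up for illustration. The configuration is stored before any data exists, so a passing assertion would mean the index was applied to newly stored data.

```xquery
xquery version "3.1";

(: Hypothetical module: namespace, prefix and collection name are assumptions. :)
module namespace ainc="http://exist-db.org/xquery/range/test/apply-index-new-data";

import module namespace test="http://exist-db.org/xquery/xqsuite" at "resource:org/exist/xquery/lib/xqsuite/xqsuite.xql";

declare namespace stats="http://exist-db.org/xquery/profiling";

declare variable $ainc:collection-name := 'apply-index-new-data';
declare variable $ainc:collection-path := '/db/' || $ainc:collection-name;
declare variable $ainc:collection-config-path := '/db/system/config' || $ainc:collection-path;

declare variable $ainc:xconf :=
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <range>
            <create qname="node" type="xs:string"/>
        </range>
    </index>
</collection>
;

declare
    %test:setUp
function ainc:setup () {
    xmldb:create-collection('/db', $ainc:collection-name),
    xmldb:create-collection('/db/system/config/db', $ainc:collection-name),
    (: the configuration is in place while the data collection is still empty :)
    xmldb:store($ainc:collection-config-path, 'collection.xconf', $ainc:xconf)
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function ainc:store-new-data () {
    (: store NEW data after the configuration, then query it without reindexing :)
    let $_ := xmldb:store($ainc:collection-path, 'new.xml', <root><node>a</node></root>)
    return collection($ainc:collection-path)//node[.='a']
};

declare
    %test:tearDown
function ainc:cleanup () {
    xmldb:remove($ainc:collection-path),
    xmldb:remove($ainc:collection-config-path)
};
```

Replacing the `xmldb:store` call in the setup with `xmldb:copy-resource` or `xmldb:move` would cover the copied and moved variants of the same check.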

Expected behavior

As an application developer, I expect that indexes in a collection configuration file are applied in the same way regardless of how the change is made.

To Reproduce

Initial test suite: a first attempt at a test suite that would highlight both observed inconsistencies. Left in just for completeness.

module namespace taic="http://exist-db.org/xquery/range/test/apply-index-configuration";

import module namespace test="http://exist-db.org/xquery/xqsuite" at "resource:org/exist/xquery/lib/xqsuite/xqsuite.xql";

declare namespace stats="http://exist-db.org/xquery/profiling";

declare variable $taic:collection-name := 'apply-index-configuration';
declare variable $taic:collection-path := '/db/' || $taic:collection-name;
declare variable $taic:system-config-path := '/db/system/config';
declare variable $taic:collection-config-path := $taic:system-config-path || $taic:collection-path;


declare variable $taic:xconf-name := 'collection.xconf';
declare variable $taic:xconf :=
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <range>
            <create qname="node" type="xs:string"/>
        </range>
    </index>
</collection>
;

declare variable $taic:test-data :=
<root>
    <node>a</node>
    <node>b</node>
</root>
;

declare
    %test:setUp
function taic:setup () {
    xmldb:create-collection('/db', $taic:collection-name),
    xmldb:store($taic:collection-path, 'test.xml', $taic:test-data),
    xmldb:create-collection($taic:system-config-path || '/db', $taic:collection-name),
    xmldb:store($taic:collection-path, $taic:xconf-name, $taic:xconf)
};

declare
    %private
function taic:query () {
    collection($taic:collection-path)//node[.='a']
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:copy () {
    let $_ :=
        xmldb:copy-resource(
            $taic:collection-path, $taic:xconf-name,
            $taic:collection-config-path, $taic:xconf-name)
    
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:touch-after-copy () {
    let $_ :=
        xmldb:copy-resource(
            $taic:collection-path, $taic:xconf-name,
            $taic:collection-config-path, $taic:xconf-name)
    let $touch := xmldb:touch($taic:collection-config-path, $taic:xconf-name)
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:copy-with-reindex () {
    let $_ :=
        xmldb:copy-resource(
            $taic:collection-path, $taic:xconf-name,
            $taic:collection-config-path, $taic:xconf-name)
    let $reindex := xmldb:reindex($taic:collection-path)
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:move () {
    let $_ := (
        xmldb:move($taic:collection-path, $taic:collection-config-path, $taic:xconf-name),
        (: restore moved resource for following tests :)
        xmldb:store($taic:collection-path, $taic:xconf-name, $taic:xconf)
    )
    
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:touch-after-move () {
    let $_ := (
        xmldb:move($taic:collection-path, $taic:collection-config-path, $taic:xconf-name),
        (: restore moved resource for following tests :)
        xmldb:store($taic:collection-path, $taic:xconf-name, $taic:xconf)
    )
    let $touch := xmldb:touch($taic:collection-config-path, $taic:xconf-name)
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:move-with-reindex () {
    let $_ := (
        xmldb:move($taic:collection-path, $taic:collection-config-path, $taic:xconf-name),
        (: restore moved resource for following tests :)
        xmldb:store($taic:collection-path, $taic:xconf-name, $taic:xconf)
    )
    let $reindex := xmldb:reindex($taic:collection-path)    
    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:store () {
    let $_ := xmldb:store($taic:collection-config-path, $taic:xconf-name, $taic:xconf)

    return taic:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function taic:store-with-reindex () {
    let $_ := xmldb:store($taic:collection-config-path, $taic:xconf-name, $taic:xconf)
    let $reindex := xmldb:reindex($taic:collection-path)
    return taic:query()
};

declare
    %test:tearDown
function taic:cleanup () {
    xmldb:remove($taic:collection-path),
    xmldb:remove($taic:collection-config-path)
};
--- UPDATE 1: test copied xconf ---

The following test suite proves that the indexes defined in a collection configuration resource copied to a configuration collection cannot be applied to existing data. It isolates the tests that operate on a copied xconf resource better than the initial suite does. They all fail.

module namespace aicc="http://exist-db.org/xquery/range/test/apply-index-copied-configuration";

import module namespace test="http://exist-db.org/xquery/xqsuite" at "resource:org/exist/xquery/lib/xqsuite/xqsuite.xql";

declare namespace stats="http://exist-db.org/xquery/profiling";

declare variable $aicc:collection-name := 'apply-index-copied-configuration';
declare variable $aicc:collection-path := '/db/' || $aicc:collection-name;
declare variable $aicc:system-config-path := '/db/system/config';
declare variable $aicc:collection-config-path := $aicc:system-config-path || $aicc:collection-path;


declare variable $aicc:xconf-name := 'collection.xconf';
declare variable $aicc:xconf :=
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <range>
            <create qname="node" type="xs:string"/>
        </range>
    </index>
</collection>
;

declare variable $aicc:test-data :=
<root>
    <node>a</node>
    <node>b</node>
</root>
;

declare
    %test:setUp
function aicc:setup () {
    xmldb:create-collection('/db', $aicc:collection-name),
    xmldb:store($aicc:collection-path, 'test.xml', $aicc:test-data),
    xmldb:create-collection($aicc:system-config-path || '/db', $aicc:collection-name),
    xmldb:store($aicc:collection-path, $aicc:xconf-name, $aicc:xconf),
    xmldb:copy-resource(
        $aicc:collection-path, $aicc:xconf-name,
        $aicc:collection-config-path, $aicc:xconf-name)
};

declare
    %private
function aicc:query () {
    collection($aicc:collection-path)//node[.='a']
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aicc:after-copy () {
    aicc:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aicc:touch-after-copy () {
    let $touch := xmldb:touch($aicc:collection-config-path, $aicc:xconf-name)
    return aicc:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aicc:reindex-after-copy () {
    let $reindex := xmldb:reindex($aicc:collection-path)
    return aicc:query()
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function aicc:touch-and-reindex-after-copy () {
    let $touch := xmldb:touch($aicc:collection-config-path, $aicc:xconf-name)
    let $reindex := xmldb:reindex($aicc:collection-path)
    return aicc:query()
};

declare
    %test:tearDown
function aicc:cleanup () {
    xmldb:remove($aicc:collection-path),
    xmldb:remove($aicc:collection-config-path)
};
--- UPDATE 2: immediate application ---

The test suite below proves that when an xconf resource is stored, the indexes in it are applied immediately.

module namespace asic="http://exist-db.org/xquery/range/test/apply-stored-index-configuration";

import module namespace test="http://exist-db.org/xquery/xqsuite" at "resource:org/exist/xquery/lib/xqsuite/xqsuite.xql";

declare namespace stats="http://exist-db.org/xquery/profiling";

declare variable $asic:collection-name := 'apply-index-configuration';
declare variable $asic:collection-path := '/db/' || $asic:collection-name;
declare variable $asic:system-config-path := '/db/system/config';
declare variable $asic:collection-config-path := $asic:system-config-path || $asic:collection-path;


declare variable $asic:xconf-name := 'collection.xconf';
declare variable $asic:xconf :=
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <range>
            <create qname="node" type="xs:string"/>
        </range>
    </index>
</collection>
;

declare variable $asic:test-data :=
<root>
    <node>a</node>
    <node>b</node>
</root>
;

declare
    %test:setUp
function asic:setup () {
    xmldb:create-collection('/db', $asic:collection-name),
    xmldb:store($asic:collection-path, 'test.xml', $asic:test-data),
    xmldb:create-collection($asic:system-config-path || '/db', $asic:collection-name),
    xmldb:store($asic:collection-config-path, $asic:xconf-name, $asic:xconf)
};

declare
    %private
function asic:query () {
    collection($asic:collection-path)//node[.='a']
};

declare
    %test:stats
    %test:assertXPath("$result//stats:index[@type = 'new-range'][@optimization = 2]")
function asic:store () {
    asic:query()
};

declare
    %test:tearDown
function asic:cleanup () {
    xmldb:remove($asic:collection-path),
    xmldb:remove($asic:collection-config-path)
};

Context (please always complete the following information)

  • Build: eXist-7.0.0-SNAPSHOT (b032a424e9e92582938080d90b057a14c079df04)
  • Java: 17.0.6 (Azul Systems, Inc.)
  • OS: Mac OS X 13.5.2 (aarch64)

Additional context

  • How is eXist-db installed? built from source (7.0.0-SNAPSHOT) and run in docker (6.2.0)
  • Any custom changes in e.g. conf.xml? none

line-o avatar Oct 23 '23 09:10 line-o

On both tested systems the results were:

<testsuites>
    <testsuite package="http://exist-db.org/xquery/range/test/apply-index-configuration" timestamp="2023-10-23T11:17:42.223+02:00" tests="8" failures="4" errors="0" pending="0" time="PT0.065S">
        <testcase name="copy" class="taic:copy">
            <failure message="assertXPath failed." type="failure-error-code-1">$result//stats:index[@type = 'new-range'][@optimization = 2]</failure>
            <output>
                <stats:calls xmlns:stats="http://exist-db.org/xquery/profiling">
                    <stats:query source="String/-2662374561284807282" elapsed="0.0" calls="1"/>
                    <stats:query source="/db/apps/eXide/modules/run-test.xq" elapsed="0.002" calls="1"/>
                    <stats:function name="collection" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:5]"/>
                    <stats:function name="taic:copy" elapsed="0.0" calls="1" source="[unknown source] [14:35]"/>
                    <stats:function name="taic:query" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [55:12]"/>
                    <stats:function name="xmldb:copy-resource" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [51:9]"/>
                    <stats:function name="range:eq" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:46]"/>
                    <stats:index type="range" source="/db/test-copy-xconf.xq [43:46]" elapsed="0.0" calls="1" optimization="0"/>
                </stats:calls>
            </output>
        </testcase>
        <testcase name="copy-with-reindex" class="taic:copy-with-reindex">
            <failure message="assertXPath failed." type="failure-error-code-1">$result//stats:index[@type = 'new-range'][@optimization = 2]</failure>
            <output>
                <stats:calls xmlns:stats="http://exist-db.org/xquery/profiling">
                    <stats:query source="String/-2662374561284807282" elapsed="0.0" calls="1"/>
                    <stats:function name="xmldb:reindex" elapsed="0.003" calls="1" source="/db/test-copy-xconf.xq [78:21]"/>
                    <stats:function name="collection" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:5]"/>
                    <stats:function name="taic:copy-with-reindex" elapsed="0.005" calls="1" source="[unknown source] [14:35]"/>
                    <stats:function name="xmldb:copy-resource" elapsed="0.002" calls="1" source="/db/test-copy-xconf.xq [75:9]"/>
                    <stats:function name="range:eq" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:46]"/>
                    <stats:function name="taic:query" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [79:12]"/>
                    <stats:index type="range" source="/db/test-copy-xconf.xq [43:46]" elapsed="0.0" calls="1" optimization="0"/>
                </stats:calls>
            </output>
        </testcase>
        <testcase name="move" class="taic:move">
            <failure message="assertXPath failed." type="failure-error-code-1">$result//stats:index[@type = 'new-range'][@optimization = 2]</failure>
            <output>
                <stats:calls xmlns:stats="http://exist-db.org/xquery/profiling">
                    <stats:query source="String/-2662374561284807282" elapsed="0.0" calls="1"/>
                    <stats:function name="taic:query" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [92:12]"/>
                    <stats:function name="taic:move" elapsed="0.001" calls="1" source="[unknown source] [14:35]"/>
                    <stats:function name="collection" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:5]"/>
                    <stats:function name="xmldb:move" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [87:9]"/>
                    <stats:function name="range:eq" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:46]"/>
                    <stats:function name="xmldb:store" elapsed="0.001" calls="1" source="/db/test-copy-xconf.xq [89:9]"/>
                    <stats:index type="range" source="/db/test-copy-xconf.xq [43:46]" elapsed="0.0" calls="1" optimization="0"/>
                </stats:calls>
            </output>
        </testcase>
        <testcase name="move-with-reindex" class="taic:move-with-reindex">
            <failure message="assertXPath failed." type="failure-error-code-1">$result//stats:index[@type = 'new-range'][@optimization = 2]</failure>
            <output>
                <stats:calls xmlns:stats="http://exist-db.org/xquery/profiling">
                    <stats:query source="String/-2662374561284807282" elapsed="0.0" calls="1"/>
                    <stats:function name="taic:query" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [118:12]"/>
                    <stats:function name="taic:move-with-reindex" elapsed="0.003" calls="1" source="[unknown source] [14:35]"/>
                    <stats:function name="xmldb:store" elapsed="0.001" calls="1" source="/db/test-copy-xconf.xq [115:9]"/>
                    <stats:function name="collection" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:5]"/>
                    <stats:function name="xmldb:move" elapsed="0.001" calls="1" source="/db/test-copy-xconf.xq [113:9]"/>
                    <stats:function name="xmldb:reindex" elapsed="0.001" calls="1" source="/db/test-copy-xconf.xq [117:21]"/>
                    <stats:function name="range:eq" elapsed="0.0" calls="1" source="/db/test-copy-xconf.xq [43:46]"/>
                    <stats:index type="range" source="/db/test-copy-xconf.xq [43:46]" elapsed="0.0" calls="1" optimization="0"/>
                </stats:calls>
            </output>
        </testcase>
        <testcase name="store" class="taic:store"/>
        <testcase name="store-with-reindex" class="taic:store-with-reindex"/>
        <testcase name="touch-after-copy" class="taic:touch-after-copy"/>
        <testcase name="touch-after-move" class="taic:touch-after-move"/>
    </testsuite>
</testsuites>

line-o avatar Oct 23 '23 09:10 line-o

Copying or moving a collection configuration into the respective configuration collection will not apply the indexes immediately but storing the contents of that file will.

That is expected behaviour and is by design.

adamretter avatar Oct 23 '23 11:10 adamretter

Copying or moving a collection configuration into the respective configuration collection will not apply the indexes immediately but storing the contents of that file will.

That is expected behaviour and is by design.

What's the idea behind this?

line-o avatar Oct 23 '23 11:10 line-o

I would still expect that reindexing would then apply the new index configuration.

line-o avatar Oct 23 '23 12:10 line-o

In my case it makes sense to do this: my clients store a lot of data that is specific to their organisation, and to ensure absolute separation our system creates a new "master" collection for each client's data when we onboard them. We create an index for that collection so that we can index each client separately. When we onboard a new client, they provide us with a lot of initial information, which we use to populate that "master" collection. This includes sub-collections and a number of documents. Every "master" organisation collection uses the same index pattern, so we store a template collection.xconf file. I was hoping we could use xmldb:copy-resource() to copy that file to /db/system/config/db/apps/path-to-master-org-collection. Then, after running xmldb:reindex('/db/apps/path-to-master-org-collection'), I would have hoped it would use the newly copied collection.xconf to reindex that collection...

I did manage to find a workaround though by using doc() on the template collection.xconf file and then xmldb:store() to copy and then store the collection.xconf file in the correct system collection... This process made the indexing work.
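
The workaround described above can be sketched roughly as follows. The template location is a made-up path, and the configuration collection is assumed to exist already. The explicit xmldb:reindex() call mirrors the flow described in this comment and may be redundant if storing already applies the indexes.

```xquery
xquery version "3.1";

(: Hypothetical location of the template; adjust to your own layout. :)
declare variable $local:template-path := '/db/apps/templates/collection.xconf';

(: Data collection of the newly onboarded organisation. :)
declare variable $local:data-collection := '/db/apps/path-to-master-org-collection';

(: The configuration collection mirrors the data path below /db/system/config.
   Assumption: it has been created beforehand. :)
declare variable $local:config-collection := '/db/system/config' || $local:data-collection;

(: Read the template with doc() and store its content,
   instead of copying the resource with xmldb:copy-resource(). :)
let $stored :=
    xmldb:store(
        $local:config-collection,
        'collection.xconf',
        doc($local:template-path))
(: Reindex the data collection so existing documents pick up the configuration. :)
let $reindexed := xmldb:reindex($local:data-collection)
return ($stored, $reindexed)
```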

It was just unexpected behaviour: the fact that it didn't let me reindex the collection.xconf file after using xmldb:copy-resource(), and the fact that the file still appeared in monex...

luckydem avatar Oct 23 '23 12:10 luckydem

In update 2 I can prove that storing an XConf will immediately trigger reindexing of the entire collection.

line-o avatar Oct 23 '23 18:10 line-o

From the feedback I gathered today:

Whenever a collection configuration changes, whether by storing, copying, or moving the configuration to a configuration collection, both of the following must be true:

  1. xmldb:reindex() has to be called explicitly to apply new indexes to existing data
  2. new indexes are applied to new data immediately

Neither of the statements above is satisfied in eXist-db 6.2.0 through 7.0.0-SNAPSHOT.

line-o avatar Oct 23 '23 18:10 line-o