Frequent re-indexing required
What is the problem
We get unexpected results from xqueries, e.g. documents or fragments thereof are not returned in xquery results when they should be.
What did you expect
We expect the existing documents (or fragments thereof) to be returned by the XQuery.
Describe how to reproduce or add a test
I am not sure how to recreate the issue. The 'fix' is to reindex the collection.
After upgrading eXist from 2.2 to 3.6 on December 3rd, 2017, I've found that documents are becoming 'un-indexed'. That is, queries work correctly for a while, but eventually they do not return the expected results until I re-index the collection that contains the documents.
This is possibly related to #1648 or #1691
If there is a corruption issue when indexes are updated during regular document updates, then as documents are updated they may randomly be removed from the index, or subsequent queries may return the wrong results.
So far we've found that re-indexing the collection fixes the issue, for about 24 hours at most depending on update/insert volume.
Under 2.1 and older releases we had similar symptoms (queries failed to return expected results, but re-indexing 'fixed' the issue), so back then we set up a cron job to re-index our most active collections. I think the underlying problem was fixed in 2.2. I cannot use that workaround this time because these collections are large and re-indexing takes too long.
The most recent re-index was needed for a collection that contained approximately 930 documents.
The exist.log file doesn't show any exceptions during re-indexing, nor when querying a collection that is not working correctly.
Context information
We're using the Docker image evolvedbinary/exist-db, tag eXist-3.6.1 (strangely, Monex doesn't display version information).
The related collection.xconf is shown below:
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index>
        <fulltext default="none" attributes="false" alphanum="false"/>
        <range>
            <create qname="invoicenumber" type="xs:string"/>
            <create qname="due_date" type="xs:string"/>
            <create qname="carrierid" type="xs:string"/>
            <create qname="superseded" type="xs:string"/>
            <create qname="supersedes" type="xs:string"/>
            <create qname="vendorid" type="xs:float"/>
            <create qname="state" type="xs:string"/>
            <create qname="trackingnumber" type="xs:string"/>
            <create qname="flags" type="xs:string"/>
            <create qname="hash" type="xs:string"/>
            <create qname="key" type="xs:string"/>
            <create qname="normalized_serviceid" type="xs:string"/>
            <create qname="shipmentid" type="xs:string"/>
        </range>
    </index>
</collection>
Just reporting that we continue to have a problem where documents aren't being returned from XQueries until the collection is re-indexed. The same collection referred to above requires daily re-indexing, and over the past 2 months I've had to re-index different collections (each with fewer documents). E.g. today I have a collection that contains about 25 documents; two documents added in the past 24 hours were not being selected by XQuery until I re-indexed the collection just now.
However, other documents that had been added in the past 24 hours were being returned by XQuery. It seems like a random issue.
FYI, we have experienced 3 database lockups in the past 8 days running 3.6.1 in Docker, with eXist data stored on a host-mounted volume.
Each time, XQuery and XML-RPC requests would get their TCP socket connected, but the server would never respond, and the exist/apps page would not load.
A docker restart would cause eXist to be killed; on restart it would then recover the database.
Today it hung again, but this time restarting failed due to fatal errors and corruption. Fortunately we had a backup that was only a few hours old.
We've now moved to the 4.0.0 Docker image. I will report back if we continue to see index issues with this version, and I will open a new ticket if we experience another hang.
I am now opening a ticket related to the upgrade problem...
We have experienced another 'de-indexing' issue
eXist Version: 4.0.0
eXist Build: 201802141635
We have some documents that are kept in sync with relational database records. For each batch of 1 to 50 documents the process:
1. reads the document from eXist
2. updates the relational database record
3. updates the document in eXist

Then it ends with:
4. querying and then updating a summary document in eXist
5. committing the relational database transaction
We've found that when eXist has an indexing issue, step 4 returns invalid data; e.g. documents known to exist in step 1 aren't returned in step 4 (the queries are not exactly the same).
This causes the code to fail: an exception is raised and the relational database transaction is rolled back, but all of the documents except the summary remain modified in eXist, so the relational database and eXist end up out of sync.
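Roughly, the loop looks like the following sketch. It uses placeholder host, collection and SQL names, and assumes our eXist access goes through the REST interface (PUT stores/replaces a resource, GET with the _query parameter runs an XQuery) plus a DB-API style relational connection; it is illustrative only, not our actual client code.

import requests

EXIST = "http://localhost:8080/exist/rest"            # placeholder host
COLL = "/db/orgs/example/payables/carrier/X/abc123"   # placeholder collection

def sync_batch(doc_names, rdbms_conn):
    try:
        for name in doc_names:
            # 1. read the document from eXist
            xml = requests.get(f"{EXIST}{COLL}/{name}").text

            # 2. update the matching relational record (placeholder SQL)
            cur = rdbms_conn.cursor()
            cur.execute("UPDATE payables SET state = 'payable' WHERE doc = %s",
                        (name,))

            # 3. write the (client-modified) document back to eXist
            requests.put(f"{EXIST}{COLL}/{name}", data=xml,
                         headers={"Content-Type": "application/xml"})

        # 4. query eXist, summarize client-side (omitted here) and update the
        #    summary document; when the index is broken, documents seen in
        #    step 1 go missing from this result (in our real client that
        #    mismatch raises an exception).
        result = requests.get(f"{EXIST}{COLL}",
                              params={"_query": "//service[not(superseded)]"})
        result.raise_for_status()
        requests.put(f"{EXIST}{COLL}/summary.xml", data=result.text,
                     headers={"Content-Type": "application/xml"})

        # 5. commit the relational transaction only after step 4 succeeds
        rdbms_conn.commit()
    except Exception:
        # The relational work is rolled back, but the documents written in
        # step 3 stay modified in eXist, so the two stores diverge.
        rdbms_conn.rollback()
        raise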
To mitigate this problem, we changed the code to also run step 4 before making any changes to documents (that is, before step 1), the theory being that if the eXist index was bad, we'd catch it before making any changes.
Unfortunately, we've never had it fail on the first 'step 4' summary; it has always failed on the last (real) summary step. And that failure happened again today with version 4.0.0.
That suggests that the eXist indexing issue is related to the documents that are being updated. That is, the documents probably aren't being dropped from the index because some other unrelated document was changed.
not sure if this is helpful...
Just chiming in, we continue to have indexing issues daily. I will not post more on this issue unless you need more information. Thanks
Some more debug info: the attached zip file contains a sample document. We have perhaps 400 of these in a single collection.
The following XQuery is run against that collection:
<services python.type="list">{
let $base := '/db/orgs/sfiusa/payables/carrier/APEX/41f27b32d69234c1f651f7b678c2ea21'
for $doc in collection($base)//service
where not(fn:exists($doc/superseded))
return <service>
{$doc/state}
{$doc/key}
{$doc/pay_summary/amount_to_pay}
{$doc/pay_summary/outstanding_pay_amount}
{$doc/vendor_data/total_due}
{$doc/invoicenumber}
{$doc/due_date}
</service>
}</services>
The query returns:
<service>
<state>payable</state>
<key>1140b85e293d1f5bd16fad9f8a4f1026</key>
<amount_to_pay python.type="decimal.Decimal">59.75</amount_to_pay>
<outstanding_pay_amount python.type="decimal.Decimal">59.75</outstanding_pay_amount>
<total_due python.type="decimal.Decimal">59.75</total_due>
<invoicenumber>01241497847</invoicenumber>
<due_date day="9" month="3" python.type="datetime.date" year="2018">2018-03-09</due_date>
</service>
<service>
<state>payable</state>
<amount_to_pay python.type="decimal.Decimal">50.59</amount_to_pay>
<outstanding_pay_amount python.type="decimal.Decimal">50.59</outstanding_pay_amount>
<total_due python.type="decimal.Decimal">50.59</total_due>
<invoicenumber>01248480713</invoicenumber>
<due_date day="28" month="2" python.type="datetime.date" year="2018">2018-02-28</due_date>
</service>
<service>
<state>paid</state>
<key>256d8ed07c2fb4fa305cb2968c4fcc06</key>
<outstanding_pay_amount python.type="decimal.Decimal">0</outstanding_pay_amount>
<total_due python.type="decimal.Decimal">55.91</total_due>
<invoicenumber>01247016880</invoicenumber>
<due_date day="26" month="4" python.type="datetime.date" year="2017">2017-04-26</due_date>
</service>
Note that the 2nd service element, which matches the attached document, is missing the key element that should contain the value df9eace047cf17997eb6d9a679fb0934.
After I re-index the collection and run the query again, I get the expected result
<service>
<state>payable</state>
<key>df9eace047cf17997eb6d9a679fb0934</key>
<amount_to_pay python.type="decimal.Decimal">50.59</amount_to_pay>
<outstanding_pay_amount python.type="decimal.Decimal">50.59</outstanding_pay_amount>
<total_due python.type="decimal.Decimal">50.59</total_due>
<invoicenumber>01248480713</invoicenumber>
<due_date day="28" month="2" python.type="datetime.date" year="2018">2018-02-28</due_date>
</service>
@bkcsfi I understand that @wolfgangmm is trialling a patch at the moment that should fix this.
I realize that I have a similar, or even the same, issue in my production database; I indeed re-index on a daily basis to prevent erroneous entries in the results. Somehow I never found a clue...
I managed to track down and fix one indexing issue in #1763, which leads to somewhat similar symptoms. It only applies to elements indexed by and queried through a range index though. Unfortunately your sample query above does not use any XPath filter and thus no range index, which makes me suspect the underlying cause is not in the range index, but in the structural index.
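For illustration (placeholder host and collection, run from a Python client against the REST interface): a query that filters on one of the qnames configured in your collection.xconf, e.g. invoicenumber, would go through the range index, whereas the unfiltered query only exercises the structural index. A minimal sketch:

import requests

EXIST = "http://localhost:8080/exist/rest"               # placeholder host
BASE = "/db/orgs/example/payables/carrier/APEX/abc123"   # placeholder collection

# Unfiltered: only the structural index is involved.
structural_only = f"collection('{BASE}')//service"
# XPath predicate on an indexed qname: the configured range index can be used.
with_range_index = f"collection('{BASE}')//service[invoicenumber = '01248480713']"

for query in (structural_only, with_range_index):
    response = requests.get(f"{EXIST}/db", params={"_query": query})
    print(response.status_code, len(response.text))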
Does the issue really manifest if you run the simple query above which just lists the contents of some elements?
Unfortunately, the xquery shown above is the actual query we are running in production.
OK, would it be possible to provide me with a dump of the eXist data dir (just dom.dbx, symbols.dbx, collections.dbx and structure.dbx) from an instance where the issue has just been noticed? Ideally I would also need a query demonstrating the wrong results. I could then at least identify the underlying issue.
I would also need to know how documents are updated (complete replacement, XQuery update, ...).
Thank you for looking into this issue. I've asked my manager about sending you the entire database and was told to ask if there was some other way we could help debug this, as there is a lot of confidential information in the database. If we copied the database, then deleted most everything in it, do you think the undeleted 'bad' documents would still be 'bad'?
With respect to your other question about how documents are updated, the general lifecycle is as follows:
1. create a collection
2. insert 1 to n documents into this collection using PUT
3. execute the XQuery (pasted above); in the client (Python), summarize and group the returned data
4. create a new 'summary.xml' file in the collection using PUT

Then, as time goes by, these steps are followed (a rough sketch appears after this list):
5. to update an existing document, read it using XQuery, alter the XML representation in the client, and PUT the document back into the collection (overwriting the existing document via HTTP PUT)
6. run the XQuery (pasted above); in the client, summarize and group the returned data
7. XUpdate the existing summary.xml file
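For reference, here is a rough sketch of steps 5 and 7 (placeholder names throughout; it assumes eXist's REST interface, including that an XUpdate document POSTed to a stored resource is applied in place, and the XUpdate body is a simplified single-remove variant of the one quoted further down; our real client goes through its own wrapper):

import requests

EXIST = "http://localhost:8080/exist/rest"            # placeholder host
COLL = "/db/orgs/example/payables/carrier/X/abc123"   # placeholder collection

def overwrite_document(name, xml_text):
    # step 5: replace the existing document with HTTP PUT
    requests.put(f"{EXIST}{COLL}/{name}", data=xml_text,
                 headers={"Content-Type": "application/xml"})

def xupdate_summary(due_dates_xml):
    # step 7: modify summary.xml in place rather than replacing it
    # (assumption: the REST interface applies a POSTed XUpdate document)
    xupdate = f"""<xupdate:modifications version="1.0"
        xmlns:xupdate="http://www.xmldb.org/xupdate">
      <xupdate:remove select="/batch_summary/due_dates"/>
      <xupdate:append select="/batch_summary">{due_dates_xml}</xupdate:append>
    </xupdate:modifications>"""
    requests.post(f"{EXIST}{COLL}/summary.xml", data=xupdate,
                  headers={"Content-Type": "application/xml"})

Step 6 is the same query and client-side grouping as in the earlier sketch.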
see https://github.com/eXist-db/exist/issues/1720#issuecomment-368161619 for info on how we changed steps 6 & 7 to also run before step 5 in an effort to 'catch' the issue before updating the document in step 5.
So far, we've never caught the issue before updating documents. The theory was that perhaps changes made to other collections were interfering with documents in the current collection.
Regarding the XQuery/XUpdate run in steps 6 and 7, I looked at it today and it seems to have a copy-paste error:
<xupdate:modifications xmlns:xupdate="http://www.xmldb.org/xupdate" version="1.0">
<xupdate:remove select="/batch_summary/due_dates" />
<xupdate:remove select="/batch_summary/due_dates" />
<xupdate:remove select="/batch_summary/due_dates" />
<xupdate:append select="/batch_summary">
%(due_dates_xml)s
</xupdate:append>
</xupdate:modifications>
I do not know why we're removing the same element 3 times instead of once. Revision control commit message from 2008 says "try to remove due_dates if its in there more than once".
Is it possible that this xupdate on summary.xml is messing around with the structural index for other documents in this collection?
Finally, we have a collection with one bad document right now. There are 1103 documents in this collection.
I've run this query (same as above), adding xmldb:last-modified():
<services python.type="list">{
let $base := '/db/orgs/sfiusa/payables/carrier/OtherLTL/4bb11e8f2b1c7afd63cfe3305ee23390'
for $doc in collection($base)//service
order by xmldb:last-modified($base, util:document-name($doc))
where not(fn:exists($doc/superseded))
return <service>
{$doc/state}
{$doc/key}
{$doc/pay_summary/amount_to_pay}
{$doc/pay_summary/outstanding_pay_amount}
{$doc/vendor_data/total_due}
{$doc/invoicenumber}
{$doc/due_date}
<last_modified>{xmldb:last-modified($base, util:document-name($doc))}</last_modified>
</service>
}</services>
The output ends with this:
...
<service>
<state>payable</state>
<key>149505c8af74ff973dd711fc93734a04</key>
<amount_to_pay python.type="decimal.Decimal">289.05</amount_to_pay>
<outstanding_pay_amount python.type="decimal.Decimal">289.05</outstanding_pay_amount>
<total_due python.type="decimal.Decimal">289.05</total_due>
<invoicenumber>P1128115</invoicenumber>
<due_date day="22" month="3" python.type="datetime.date" year="2018">2018-03-22</due_date>
<last_modified>2018-02-28T15:06:48.679-05:00</last_modified>
</service>
<service>
<state>payable</state>
<amount_to_pay python.type="decimal.Decimal">5000.00</amount_to_pay>
<outstanding_pay_amount python.type="decimal.Decimal">5000.00</outstanding_pay_amount>
<invoicenumber>R105352</invoicenumber>
<due_date day="17" month="3" python.type="datetime.date" year="2018">2018-03-17</due_date>
<last_modified>2018-03-02T15:20:34.527-05:00</last_modified>
</service>
<service>
<state>payable</state>
<key>cf61a3a025f1e2b190b51e68499c8f40</key>
<amount_to_pay python.type="decimal.Decimal">338.75</amount_to_pay>
<outstanding_pay_amount python.type="decimal.Decimal">338.75</outstanding_pay_amount>
<invoicenumber>14572978923</invoicenumber>
<due_date day="18" month="3" python.type="datetime.date" year="2018">2018-03-18</due_date>
<last_modified>2018-03-05T15:08:37.55-05:00</last_modified>
</service>
</services>
When I first ran this, the last service element was invoice R105352 (currently the 2nd element). This is the document whose structural index entry is broken, and it's missing <key/>.
Since that time someone has added a new document, but the summary.xml file would have failed to recalculate due to the missing key element.
Our log files suggest that document R105352 was inserted (PUT) into eXist only once. That is, we did not overwrite an existing document with the same name.
I had the issues purely where HTTP methods PUT and DELETE were used [at night], and of course queries that use the index [daytime]
If you cannot provide the data (happy to sign an NDA if needed), the only other option would be for us to actually reproduce the issue with a test. Have you tried simulating the exact sequence of actions in a test already? If we had a test, I'm sure we could fix the issue.
It is unlikely that the XUpdate on summary.xml is messing up the structural index.
If you modify your query above as follows, do you still see the missing <key>?
...
for $doc in collection($base)//service
let $doc := util:expand($doc)
...
I ran this XQuery:
<services python.type="list">{
let $base := '/db/orgs/sfiusa/payables/carrier/OtherLTL/4bb11e8f2b1c7afd63cfe3305ee23390'
for $xdoc in collection($base)//service
let $doc := util:expand($xdoc)
order by xmldb:last-modified($base, util:document-name($xdoc))
where not(fn:exists($doc/superseded))
return <service>
{$doc/state}
{$doc/key}
{$doc/pay_summary/amount_to_pay}
{$doc/pay_summary/outstanding_pay_amount}
{$doc/vendor_data/total_due}
{$doc/invoicenumber}
{$doc/due_date}
<last_modified>{xmldb:last-modified($base, util:document-name($xdoc))}"</last_modified>
</service>
}</services>
Now I do get the expected key value:
<service>
<state>payable</state>
<key>52927d153458a265622ab36b5e62ec4f</key>
<amount_to_pay python.type="decimal.Decimal">5000.00</amount_to_pay>
<outstanding_pay_amount python.type="decimal.Decimal">5000.00</outstanding_pay_amount>
<total_due python.type="decimal.Decimal">5000.00</total_due>
<invoicenumber>R105352</invoicenumber>
<due_date day="17" month="3" python.type="datetime.date" year="2018">2018-03-17</due_date>
<last_modified>2018-03-02T15:20:34.527-05:00"</last_modified>
</service>
As a check, without the 'expand' function, I do not get the key value:
<service>
<state>payable</state>
<amount_to_pay python.type="decimal.Decimal">5000.00</amount_to_pay>
<outstanding_pay_amount python.type="decimal.Decimal">5000.00</outstanding_pay_amount>
<invoicenumber>R105352</invoicenumber>
<due_date day="17" month="3" python.type="datetime.date" year="2018">2018-03-17</due_date>
<last_modified>2018-03-02T15:20:34.527-05:00"</last_modified>
</service>
OK, thanks. This proves the structural index is indeed the culprit. Now we would just need to find a way to simulate the issue.
We are using eXist-db version 3.2.0 and are facing this same issue in production. In our case the XML gets updated quite heavily, and we hit the problem almost every day. I will try to see if we can simulate the issue in a way that you can replicate and test.
Should be fixed with https://github.com/eXist-db/exist/pull/1763
Changes are in the nightly build (http://static.adamretter.org.uk/exist-nightly/) and the current develop branch. They will be released in version v4.1.0 (two weeks).
We upgraded to evolvedbinary/exist-db:eXist-4.1.0 on April 22nd.
The number of 'indexing issues' has greatly decreased; however, we continue to have the same symptoms at a reduced rate.
Today we had roughly the 7th occurrence of a simple XQuery returning no results where at least one was expected, and where re-indexing the collection resolved the problem.
How much difference is there between the evolved-binary 'release' and the official release? Which Git hash/revision are you actually using?
@dizzzz it should be the same as master (i.e. the official release); the commit is 919514c
Hello,
just reporting that this issue still exists in 4.4.0 (official Docker image).
I upgraded from 4.1.0 following the instructions (install, run, shutdown, copy the data directory) and then re-indexed collections, but we continue to have this issue.
Just taking stabs in the dark here, but have you increased the logging level to see if there are any warnings that might give us a clue? If it is related to the structural index, maybe adding xml:id to the data might help us with debugging?
We're using eXist 5.1.0 and have also encountered this problem. Are there any approaches to tackling it besides frequent re-indexing?
I am afraid that there is not enough information provided in this issue for a developer to act upon. We would need to be provided with a reproducible test case; this could simply be a list of the exact steps to take, from a clean install of eXist-db, until the problem occurs.
If sufficient information is provided, then it is much more likely that this will be investigated.