[BUG] attribute filter not working properly on collection

Open PieterLamers opened this issue 4 years ago • 4 comments

Describe the bug

When I try to filter a collection of JATS articles on a particular attribute value, I get no results unless I explicitly convert the attribute value to a string.

Expected behavior

I expect the attribute value to be cast to a string in a value comparison.

To Reproduce

Example (1) does not return a count; example (2) does.

(: 1 :)
let $coll as item()+ := collection('/db/data/journals.benjamins.com/')/article/front/article-meta/contrib-group/contrib/contrib-id[@contrib-id-type eq 'jb-contributor-id']
return count($coll)
(: 2 :)
let $coll as item()+ := collection('/db/data/journals.benjamins.com/')/article/front/article-meta/contrib-group/contrib/contrib-id[@contrib-id-type/string() eq 'jb-contributor-id']
return count($coll)

In addition, string(@contrib-id-type) eq 'jb-contributor-id' also works properly, whereas @contrib-id-type eq 'jb-contributor-id' does not. Representative data has been sent to @adamretter for reproduction.
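For comparison, here is a minimal in-memory sketch of the expected behavior. The fragment and its values are made up for illustration and are not taken from the reported data. Under the standard XQuery rules, a value comparison atomizes the attribute to xs:untypedAtomic and casts it to xs:string, so both predicates should select the same node and both counts should be 1.

```xquery
xquery version "3.1";

(: Hypothetical in-memory fragment; not taken from the reported data. :)
let $article :=
    <article>
        <contrib-id contrib-id-type="jb-contributor-id">example-1</contrib-id>
        <contrib-id contrib-id-type="orcid">example-2</contrib-id>
    </article>
return (
    (: attribute atomized implicitly by the value comparison :)
    count($article//contrib-id[@contrib-id-type eq 'jb-contributor-id']),
    (: attribute converted to a string explicitly :)
    count($article//contrib-id[@contrib-id-type/string() eq 'jb-contributor-id'])
)
```

Because no stored collection is involved, this only illustrates the semantics expected of eq; it does not reproduce the reported failure.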

Context (please always complete the following information):

  • OS: Windows
  • eXist-db version: 5.3.0
  • Java version: 1.8.0_282

PieterLamers, Jul 02 '21 09:07

I have reduced the test case to:

empty(collection('/db/data/journals.benjamins.com/')//contrib-id[@contrib-id-type eq 'jb-contributor-id'])
,
empty(collection('/db/data/journals.benjamins.com/')//contrib-id[@contrib-id-type/string() eq 'jb-contributor-id'])

The result should be false, false; however, the actual result is true, false.

If I replace the fn:collection call with fn:doc and use a document which is known to match the predicate, then the problem goes away and the result is false, false. So the issue is somehow related to the use of the fn:collection function. I suspect a bad optimisation path somewhere...
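The fn:doc variant described above can be sketched as follows. The document path is a hypothetical placeholder, not a path from the report; it would need to be replaced with a document known to contain a matching contrib-id element.

```xquery
xquery version "3.1";

(: Hypothetical path; substitute a stored document known to match the predicate. :)
let $doc-path := '/db/data/journals.benjamins.com/example-article.xml'
return (
    (: same predicate as the failing collection query, scoped to one document :)
    empty(doc($doc-path)//contrib-id[@contrib-id-type eq 'jb-contributor-id']),
    (: same predicate with the explicit string conversion :)
    empty(doc($doc-path)//contrib-id[@contrib-id-type/string() eq 'jb-contributor-id'])
)
```

Running the same pair of predicates against fn:doc and fn:collection isolates the input function as the only difference between the passing and failing cases.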

adamretter, Jul 02 '21 11:07

I have found that the first query:

empty(collection('/db/data/journals.benjamins.com/')//contrib-id[@contrib-id-type eq 'jb-contributor-id'])

is being rewritten by the optimizer to the form:

empty(collection('/db/data/journals.benjamins.com/')/descendant::contrib-id[range:eq(@contrib-id-type, "jb-contributor-id")])

i.e. it is attempting to resolve the predicate through a Range Index lookup.

The second query is not rewritten by the optimizer.

I note that there are no indexes defined in /db/system/config/db/data/journals.benjamins.com, so I will dig deeper and find out why it is trying to use the Range Index, when there is no such index available.
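One way to double-check for index definitions is to query the configuration collection directly. This is a rough sketch, assuming the usual eXist-db layout in which collection configurations are stored under /db/system/config and use the collection-config namespace. It lists any range index definitions found at that path and returns an empty sequence if there are none.

```xquery
xquery version "3.1";

declare namespace cc = "http://exist-db.org/collection-config/1.0";

(: Configuration path mirroring the data collection, as named in the comment above. :)
let $config-path := '/db/system/config/db/data/journals.benjamins.com'
return
    if (xmldb:collection-available($config-path)) then
        (: range index definitions declared in any configuration document found here :)
        collection($config-path)//cc:index/cc:range
    else
        (: no configuration collection exists for this path :)
        ()
```

An empty result here only covers this path. A configuration on an ancestor collection may also apply, so the parent paths under /db/system/config are worth checking in the same way.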

adamretter, Jul 02 '21 12:07

Possibly related: https://github.com/eXist-db/exist/issues/3918

The similarities are that both issues:

  • involve the collection function
  • show that the range index is being invoked unexpectedly (though it's unclear from the report here if a range index has been defined; in 3918 no index was defined)

joewiz, Jul 02 '21 14:07

@joewiz As per my comment above, there is no range index defined here.

adamretter, Jul 04 '21 21:07