[things-search] "in" filter doesn't work on list anymore

Open bhat-ganesh opened this issue 3 years ago • 9 comments

We have been using the "in" filter on a list in things-search without issue on v1.5.1.

Recently we moved to v2.0.1 and this is not working anymore.

Filter we have been using is in(features/devices/properties/ids,"b1221111ffff")

This is how our thing is structured

"features": {
  "devices": {
    "properties": {
      "ids": [
        "b1221111ffff",
        "b1221111fffe",
        "b1221111fffd"
      ]
    }
  },
  "info": {
    "properties": {
      "name": "my_device_name",
      "description": "list of my devices"
    }
  }
}

No change is mentioned in Swagger api document. in({property},{value},{value},...) (i.e. contains at least one of the values listed)
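For reference, a minimal sketch of how the filter above would be sent to the search API. It assumes the search endpoint is reachable at `/api/2/search/things` with the RQL expression passed in the `filter` query parameter; the base URL is a hypothetical placeholder and authentication is left out.

```python
from urllib.parse import quote

# Hypothetical base URL; replace with the address of your own Ditto gateway.
DITTO_BASE_URL = "http://localhost:8080"


def build_search_url(rql_filter: str, base_url: str = DITTO_BASE_URL) -> str:
    """Return a things-search URL with the RQL filter percent-encoded."""
    # safe="" also encodes "/" so the whole expression stays one query value.
    return f"{base_url}/api/2/search/things?filter={quote(rql_filter, safe='')}"


rql = 'in(features/devices/properties/ids,"b1221111ffff")'
url = build_search_url(rql)
print(url)
```

The sketch only builds the URL; sending it (for example with `urllib.request.urlopen`) needs the credentials of the target installation.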

Is this a regression? or is it not supported on list anymore?

Thank you for your response.

bhat-ganesh avatar Jul 14 '21 20:07 bhat-ganesh

Hi @bhat-ganesh Thanks for reaching out.

To be honest, searching in JSON arrays was never supported in Ditto's search - we are surprised ourselves that it apparently did work with 1.5.1. The in({property},{value},{value},...) operator in the search is documented to search in a single scalar property (see also the example there).

So this is neither a regression nor an intentional drop of support for that feature. We can look into this at a later point, as it has no priority for our use cases. If you depend on that feature, I suggest staying on Ditto 1.5.1 or providing a pull request which re-enables (and documents) it.

Might I ask: do you use Ditto in a commercial setup? If yes, it would be great if you shared that your company adopts Eclipse Ditto: https://iot.eclipse.org/adopters/?#iot.ditto

thjaeckle avatar Jul 15 '21 06:07 thjaeckle

We found something you could try: The "things-search" service has an environment variable named THINGS_SEARCH_UPDATER_STREAM_MAX_ARRAY_SIZE, which is set to 0 by default. You can try setting it to e.g. 25.

Seems that the Ditto search will index JSON arrays with at most that configured amount of entries.

Would be great if you could provide feedback whether this works.

thjaeckle avatar Jul 15 '21 06:07 thjaeckle

From https://github.com/persvr/rql, seems like the correct relational operator should be "contains" (not supported by ditto today) vs "in".

  • contains(<property>,<value | expression>) - Filters for objects where the specified property's value is an array and the array contains any value that equals the provided value or satisfies the provided expression.
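To make the distinction concrete, here is a toy model of the two operators as they are described in this thread. It is only an illustration of the definitions, not the implementation used by Ditto or by persvr/rql; the function names `rql_in` and `rql_contains` are made up for this sketch.

```python
from typing import Any, Iterable


def rql_in(value: Any, candidates: Iterable[Any]) -> bool:
    """in(property, v1, v2, ...): a scalar property equals one of the values."""
    return any(value == candidate for candidate in candidates)


def rql_contains(value: Any, wanted: Any) -> bool:
    """contains(property, v): the property is an array that holds v."""
    return isinstance(value, list) and wanted in value


ids = ["b1221111ffff", "b1221111fffe", "b1221111fffd"]
name = "my_device_name"

print(rql_in(name, ["my_device_name", "other_name"]))  # scalar compared to a value list
print(rql_in(ids, ["b1221111ffff"]))                   # an array is not equal to any scalar
print(rql_contains(ids, "b1221111ffff"))               # array membership
```

Under this strict reading, "in" on an array property never matches, which is why "contains" looks like the better fit for the use case in this issue.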

qthuy avatar Jul 15 '21 11:07 qthuy

We tried "THINGS_SEARCH_UPDATER_STREAM_MAX_ARRAY_SIZE=25" on "things-search" service, but it did not help with the issue (i.e. no result returned from search API).

qthuy avatar Jul 15 '21 12:07 qthuy

@qthuy did you modify the thing so that an update in the search index is performed?

thjaeckle avatar Jul 15 '21 12:07 thjaeckle

I did not previously. Once I modify the thing, the query is working again.
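A sketch of what "modifying the thing" could look like, so that the search index entry is rewritten after the setting was changed. It assumes the feature property is reachable via `PUT /api/2/things/{thingId}/features/{featureId}/properties/{propertyPath}`; the base URL and thing ID are hypothetical placeholders and authentication is omitted.

```python
import json
from urllib.request import Request

# Hypothetical values; replace with your own gateway address and thing ID.
DITTO_BASE_URL = "http://localhost:8080"
THING_ID = "org.example:my-thing"


def build_property_update(thing_id: str, feature_id: str, property_path: str,
                          value, base_url: str = DITTO_BASE_URL) -> Request:
    """Build (but do not send) a PUT request that rewrites one feature property."""
    url = (f"{base_url}/api/2/things/{thing_id}"
           f"/features/{feature_id}/properties/{property_path}")
    body = json.dumps(value).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


req = build_property_update(
    THING_ID, "devices", "ids",
    ["b1221111ffff", "b1221111fffe", "b1221111fffd"],
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Writing the property back, even with unchanged content, is what the sketch uses as the "modification"; whether an identical value is enough to trigger a re-index is an assumption to verify.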

Was this value/default recently changed? Is there anything to consider when setting this value? If the array is bigger than this value, does this mean that Ditto will not search the whole array?

For your other question, we do plan to submit as an adopter once we go live later this year.

qthuy avatar Jul 15 '21 12:07 qthuy

Was this value/default recently changed? Is there anything to consider when setting this value? If the array is bigger than this value, does this mean that Ditto will not search the whole array?

I think there was a bug before Ditto 2.0: the configuration hierarchy was wrong, so the value from the config file (which was already 0, disabling the indexing of arrays) was not applied correctly and the fallback from code (25) was used instead. This was fixed, so the configured 0 now takes effect, which disables the indexing of arrays.

This should not be set to a very high number as this could lead to high document sizes in the search index - but you have to try out and find out which value you would need as upper array size limit.

And yes, additional array entries would not be indexed if the array size is greater than the configured limit.
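The effect of the limit can be pictured with a small model. This is a simplified reading of the statements above (only the first N entries of an array reach the index, and 0 disables array indexing), not Ditto's actual indexing code; the helper names are invented for the sketch.

```python
from typing import List, Sequence


def indexed_entries(values: Sequence[str], max_array_size: int) -> List[str]:
    """Entries that would reach the index, assuming simple truncation at the limit."""
    return list(values[:max(max_array_size, 0)])


def in_filter_matches(values: Sequence[str], wanted: Sequence[str],
                      max_array_size: int) -> bool:
    """True if any indexed entry equals one of the wanted values."""
    return any(entry in wanted for entry in indexed_entries(values, max_array_size))


ids = ["b1221111ffff", "b1221111fffe", "b1221111fffd"]

print(in_filter_matches(ids, ["b1221111ffff"], max_array_size=0))   # limit 0: nothing indexed
print(in_filter_matches(ids, ["b1221111ffff"], max_array_size=25))  # whole array indexed
print(in_filter_matches(ids, ["b1221111fffd"], max_array_size=2))   # third entry is cut off
```

In this model a limit smaller than the array length silently hides the trailing entries from search, so the limit should be at least as large as the longest array that needs to be searchable.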

thjaeckle avatar Jul 15 '21 13:07 thjaeckle

Thanks @thjaeckle for your quick responses and explanation. Can I leave it to you to close this issue, if you think it's just a matter of usage and no change is needed on your side?

bhat-ganesh avatar Jul 16 '21 16:07 bhat-ganesh

@bhat-ganesh we'll leave this issue open - as we should:

  • document that this is possible
  • find out limitations (there are probably some, that's why it is disabled by default now)
  • maybe think about another RQL operator for array operations which fits better (like @qthuy mentioned)

thjaeckle avatar Jul 19 '21 06:07 thjaeckle

ToDo for 3.0.0: Document that searching in arrays is possible and with which restrictions.

With 3.0.0 it is enabled by default.

thjaeckle avatar Sep 02 '22 06:09 thjaeckle