Extend query parameters by adding query clauses

Open micheleriva opened this issue 2 years ago • 7 comments

Is your feature request related to a problem? Please describe.

As of now, Lyra is capable of indexing documents with searchable and non-searchable fields.

For instance, given the following schema, we index the following fields:

import { Lyra } from '@nearform/lyra';

const movieDB = new Lyra({
  schema: {
    // searchable fields
    title: 'string',
    director: 'string',
    plot: 'string',

    // non searchable
    year: 'number',
    isFavorite: 'boolean'
  }
});

Even though numbers and booleans are non-searchable fields, we should start using them to filter queries through a where keyword. An example could be:

const result = await movieDB.search({
  term: 'love',
  limit: 10,
  offset: 5,
  where: {
    year: { '>=': 1990 },
    isFavorite: true
  }
});

As a first iteration, we could support AND only (so the example above would mean WHERE year >= 1990 AND isFavorite = true). In the future, we might want to support OR, CONTAINS, etc. as well.
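To make the AND-only semantics concrete, here is a minimal sketch that checks a single document against a where clause shaped like the one proposed above. The names `WhereClause`, `NumberClause`, `comparators`, and `matchesWhere` are hypothetical and are not part of Lyra's API; the set of comparison operators is also an assumption, since the proposal only shows `>=`.

```typescript
// Hypothetical types for illustration only; not Lyra's actual API.
type Comparison = '>' | '>=' | '<' | '<=' | '=';
type NumberClause = Partial<Record<Comparison, number>>;
type WhereClause = Record<string, boolean | NumberClause>;
type Doc = Record<string, string | number | boolean>;

const comparators: Record<Comparison, (a: number, b: number) => boolean> = {
  '>': (a, b) => a > b,
  '>=': (a, b) => a >= b,
  '<': (a, b) => a < b,
  '<=': (a, b) => a <= b,
  '=': (a, b) => a === b,
};

// Returns true only if the document satisfies every condition (AND semantics).
function matchesWhere(doc: Doc, where: WhereClause): boolean {
  return Object.entries(where).every(([prop, condition]) => {
    const value = doc[prop];
    if (typeof condition === 'boolean') {
      return value === condition;
    }
    if (typeof value !== 'number') {
      return false; // numeric conditions never match non-numeric fields
    }
    return (Object.entries(condition) as [Comparison, number][]).every(
      ([op, operand]) => comparators[op](value, operand)
    );
  });
}

// Made-up sample documents.
const docs: Doc[] = [
  { title: 'Love Actually', year: 2003, isFavorite: true },
  { title: 'Love Story', year: 1970, isFavorite: true },
  { title: 'Crazy, Stupid, Love', year: 2011, isFavorite: false },
];

const filtered = docs.filter((doc) =>
  matchesWhere(doc, { year: { '>=': 1990 }, isFavorite: true })
);
```

This version filters documents after they have been retrieved, which is the simplest reading of the proposal; it says nothing about how the fields would be indexed.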

micheleriva avatar Jul 16 '22 07:07 micheleriva

How can this be done? Do we first retrieve the documents and then apply the conditions, or are we also indexing non-searchable fields?

DanieleFedeli avatar Jul 18 '22 07:07 DanieleFedeli

@DanieleFedeli I think we could store non-searchable parameters in reverse indexes and then access these properties in constant time during the search. I am currently working on the stemming issue, but I can try to illustrate my proposal with a simple, non-working, demonstrative PR.
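A rough sketch of the lookup step this idea implies, assuming the reverse index can hand back a Set of document ids for each condition. The function `filterHits` and the sample ids are hypothetical; how the sets are built is left out on purpose.

```typescript
type DocId = string;

// Keeps the hits that appear in every id set (AND across conditions).
// Each Set.has call is a constant-time lookup on average.
function filterHits(hits: DocId[], idSets: Set<DocId>[]): DocId[] {
  return hits.filter((id) => idSets.every((ids) => ids.has(id)));
}

// Hypothetical reverse-index entries, assumed to be built at insertion time.
const favoriteIds = new Set<DocId>(['doc-1', 'doc-3']); // isFavorite = true
const from1990Ids = new Set<DocId>(['doc-1', 'doc-2']); // year >= 1990

// Ids returned by the full-text search for the term, in ranking order.
const termHits: DocId[] = ['doc-3', 'doc-2', 'doc-1'];

const result = filterHits(termHits, [favoriteIds, from1990Ids]);
```

Filtering the hit list this way preserves the ranking order of the text search, and the cost grows with the number of hits rather than with the size of the collection.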

micheleriva avatar Jul 18 '22 08:07 micheleriva

TL;DR

[image: 6i3b18 (1)]

micheleriva avatar Jul 18 '22 08:07 micheleriva

I am working on that; so far I have only created the type for the where clause.

DanieleFedeli avatar Jul 18 '22 15:07 DanieleFedeli

For now I am using these data structures for indexing non-searchable fields:

  • Boolean: HashMap. Key: ${PropName}_${booleanValue}. Value: Set which contains ids.
  • Number: HashMap. Key: ${PropName}. Value: another HashMap (Key: numeric value, Value: Set which contains ids).

Imho the boolean approach is ok but I personally don't like the numeric one. For exact numbers, it works well with just two reads in O(1) but when we have to do a range query it is a bit hacky.
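The two structures described above could look roughly like this. It is only a sketch built from the description in this comment, not the code in the PR, and every function name (`indexBoolean`, `indexNumber`, `findEqual`, `findAtLeast`) is hypothetical. It also shows why the range case is awkward: an exact match needs two map reads, while a range has to visit every distinct value stored for the property.

```typescript
type DocId = string;

// Boolean index: key is `${propName}_${booleanValue}`, value is the set of ids.
const booleanIndex = new Map<string, Set<DocId>>();
// Number index: property name -> numeric value -> set of ids.
const numberIndex = new Map<string, Map<number, Set<DocId>>>();

function indexBoolean(prop: string, value: boolean, id: DocId): void {
  const key = `${prop}_${value}`;
  const ids = booleanIndex.get(key) ?? new Set<DocId>();
  ids.add(id);
  booleanIndex.set(key, ids);
}

function indexNumber(prop: string, value: number, id: DocId): void {
  const byValue = numberIndex.get(prop) ?? new Map<number, Set<DocId>>();
  const ids = byValue.get(value) ?? new Set<DocId>();
  ids.add(id);
  byValue.set(value, ids);
  numberIndex.set(prop, byValue);
}

// Exact match: two constant-time reads (property, then value).
function findEqual(prop: string, value: number): Set<DocId> {
  return numberIndex.get(prop)?.get(value) ?? new Set<DocId>();
}

// Range match: every distinct value stored for the property is visited.
function findAtLeast(prop: string, min: number): Set<DocId> {
  const result = new Set<DocId>();
  numberIndex.get(prop)?.forEach((ids, value) => {
    if (value >= min) {
      ids.forEach((id) => result.add(id));
    }
  });
  return result;
}

// Made-up sample data.
indexNumber('year', 2003, 'doc-1');
indexNumber('year', 1970, 'doc-2');
indexNumber('year', 2011, 'doc-3');
indexBoolean('isFavorite', true, 'doc-1');

const exact = findEqual('year', 2003);
const recent = findAtLeast('year', 1990);
```

With this layout the cost of a range query depends on how many distinct values the property has, not on how many documents fall inside the range.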

Do you have any suggestions on that?

DanieleFedeli avatar Jul 19 '22 08:07 DanieleFedeli

My PR is almost ready; I will try to run some benchmarks.

DanieleFedeli avatar Jul 20 '22 07:07 DanieleFedeli

@DanieleFedeli quick follow-up: we're planning a plugin system, and we'll probably move the query clauses into a separate plugin. For now I'll keep your PR open, as it might get merged as a plugin.

micheleriva avatar Aug 02 '22 09:08 micheleriva

Once the where clause lands, will facet filters become available too?

nicksav avatar Dec 11 '22 22:12 nicksav

I am building this as a plugin (https://github.com/DanieleFedeli/lyra-advanced-query-plugin). For the moment it is pretty raw and it is not published to npm yet.

DanieleFedeli avatar Jan 08 '23 19:01 DanieleFedeli

Released in v.0.4.9

micheleriva avatar Feb 15 '23 13:02 micheleriva