Any plan to add more data types and custom scoring over several fields?

Open luikore opened this issue 5 years ago • 4 comments

We'd like to switch from ES to Sonic. However, we already have many custom scoring functions over different document attributes, and that does not seem to be possible in Sonic yet.

Is there any plan to add such abilities? Anything we can do to help?

For example, a common use case is "search hot posts". Hotness is a numeric value stored as a document attribute. If there were a way to use extra numeric attributes to tweak search results, I think Sonic would become a much more powerful search engine.

luikore avatar Apr 28 '19 03:04 luikore

Hi! As I’d like to keep things simple with Sonic, I don’t plan to add custom scoring functions.

What would those be in your case specifically?

valeriansaliou avatar Apr 28 '19 05:04 valeriansaliou

Hi,

I understand the question behind this issue: suppose I have a title field, an author field, and maybe a slug field. How can I integrate this data into Sonic, and how can I make the title more important than the author field?

poupryc avatar Apr 28 '19 21:04 poupryc

Btw, as you mentioned "data types", I thought you also meant indexing collections and buckets by something other than Unicode strings, e.g. arbitrary byte strings. That could be useful too.

@HelloEdit I believe you would need to apply additional scoring after retrieving the documents' ids as the Sonic index does not store the documents themselves.
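A minimal sketch of that post-retrieval scoring, assuming the application keeps a numeric "hotness" value per document in its own store. The `HOTNESS` mapping, the `post:N` IDs, and the `rerank_by_hotness` helper are hypothetical; no Sonic client call is shown, only the re-ranking of IDs that a query would have returned.

```python
# Hypothetical stand-in for the application's own database: document ID -> hotness.
HOTNESS = {"post:1": 12.0, "post:2": 250.0, "post:3": 3.5}


def rerank_by_hotness(ids, hotness, default=0.0):
    """Re-order IDs returned by a search using an externally stored numeric attribute."""
    # sorted() is stable, so documents with equal hotness keep the search engine's order.
    return sorted(ids, key=lambda doc_id: hotness.get(doc_id, default), reverse=True)


# IDs in the order a search might have returned them (made-up data).
sonic_ids = ["post:3", "post:1", "post:2"]
ranked = rerank_by_hotness(sonic_ids, HOTNESS)
print(ranked)
```

Documents missing from the mapping fall back to `default`, so they sink to the end of the list rather than raising an error.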

vilunov avatar May 05 '19 08:05 vilunov

One strategy would be to ingest each "field" using the bucket strategy in the same index. Something like this:

products > default > product:title:ps5product-id > "playstation 5"
products > default > product:desc:ps5:product-id > "this console isn't released yet, you can buy a playstation 4 as of now"
products > default > product:comments:ps5product-id:comment-id > "i just pre ordered this, but I have a playstation 3"
products > default > product:comments:ps5product-id:comment-id > "i love playstation"
products > default > product:comment:xbox1product-id:comment-id > "i think xbox is better than playstation"

So when you search default for "playstation", you will get back these objects:

product:title:ps5product-id
product:desc:ps5:product-id
product:comments:ps5product-id:comment-id
product:comments:ps5product-id:comment-id
product:comment:xbox1product-id:comment-id

Then your API that proxies Sonic or your front-end can do the scoring/sorting based on the object buckets before retrieving the actual objects from your database.
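As a rough sketch of that proxy-side scoring, assuming object IDs follow a consistent `product:<field>:<product-id>[:<comment-id>]` layout (a tidied-up version of the examples above). The field weights, the shortened IDs, and the `score_products` helper are all hypothetical.

```python
from collections import defaultdict

# Assumed per-field weights: a title hit counts more than a comment hit.
FIELD_WEIGHTS = {"title": 10.0, "desc": 3.0, "comments": 1.0}


def score_products(object_ids, weights=FIELD_WEIGHTS):
    """Sum the field weights of every hit per product and rank products by total."""
    scores = defaultdict(float)
    for object_id in object_ids:
        parts = object_id.split(":")
        if len(parts) < 3:
            continue  # skip IDs that do not follow the assumed layout
        field, product_id = parts[1], parts[2]
        scores[product_id] += weights.get(field, 0.0)
    # Highest total score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Object IDs as a search for "playstation" might return them (made-up data).
hits = [
    "product:title:ps5",
    "product:desc:ps5",
    "product:comments:ps5:c1",
    "product:comments:ps5:c2",
    "product:comments:xbox1:c3",
]
ranking = score_products(hits)
print(ranking)
```

The ranked product IDs can then be used to fetch the actual records from the database in that order.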

You could also mix score weights into this strategy. For example, prefix the object identifier with a weight, as in 10:product:title:ps5product-id or 1:product:comments:ps5product-id:comment-id, so that when you get the results from Sonic you can simply sort them in descending order and then retrieve them from the database.
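A small sketch of sorting on such a weight prefix, assuming every object ID starts with an integer weight followed by a colon. The `split_weight` and `order_by_weight` helpers and the sample IDs are hypothetical. The weight is parsed as a number because a plain string sort would place "9:..." ahead of "10:...".

```python
def split_weight(object_id):
    """Split '10:product:title:ps5' into (10, 'product:title:ps5')."""
    prefix, _, rest = object_id.partition(":")
    # int() raises ValueError if the ID lacks a numeric prefix.
    return int(prefix), rest


def order_by_weight(object_ids):
    """Return object IDs without their prefix, highest weight first."""
    pairs = [split_weight(object_id) for object_id in object_ids]
    # Sort on the numeric weight only; ties keep their original order.
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return [rest for _weight, rest in pairs]


# Made-up results carrying a weight prefix.
results = [
    "1:product:comments:ps5:c1",
    "10:product:title:ps5",
    "9:product:desc:ps5",
]
ordered = order_by_weight(results)
print(ordered)
```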

andersonsantos avatar May 06 '19 12:05 andersonsantos