openfoodfacts-server Make suggest.pl efficient and exact by using a sqlite db

Open alexgarel opened this issue 2 years ago • 3 comments

Following on https://github.com/openfoodfacts/openfoodfacts-dart/issues/465

I think a quick and easy way to be more exact and efficient would be the following:

1. at startup, if needed (taxonomy sto more recent than the db), build a sqlite database for the taxonomy
2. use this database for suggest.pl

It is easy to make step 1 "atomic" by building the db under a different name and moving the file into place when done, in case we want to offload this db creation to a minion.
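A minimal sketch of the "build under another name, then move" idea from step 1, written in Python for illustration rather than in the Perl of suggest.pl. The function names, the file paths and the three-column `terms` table are assumptions made for the example, not existing code in the server.

```python
import os
import sqlite3
import tempfile


def needs_rebuild(taxonomy_path, db_path):
    """Return True when the db is missing or older than the taxonomy file."""
    if not os.path.exists(db_path):
        return True
    return os.path.getmtime(taxonomy_path) > os.path.getmtime(db_path)


def build_db_atomically(db_path, rows):
    """Build the db under a temporary name, then rename it into place.

    `rows` is an iterable of (lang, term, entry_id) tuples (hypothetical layout).
    """
    directory = os.path.dirname(os.path.abspath(db_path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".sqlite.tmp", dir=directory)
    os.close(fd)
    try:
        conn = sqlite3.connect(tmp_path)
        conn.execute("CREATE TABLE terms (lang TEXT, term TEXT, entry_id TEXT)")
        conn.executemany("INSERT INTO terms VALUES (?, ?, ?)", rows)
        conn.commit()
        conn.close()
        # os.replace overwrites the target in a single rename, so readers see
        # either the old db or the complete new one, never a half-built file.
        os.replace(tmp_path, db_path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the rename only happens after the commit, the same function could run in a background worker while the old db keeps serving suggestions.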

The DB structure would be a table of terms by language: each term is one part of the normalized form (id) split on "-", and is attached to the taxonomy entry it comes from. The suggest would then be a simple SQL query with the "like" operator on the terms. The results could be scored and refined by Perl code (though it may be feasible to score directly in SQL).
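The table and query described above could look roughly like the following sketch, again in Python for illustration. The `lang:normalized-id` shape of the entry ids, the table and column names, and the scoring rule (entries whose id starts with the input first, then shorter ids) are all assumptions for the example. It only handles a single-word input.

```python
import sqlite3


def split_terms(entry_id):
    """Split the normalized id on "-", dropping an optional "lang:" prefix."""
    normalized = entry_id.split(":", 1)[-1]
    return [term for term in normalized.split("-") if term]


def build_index(conn, entries):
    """Store one row per (language, term, taxonomy entry)."""
    conn.execute("CREATE TABLE terms (lang TEXT, term TEXT, entry_id TEXT)")
    conn.execute("CREATE INDEX terms_lang_term ON terms (lang, term)")
    for lang, entry_id in entries:
        conn.executemany(
            "INSERT INTO terms VALUES (?, ?, ?)",
            [(lang, term, entry_id) for term in split_terms(entry_id)],
        )
    conn.commit()


def suggest(conn, lang, user_input, limit=10):
    """Prefix-match the input against terms with LIKE, then score in code."""
    prefix = user_input.strip().lower()
    # Escape LIKE wildcards so the user input is matched literally.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT DISTINCT entry_id FROM terms "
        "WHERE lang = ? AND term LIKE ? ESCAPE '\\'",
        (lang, escaped + "%"),
    ).fetchall()

    def score(entry_id):
        normalized = entry_id.split(":", 1)[-1]
        # Lower tuples sort first: whole-id prefix matches, then shorter ids.
        return (0 if normalized.startswith(prefix) else 1, len(normalized), normalized)

    return sorted((row[0] for row in rows), key=score)[:limit]


conn = sqlite3.connect(":memory:")
build_index(conn, [
    ("en", "en:chocolates"),
    ("en", "en:dark-chocolates"),
    ("en", "en:milk-chocolates"),
    ("en", "en:cheeses"),
    ("fr", "fr:chocolats"),
])
print(suggest(conn, "en", "choc"))
# ['en:chocolates', 'en:dark-chocolates', 'en:milk-chocolates']
```

Splitting on "-" is what lets "choc" find "dark-chocolates" even though the id does not start with the input. Synonyms could be added as extra rows pointing at the same entry id.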

I think this could also enable adding synonyms to the mix.

This solution is only targeted to suggest.

A better alternative could be to simply use elasticsearch, quickwit, or any real "full text" search index, as it may unlock more capabilities in the future.

Jun 13 '22 08:06 alexgarel

@stephanegigandet I'm not sure whether it's a good idea, but it is quite easy to implement.

@teolemon I think that having a good match on suggestion could really help users in adding the exact category to products.

Jun 13 '22 08:06 alexgarel

Related discussion: https://github.com/openfoodfacts/openfoodfacts-server/issues/5006

Jun 13 '22 12:06 stephanegigandet

Elastic Search

Jul 28 '22 09:07 teolemon

This issue has been open 90 days with no activity. Can you give it a little love by linking it to a parent issue, adding relevant labels and projects, creating a mockup if applicable, adding code pointers from https://github.com/openfoodfacts/openfoodfacts-server/blob/main/.github/labeler.yml, giving it a priority, editing the original issue to have a more comprehensive description… Thank you very much for your contribution to 🍊 Open Food Facts

Jan 05 '24 00:01 github-actions[bot]

We will use search-a-licious to implement that

Jan 18 '24 18:01 alexgarel
