openfoodfacts-server
Make suggest.pl efficient and exact by using a sqlite db
Following on https://github.com/openfoodfacts/openfoodfacts-dart/issues/465
I think a quick and easy way to be more exact and efficient would be the following:
- at startup, if needed (taxonomy sto file more recent than the db), build a sqlite database for the taxonomy
- use this database for suggest.pl
For step 1, it is easy to make this "atomic" by building the db under a different name and moving the file into place when done, in case we want to offload the db creation to a minion.
The DB structure would be a table of terms by language: each term comes from splitting an entry's normalized form (id) on "-", and is attached to the corresponding taxonomy entry. The suggest would then be a simple SQL query with the "like" operator on the terms. The results could be scored and refined by Perl code (but it may be feasible to score directly in SQL).
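A minimal sketch of the "build under another name, then move" idea, written in Python rather than Perl purely for illustration. The function names (`needs_rebuild`, `build_atomically`) and the `populate` callback are hypothetical, not existing code in the server. It assumes the temporary file lives in the same directory as the final db, so that the rename stays on one filesystem.

```python
import os
import sqlite3
import tempfile


def needs_rebuild(taxonomy_path, db_path):
    """Return True when the db is missing or older than the taxonomy file."""
    if not os.path.exists(db_path):
        return True
    return os.path.getmtime(taxonomy_path) > os.path.getmtime(db_path)


def build_atomically(db_path, populate):
    """Build the db in a temporary file, then rename it over db_path.

    `populate` is a hypothetical callback that receives an open sqlite3
    connection and fills it. Readers of db_path see either the old file
    or the complete new one, never a half-built db.
    """
    directory = os.path.dirname(os.path.abspath(db_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".sqlite.tmp", dir=directory)
    os.close(fd)
    try:
        conn = sqlite3.connect(tmp_path)
        try:
            populate(conn)
            conn.commit()
        finally:
            conn.close()
        # Rename over the target once the build has fully succeeded.
        os.replace(tmp_path, db_path)
    except BaseException:
        # Leave any existing db untouched and clean up the partial file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the build fails midway, the previous db (if any) stays in place, which is what makes it safe to run from a background job.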
I think this could also enable adding synonyms to the mix.
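The table and query described above could look roughly like the following sketch (again in Python for illustration, not the actual suggest.pl). The table name `terms`, its columns, and the functions `populate` and `suggest` are assumptions. The score here is simply the number of an entry's terms that start with one of the typed words, which is only one of many possible scorings. Synonyms could be added as extra rows pointing at the same `entry_id`.

```python
import sqlite3

# Hypothetical schema: one row per (language, term, taxonomy entry).
SCHEMA = """
CREATE TABLE terms (
    lang TEXT NOT NULL,
    term TEXT NOT NULL,
    entry_id TEXT NOT NULL
);
CREATE INDEX terms_lang_term ON terms (lang, term);
"""


def populate(conn, names):
    """Fill the table from (entry_id, lang, normalized_name) tuples.

    Each normalized name is split on "-" and every piece becomes a term.
    """
    conn.executescript(SCHEMA)
    rows = [
        (lang, term, entry_id)
        for entry_id, lang, normalized in names
        for term in normalized.split("-")
        if term
    ]
    conn.executemany(
        "INSERT INTO terms (lang, term, entry_id) VALUES (?, ?, ?)", rows
    )


def like_prefix(word):
    """Escape LIKE wildcards in user input and append a trailing %."""
    escaped = (
        word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    return escaped + "%"


def suggest(conn, lang, text, limit=10):
    """Return (entry_id, score) pairs, best score first."""
    words = [w for w in text.lower().replace(" ", "-").split("-") if w]
    if not words:
        return []
    clauses = " OR ".join("term LIKE ? ESCAPE '\\'" for _ in words)
    sql = (
        "SELECT entry_id, COUNT(*) AS score FROM terms "
        "WHERE lang = ? AND (" + clauses + ") "
        "GROUP BY entry_id ORDER BY score DESC, entry_id LIMIT ?"
    )
    params = [lang] + [like_prefix(w) for w in words] + [limit]
    return conn.execute(sql, params).fetchall()
```

For example, after populating an in-memory db with a few `en` entries, `suggest(conn, "en", "orange ju")` ranks entries matching both words above entries matching only one.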
This solution is only targeted to suggest.
A better alternative could be to simply use elasticsearch, quickwit, or any real "full text" search index, as it may unlock more capabilities in the future.
@stephanegigandet I'm not sure whether it's a good idea, but it is quite easy to implement.
@teolemon I think that having a good match on suggestion could really help users in adding the exact category to products.
Related discussion: https://github.com/openfoodfacts/openfoodfacts-server/issues/5006
Elastic Search
This issue has been open 90 days with no activity. Can you give it a little love by linking it to a parent issue, adding relevant labels and projects, creating a mockup if applicable, adding code pointers from https://github.com/openfoodfacts/openfoodfacts-server/blob/main/.github/labeler.yml, giving it a priority, editing the original issue to have a more comprehensive description… Thank you very much for your contribution to 🍊 Open Food Facts
We will use search-a-licious to implement that