wger icon indicating copy to clipboard operation
wger copied to clipboard

Use postgres' full text search

Open rolandgeider opened this issue 4 years ago • 6 comments

Use case

Get better results for exercises and specially ingredients with full text search

Proposal

If using postgres, we should use its full text search capabilities so that we get better results and smooth out typos (search in exercises.api.views and nutrition.api.views). A short check of the connection engine should make easy to use the current filter if that's not the case. We should also document how to add an index to the db.

https://docs.djangoproject.com/en/3.2/ref/contrib/postgres/search/

from django.db import connection
if 'postgres' in connection.settings_dict['ENGINE']:
    ...
else:
    ...

rolandgeider avatar Apr 13 '21 16:04 rolandgeider

I recently worked on integrating full text search using postgres. I could take a look but I had a question. Is the project open to using database of multiple types? If it's going to be hosted using Postgres only then it would be better to generate a migration where we generate search vectors on fields (using SearchVectorField) that will remain indexed and the searches will operate against those fields. If not we can just use the default searchvector strategy to filter the text where we have to annotate to leverage the postgres full text search.

Alig1493 avatar Oct 09 '21 17:10 Alig1493

While the docker images and the docs use postgres, I wouldn't want to make it the only option (and e.g. during development I only use sqlite as it is very easy to work with). Does django support writing backend specific migrations? That would be probably the easiest

rolandgeider avatar Oct 09 '21 18:10 rolandgeider

Not sure about backend specific migrations but django does have post migration signals as far as I know. So based on this signal we can check the db type and run our own custom migrations. better yet add our own migration files which run some aspect of the code if it has a specific db backend type.

Alig1493 avatar Oct 10 '21 09:10 Alig1493

I've looked around a bit and found... little, but I think it's possible to detect, e.g. this answer on stackoverflow or we could probably do something like this and filter out non-postgres dbs

rolandgeider avatar Oct 10 '21 10:10 rolandgeider

At the end of the day however it's all down to having your custom operations that you either write a class for or have it invoked as a series for forward, backward functions with conditions checking for db type. Also if we do write a migrations for this we would also need to define a column where would store the tsvector data types (which need to be indexed as well for better performance) of the column against which we are going to query with our search parameters. If the app isn't primarily running a posgres engine it will be tricky to add model fields dynamically based on db engine types. I might not have all the answers regarding this but this what I know thus far.

Alig1493 avatar Oct 10 '21 16:10 Alig1493

https://pganalyze.com/blog/full-text-search-django-postgres

rolandgeider avatar Feb 12 '22 19:02 rolandgeider

Some prototyping can be found in the branch feature/pg-fulltext-search

rolandgeider avatar Dec 07 '22 20:12 rolandgeider

Is anyone working on this issue? If not, I want to work on it.

uzza1hossain avatar Dec 11 '22 10:12 uzza1hossain

Sure @uzza1hossain , go ahead

rolandgeider avatar Dec 11 '22 10:12 rolandgeider

@rolandgeider, you tried to implement using SearchVectorField, should I also use this approach, or SearchVector enough? Also, which column should I use as a search_column? I can't find any column without 'name' relevant to the Ingredient model (table).

For the Exercise model (table) 'name' and 'description' column is appropriate for the search_column.

uzza1hossain avatar Dec 11 '22 17:12 uzza1hossain

SearchVector might be enough, I was following the tutorial from pganalyze.com. For the ingredients we will just index the name, but should keep this flexible since we might add categories, brands or similar in the future. For exercises I'd say name, alias and description

rolandgeider avatar Dec 12 '22 14:12 rolandgeider

Hey, whats the status? if its open i can give it a try

vbsh123 avatar Jun 25 '23 18:06 vbsh123

Hi @vbsh123 I don't think much has happened since then. If you want to try, go ahead, this would be really useful!

rolandgeider avatar Jun 25 '23 18:06 rolandgeider

Hi, I would love to work on this issue. This would be my first as a contributor dev. I've had some experience working with PostgreSQL on the back-end of web apps. I look forward to giving it a shot, if that's alright.

eriklolson avatar Jul 22 '23 03:07 eriklolson

@eriklolson it's yours

rolandgeider avatar Jul 22 '23 09:07 rolandgeider

@eriklolson Hi! Any news? :)

rolandgeider avatar Sep 05 '23 13:09 rolandgeider

@rolandgeider Hi there! As a relative newbie to Django I've been doing some tutorials to familiarize myself with the ins and outs of the framework. I focused on those that use Postgres database. I've also read the Postgres full text search documentation and that for Django. I'm now nearly finished with a full text search tutorial. After this, I will be fully prepared to tackle this issue in a more effective manner.

I should also mention I was able to rebase the master thread onto the Postgres branch, after resolving some conflicts. I did this because the Postgres branch was very behind the master.

eriklolson avatar Sep 17 '23 02:09 eriklolson

awesome! If I can assist you in any way, just ping me

I also wanted to mention that a rebase would be needed, but you already did that :)

rolandgeider avatar Sep 17 '23 12:09 rolandgeider

There is a PR that simply uses a trigram search, which seems to return better results for our use case such as simple typos, different spellings (pullup / pull-up / pull up).

For posterity, there is this mostly working code for the "real thing" will full text search in this commit: https://github.com/wger-project/wger/pull/1594/commits/8b0e3870a37754ea2229853a83c64d1a67dab1ab

rolandgeider avatar Mar 02 '24 14:03 rolandgeider