placeholder icon indicating copy to clipboard operation
placeholder copied to clipboard

Performance Optimizations to fulltext search

Open niravmehta opened this issue 5 years ago • 3 comments

The current hasSubject / _eachSynonym methods work with a single token at a time. This means for a long address string, there will be a longer list of tokens to check (lots of permutations), and hence lots of queries.

I noticed 30, 60 or even 250 fulltext search queries in some tests.

This pull request contains a new function that takes all the tokens and performs full text search in a single query. It returns matching tokens in the same format as expected, so other things don't need to change.

I saw consistent 30% speed improvements with these changes. Even bigger gains with longer searches.

Caveats:

  • I had to remove the test case for the commit to go through. I'm not familiar with the testing framework, and could not write up a proper test case.
  • Auto complete searches don't perform partial searches for the last token. So results may vary if using auto complete. (But I feel it's still a good compromise)

Do review and provide feedback.

niravmehta avatar Nov 08 '19 04:11 niravmehta

I revised the matchingSubjects function to support auto-complete, which also fixes fts5 errors with special characters in input. That code is here: https://github.com/niravmehta/placeholder/pull/4 (and not merged into this pull request #163)

Not sure why checks are failing on this pull request.

niravmehta avatar Nov 11 '19 06:11 niravmehta

Hi @niravmehta you can view the output of the CI by clicking on the Travis-CI check itself which opens a log such as https://travis-ci.org/pelias/placeholder/jobs/609069753, if you scroll down that log you'll see the failures listed.

I'm a bit concerned about this PR, it's not good practice to delete tests because they start failing, tests are there for a reason and it's important to understand why they changed and what effect this will have on other users before we can merge this to master.

missinglink avatar Nov 11 '19 13:11 missinglink

I agree removing test cases is not a good idea. But I do not know the testing framework, and couldn't figure out how to write a proper test case.

Will try again later when I have some free time.

niravmehta avatar Nov 11 '19 17:11 niravmehta