onegov-cloud
onegov-cloud copied to clipboard
Feature/ogc 508 replace elastic search by postgres text search
INITIAL FEEDBACK TO TEXT SEARCH IN POSTGRES - WORK IN PROGRESS
Feature/ogc 508 replace elastic search by postgres text search
core: Replace elastic search by postgres full text search
With this changes elastic search (es) is still in place and productive while we introduce the postgres full text search accessible using 'search_postgres' keyword in the url.
es url: http://localhost:8080/onegov_agency/bs/search?q=Arnold
psql url: http://localhost:8080/onegov_agency/bs/search_postgres?q=Arnold
hint: onegov-core upgrade
Main changes:
- Re-indexing
../onegov-cloud/src/onegov/search/cli.py - Upgrade script (uses the same code as re-indexing)
../onegov-cloud/src/onegov/search/upgrade.py - Extension of mixin class Searchable
.. /onegov-cloud/src/onegov/search/mixins.py - Adjusting all searchable models adding the fts index in postgres
Open Items:
- unit tests are not reworked yet
- language configuration
- improve ranking
Checklist
- [x] I have performed a self-review of my code
- [x] I considered adding a reviewer
- [x] I have added an upgrade hint such as data migration commands to be run
- [x] I made changes/features for both org and town6
- [ ] I have tested my code thoroughly by hand
- [x] I have tested database upgrades
- [ ] I have added tests for my changes/features
Codecov Report
Attention: 54 lines in your changes are missing coverage. Please review.
Comparison is base (
fa0d2b9) 88.03% compared to head (9e454e3) 35.09%. Report is 16 commits behind head on master.
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## master #918 +/- ##
===========================================
- Coverage 88.03% 35.09% -52.95%
===========================================
Files 1177 71 -1106
Lines 77378 4591 -72787
===========================================
- Hits 68118 1611 -66507
+ Misses 9260 2980 -6280
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@Tschuppi81 As a side note you probably should merge master again (or at least cherry pick 25d7d0e8436c537c8a314c0b742ebe0385fafed3) to skip the currently broken browser tests.
Escaping in postgres is done using quotes around column name "
@Daverball In the above commit I am fighting with more complex hybrid expressions. Most of them gather data from other tables and I could not figure out how to properly join the corresponding tables. Currently the 'reindexing' fails in CourseEvent.
Note: This commit to be reverted - just shows the join problem...
@Tschuppi81 This is going to be tricky in general and probably a bad idea in terms of performance anyways, you are going to be better off adding caches for the joined attributes and using @observes methods to keep the caches up to date.
Or we could give up on calculating the search index online, since we're still quite a ways off and just generate a dictionary offline of all the search values which then can be turned into a TSVECTOR directly if interpreted as JSONB. While somewhat inefficient, this would still be quicker than what we had before with elasticsearch, since we save ourselves from having to talk with the elasticsearch server. It would also get rid of the issues related to polymorphism, since SQLAlchemy is aware of what polymorphic identity we're talking about, while in postgres we would have to write CASE statements to distinguish the various versions in the same table.