Implement Balanced Full-Text Search for Diary Entries

Open kcne opened this issue 5 months ago • 4 comments

This PR adds search functionality to the diary entries page so that users can find entries by title and body (see #3289).

Key Changes

  • Model Updates:

    • Integrated the pg_search gem to enable full-text search on diary entries with relevance ranking.
    • Added a searchable column of type tsvector to hold precomputed search data, and created an index on it so that queries stay fast.
  • Controller Enhancements:

    • Modified the index action in DiaryEntriesController to support search queries, allowing users to filter results by keywords in titles and bodies, with optional language filtering.
  • Testing:

    • Added tests to verify the search functionality, including keyword search, language filtering, and basic edge cases.

Context

The methods explored previously, the PostgreSQL LIKE operator and pg_trgm search, each had a drawback on a dataset the size of the diary entries (600,000+ records). LIKE was fast but offered no relevance ranking, while pg_trgm could take over 40 seconds to run, which made it unfeasible. The pg_search gem, using its tsearch feature (PostgreSQL's built-in full-text search), balances relevance and performance, though it requires additional database migrations and optimizations. Even with those migrations, this approach is a "happy medium" that gives a reasonable trade-off between speed and search result quality, IMHO.
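
To make the comparison concrete, here is a rough sketch of the kind of SQL each of the three approaches boils down to. It is illustrative only and is not taken from the diff: the helper name search_sql, the :q placeholder, the 'english' text search configuration, and the created_at ordering for LIKE are all assumptions, and the table and column names (diary_entries, title, body, searchable) are inferred from the description above.

```ruby
# Hypothetical helper, for illustration only: returns the kind of SQL each
# search strategy would issue. :q stands for the user's search term.
# Names and the 'english' configuration are assumptions, not the PR's code.
def search_sql(strategy)
  case strategy
  when :like
    # Substring match: simple, but there is no relevance score to sort by,
    # so results fall back to some other ordering (here: newest first).
    <<~SQL
      SELECT id FROM diary_entries
      WHERE title LIKE '%' || :q || '%' OR body LIKE '%' || :q || '%'
      ORDER BY created_at DESC
    SQL
  when :trigram
    # pg_trgm: the % operator filters by trigram similarity and
    # similarity() provides a score to rank by.
    <<~SQL
      SELECT id, similarity(title || ' ' || body, :q) AS rank
      FROM diary_entries
      WHERE (title || ' ' || body) % :q
      ORDER BY rank DESC
    SQL
  when :tsearch
    # Full-text search against the precomputed tsvector column:
    # @@ matches the query, ts_rank() scores each matching row.
    <<~SQL
      SELECT id, ts_rank(searchable, plainto_tsquery('english', :q)) AS rank
      FROM diary_entries
      WHERE searchable @@ plainto_tsquery('english', :q)
      ORDER BY rank DESC
    SQL
  else
    raise ArgumentError, "unknown strategy: #{strategy.inspect}"
  end
end

puts search_sql(:tsearch)
```

Only the tsearch variant reads from the precomputed searchable column, which is why it needs the extra migrations but can avoid recomputing anything per row at query time.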

Commit Summary

  • Add pg_search gem: Introduced pg_search for full-text search capabilities.
  • Add searchable column to diary entries migration: Created a tsvector column for storing precomputed search data.
  • Add index to searchable diary entries migration: Indexed the searchable column for performance improvements.
  • Add search query logic to diary entry model and controller: Integrated search functionality into the model and controller.
  • Add search functionality tests to DiaryEntriesController: Added basic tests to verify the new functionality.
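
As a sketch of what the two migration commits might execute, the statements below add a stored tsvector column and a GIN index on it. This is an assumption about one possible shape, not the PR's actual migrations: the constant names, the index name, the 'english' configuration, the title/body weighting, and the use of a generated column (PostgreSQL 12 or later; a trigger would be an alternative) are all hypothetical.

```ruby
# Hypothetical DDL for illustration; in a Rails migration each string
# would be passed to `execute`. Nothing here is copied from the PR.

# Stored generated column: PostgreSQL recomputes it whenever title or body
# changes. Titles get weight 'A' so they rank above body matches ('B').
ADD_SEARCHABLE_COLUMN = <<~SQL
  ALTER TABLE diary_entries
  ADD COLUMN searchable tsvector GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(body, '')), 'B')
  ) STORED
SQL

# GIN is the usual index type for tsvector columns. CONCURRENTLY avoids
# blocking writes during the build but cannot run inside a transaction,
# so the Rails migration would need `disable_ddl_transaction!`.
ADD_SEARCHABLE_INDEX = <<~SQL
  CREATE INDEX CONCURRENTLY index_diary_entries_on_searchable
  ON diary_entries USING gin (searchable)
SQL

# Order matters: the column has to exist before it can be indexed.
MIGRATION_STATEMENTS = [ADD_SEARCHABLE_COLUMN, ADD_SEARCHABLE_INDEX].freeze

MIGRATION_STATEMENTS.each { |sql| puts sql }
```

On the model side, pg_search can be pointed at such a column through the tsvector_column option of its tsearch feature, so queries use the precomputed data instead of calling to_tsvector on every row.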

Work in Progress

This is a draft PR. I want to confirm whether I'm heading in the right direction. If the approach is approved, I'll add more tests to improve coverage and handle edge cases. Any comments and recommendations are welcome.

kcne, Sep 03 '24 18:09