pg_search icon indicating copy to clipboard operation
pg_search copied to clipboard

associated_against with tsvector_column

Open cireficc opened this issue 7 years ago • 8 comments

I have tried every which way and cannot get this pg_search scope to work. The following does work:

Translation

class Translation < ActiveRecord::Base
  belongs_to  :source_phrase,
              class_name: "Phrase",
              foreign_key: "source_id"
  belongs_to  :destination_phrase,
              class_name: "Phrase",
              foreign_key: "destination_id"

  include PgSearch
  pg_search_scope :search_full_text, :associated_against =>
      {:source_phrase => :text}
end

Phrase

class Phrase < ActiveRecord::Base
  
  self.table_name = "new_phrases"

  has_many    :translations,
              class_name: "Translation",
              foreign_key: "source_id"
  has_many    :source_phrases,
              through: :translations
  has_many    :inverse_translations,
              class_name: "Translation",
              foreign_key: "destination_id"
  has_many    :destination_phrases,
              through: :inverse_translations
end

In the above I am not trying to use a ts_vector column on the associated table - I am just using the text field (which is obviously incredibly slow). If I try the following (in translation.rb):

pg_search_scope :search_full_text,
                  :associated_against => {
                      :source_phrase => :text,
                      :using => {
                          :tsearch => {
                              :tsvector_column => "tsv_text"
                           }
                       }
                  }

I get the error: NoMethodError: undefined method 'table_name' for nil:NilClass. Well that makes sense because using isn't part of Association. If I change it up slightly:

pg_search_scope :search_full_text,
                  :associated_against => {
                      :source_phrase => :text
                  },
                  :using => {
                      :tsearch => {
                          :tsvector_column => "tsv_text"
                      }
                  }

I get the error column translations.tsv_text does not exist, which makes sense, because the column I want is phrases.tsv_text. I can't specify the table name in the tsearch option, so I don't know what to do.

How do I use the :tsvector_column option for an associated table? I.e. I want to query phrases.tsv_text. Is this even possible with the current state of the library?

cireficc avatar May 20 '17 00:05 cireficc

This is currently a missing feature. The associated_against option unfortunately doesn't yet do anything special for a tsvector column. Presumably it would be possible to detect that the joined column is a tsvector and join the multiple rows together in a different way from how the current string aggregation works.

nertzy avatar Aug 01 '17 20:08 nertzy

@nertzy Any more thoughts about this? I would like to improve performance when searching against other models (also adding tsv_extended column so I can give users the option of whether they want simple or extended search), as currently it's taking around 800-1000ms to complete a search against related models. A SQL-only solution would be fine with me, but tsvectors are definitely not my strong suit.

arcreative avatar Apr 21 '19 16:04 arcreative

And to clarify, by SQL-only, I mean we can drop the against option and just have the SQL trigger do the heavy lifting.

arcreative avatar Apr 21 '19 16:04 arcreative

I did end up going this route and just using against: [] in the search scope. Works really well in my case. In case others don't want to bang their head against a wall trying to figure out how to craft the trigger, this is what ended up working for me:

CREATE OR REPLACE FUNCTION update_users_tsv() RETURNS trigger AS $$
BEGIN
  NEW.tsv := (
    SELECT 
      setweight(to_tsvector('simple', CONCAT(NEW.first_name, ' ', NEW.last_name)), 'A') || 
      setweight(to_tsvector('english', array_to_string(array_agg(DISTINCT skills.name), ' ')), 'B') ||
      setweight(to_tsvector('simple',  array_to_string(array_agg(DISTINCT cv_entries.client_name), ' ')), 'B') ||
      setweight(to_tsvector('english', array_to_string(array_agg(DISTINCT CONCAT(cv_entries.short_description, ' ', cv_entries.description)), ' ')), 'C')
    FROM users u
    LEFT JOIN skill_assignments sa ON sa.skillable_id = u.id AND sa.skillable_type = 'User'
    LEFT JOIN skills ON skills.id = sa.skill_id
    LEFT JOIN cv_entries ON cv_entries.user_id = u.id
    WHERE u.id = NEW.id
    GROUP BY u.id
  );
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON users FOR EACH ROW EXECUTE PROCEDURE update_users_tsv();

Then the old UPDATE users SET id = id to update them all the first time.

arcreative avatar Apr 22 '19 19:04 arcreative

Did you tried to use merge query ?

# translation.rb

scope :search_full_text, -> (text) {
	joins(:source_phrase).merge(Phrase.search_full_text(text))
}
# phrase.rb

pg_search_scope :search_full_text,
   against: [:text],
   using: {
      tsearch: {
	tsvector_column: 'tsv_text'
      }
   }

vpiau avatar Aug 03 '20 12:08 vpiau

Would it be possible to update the readme with instructions on how to create a trigger to index associated records?

jmarsh24 avatar Aug 21 '22 03:08 jmarsh24

You can use my branch #504; we are running that branch in production while I wait for feedback on the PR.

mhenrixon avatar Aug 25 '23 09:08 mhenrixon

@mhenrixon Thanks, I ended up writing my own gem: https://rubygems.org/gems/pg_fulltext. It supports native websearch_to_tsquery, so it actually does a lot of things that this gem doesn't, and I've opted to manage associations via the record's tsv column for efficiency.

arcreative avatar Sep 18 '23 17:09 arcreative