pg_search
pg_search copied to clipboard
associated_against with tsvector_column
I have tried every which way and cannot get this pg_search scope to work. The following does work:
Translation
class Translation < ActiveRecord::Base
belongs_to :source_phrase,
class_name: "Phrase",
foreign_key: "source_id"
belongs_to :destination_phrase,
class_name: "Phrase",
foreign_key: "destination_id"
include PgSearch
pg_search_scope :search_full_text, :associated_against =>
{:source_phrase => :text}
end
Phrase
class Phrase < ActiveRecord::Base
self.table_name = "new_phrases"
has_many :translations,
class_name: "Translation",
foreign_key: "source_id"
has_many :source_phrases,
through: :translations
has_many :inverse_translations,
class_name: "Translation",
foreign_key: "destination_id"
has_many :destination_phrases,
through: :inverse_translations
end
In the above I am not trying to use a ts_vector column on the associated table - I am just using the text field (which is obviously incredibly slow). If I try the following (in translation.rb):
pg_search_scope :search_full_text,
:associated_against => {
:source_phrase => :text,
:using => {
:tsearch => {
:tsvector_column => "tsv_text"
}
}
}
I get the error: NoMethodError: undefined method 'table_name' for nil:NilClass
. Well that makes sense because using
isn't part of Association
. If I change it up slightly:
pg_search_scope :search_full_text,
:associated_against => {
:source_phrase => :text
},
:using => {
:tsearch => {
:tsvector_column => "tsv_text"
}
}
I get the error column translations.tsv_text does not exist
, which makes sense, because the column I want is phrases.tsv_text
. I can't specify the table name in the tsearch
option, so I don't know what to do.
How do I use the :tsvector_column
option for an associated table? I.e. I want to query phrases.tsv_text
. Is this even possible with the current state of the library?
This is currently a missing feature. The associated_against
option unfortunately doesn't yet do anything special for a tsvector column. Presumably it would be possible to detect that the joined column is a tsvector and join the multiple rows together in a different way from how the current string aggregation works.
@nertzy Any more thoughts about this? I would like to improve performance when searching against other models (also adding tsv_extended
column so I can give users the option of whether they want simple or extended search), as currently it's taking around 800-1000ms to complete a search against related models. A SQL-only solution would be fine with me, but tsvectors are definitely not my strong suit.
And to clarify, by SQL-only, I mean we can drop the against
option and just have the SQL trigger do the heavy lifting.
I did end up going this route and just using against: []
in the search scope. Works really well in my case. In case others don't want to bang their head against a wall trying to figure out how to craft the trigger, this is what ended up working for me:
CREATE OR REPLACE FUNCTION update_users_tsv() RETURNS trigger AS $$
BEGIN
NEW.tsv := (
SELECT
setweight(to_tsvector('simple', CONCAT(NEW.first_name, ' ', NEW.last_name)), 'A') ||
setweight(to_tsvector('english', array_to_string(array_agg(DISTINCT skills.name), ' ')), 'B') ||
setweight(to_tsvector('simple', array_to_string(array_agg(DISTINCT cv_entries.client_name), ' ')), 'B') ||
setweight(to_tsvector('english', array_to_string(array_agg(DISTINCT CONCAT(cv_entries.short_description, ' ', cv_entries.description)), ' ')), 'C')
FROM users u
LEFT JOIN skill_assignments sa ON sa.skillable_id = u.id AND sa.skillable_type = 'User'
LEFT JOIN skills ON skills.id = sa.skill_id
LEFT JOIN cv_entries ON cv_entries.user_id = u.id
WHERE u.id = NEW.id
GROUP BY u.id
);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON users FOR EACH ROW EXECUTE PROCEDURE update_users_tsv();
Then the old UPDATE users SET id = id
to update them all the first time.
Did you tried to use merge query ?
# translation.rb
scope :search_full_text, -> (text) {
joins(:source_phrase).merge(Phrase.search_full_text(text))
}
# phrase.rb
pg_search_scope :search_full_text,
against: [:text],
using: {
tsearch: {
tsvector_column: 'tsv_text'
}
}
Would it be possible to update the readme with instructions on how to create a trigger to index associated records?
You can use my branch #504; we are running that branch in production while I wait for feedback on the PR.
@mhenrixon Thanks, I ended up writing my own gem: https://rubygems.org/gems/pg_fulltext. It supports native websearch_to_tsquery, so it actually does a lot of things that this gem doesn't, and I've opted to manage associations via the record's tsv column for efficiency.