
RUM Index slow while referencing JSONB column

Open ngigiwaithaka opened this issue 5 years ago • 0 comments

Suppose you have this table:

> CREATE TABLE news_items (id int, object_data JSONB, weighted_tsv tsvector, time_created_as_from_server bigint);

Also suppose that the object_data JSONB column contains the fields title, article and author, plus 10 other fields.

I have created the RUM index news_item_rum_english_weighted_tsv_idx as follows:

> CREATE INDEX news_item_rum_english_weighted_tsv_idx ON news_items
>   USING rum (weighted_tsv rum_tsvector_addon_ops, time_created_as_from_server)
>     WITH (attach = 'time_created_as_from_server', to = 'weighted_tsv');

When I select the whole object_data column and order by rank as follows:

> EXPLAIN ANALYZE
> select object_data,  weighted_tsv <=> to_tsquery('English', 'Uhuru') rank            
> from news_items 
> where
> weighted_tsv @@ websearch_to_tsquery('English', 'Uhuru')
> ORDER BY rank
> limit 10;

The query executes in 418 ms.

Screenshot from 2019-05-23 17-32-00

When I query while referencing object_data->'title' and order by rank as follows:

> EXPLAIN ANALYZE
> select object_data->'title',  weighted_tsv <=> to_tsquery('English', 'Uhuru') rank            
> from news_items 
> where
> weighted_tsv @@ websearch_to_tsquery('English', 'Uhuru')
> ORDER BY rank
> limit 10;

Screenshot from 2019-05-23 17-31-27

The query executes in 1.7 s.

That is roughly 4x slower for an otherwise identical query, and the gap is consistent across different search terms.
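One rewrite that may be worth measuring here is sketched below. It assumes the extra time goes into extracting `title` from `object_data` for every matching row before the sort, rather than only for the 10 rows that are returned; that assumption is not confirmed by the plans above. The alias `top_hits` is hypothetical.

```sql
-- Sketch only: rank and limit first, then extract the title
-- from the 10 surviving rows.
-- A subquery containing LIMIT is not flattened into the outer query,
-- so the -> operator runs on the subquery's output rows only.
EXPLAIN ANALYZE
SELECT top_hits.object_data->'title' AS title,
       top_hits.rank
FROM (
    SELECT object_data,
           weighted_tsv <=> to_tsquery('English', 'Uhuru') AS rank
    FROM news_items
    WHERE weighted_tsv @@ websearch_to_tsquery('English', 'Uhuru')
    ORDER BY rank
    LIMIT 10
) AS top_hits
ORDER BY top_hits.rank;
```

If the assumption holds, the timing should land close to the query that selects the whole `object_data` column. If it does not change, the cost lies elsewhere.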

Executing the same queries without the ORDER BY is much faster, at around 3 ms, and there is little difference between the two.

Screenshot from 2019-05-23 17-41-55

Without the ORDER BY, the planner uses an Index Scan, whereas it uses a Bitmap Scan when ordering by rank. How could I structure the query so that it always uses the much faster Index Scan?
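A diagnostic that could narrow this down is sketched below. It discourages bitmap scans for a single transaction and prints the plan as text. `enable_bitmapscan` is a standard PostgreSQL planner setting; whether the RUM index as defined above can return rows in rank order is exactly what the output would show, and is not something the plans above establish.

```sql
BEGIN;

-- SET LOCAL confines the change to this transaction.
-- The setting penalizes bitmap scans heavily; it does not forbid them.
SET LOCAL enable_bitmapscan = off;

-- BUFFERS adds page hit/read counts to the timing output.
EXPLAIN (ANALYZE, BUFFERS)
SELECT object_data->'title',
       weighted_tsv <=> to_tsquery('English', 'Uhuru') AS rank
FROM news_items
WHERE weighted_tsv @@ websearch_to_tsquery('English', 'Uhuru')
ORDER BY rank
LIMIT 10;

ROLLBACK;
```

If the resulting plan is an Index Scan with no separate Sort node, the index can deliver rank order and the planner's cost estimate is what steers it to the Bitmap Scan. If a Sort node remains, the index is probably not being used for rank ordering at all. The RUM README demonstrates rank ordering with the `rum_tsvector_ops` operator class, so a second index using that class may be worth testing, although that is untested for this table.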

Regards.

ngigiwaithaka, May 23 '19 14:05