rum
rum copied to clipboard
RUM Index WITH rum_tsvector_addon_ops does not work for jsonb columns
Suppose you have this table
CREATE TABLE articles (id int, object_data JSONB);
If object_data has two fields (article & timeCreatedAsFromServer)
The following returns the error ERROR: attribute "((object_data -> 'timeCreatedAsFromServer')::bigint)" is not found in table
CREATE INDEX article_rum_idx ON articles
USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops, ((object_data -> 'timeCreatedAsFromServer')::bigint))
WITH (attach = '((object_data -> ''timeCreatedAsFromServer'')::bigint)', to = 'to_tsvector(''English'', object_data->>''article'')')
If you omit the WITH clause, it works as below!
CREATE INDEX article_rum_idx ON articles
USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops, ((object_data -> 'timeCreatedAsFromServer')::bigint))
Sorry, on running this
CREATE INDEX article_rum_idx ON articles
USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops, ((object_data -> 'timeCreatedAsFromServer')::bigint))
You get...
ERROR: additional information attribute "to_tsvector" is not found in index
Hello @ngigiwaithaka ,
Thank you for the issue! Unfortunately it isn't possible to attach exression's result as a additional information nowadays.
attach
and to
options in WITH
expression requires column names of a tables to be passed. That is why you get ERROR: attribute "((object_data -> 'timeCreatedAsFromServer')::bigint)" is not found in table.
Also it is necessary to pass attach
and to
options if you use rum_tsvector_addon_ops
opclass. If you don't pass it you get additional information attribute "to_tsvector" is not found in index (your second case).
@za-arthur Thanks for prompt reply.
Any plans to have this feature included anytime soon?
There was no such plan until today :) It depends not only from me. We will discuss implementation of this feature.
:-) my bad....
I have also posting something else I have come across, hope they wouldn't be too many for you today... :-)
Of course not! You are welcome to create issues.
This problem occurs not only with jsonb, but also in a simpler case - if you try to index not by the [additional] tsvector field, but by the text field, which is sad, since you actually have to duplicate a huge field, which doubles the size of an already rather big table. And the index for this redundant tsvector field is almost a threefold increase in the table as a result
Эта проблема встречается не только с jsonb, но и в более простом случае - если пытаться индексировать не по [дополнительному] полю tsvector, а по текстовому полю, что печально, так как приходится фактически дублировать огромное поле, что увеличивает размер и так немаленькой таблицы вдвое. А ещё индекс по этому избыточному полю tsvector - почти трёхкратное увеличение таблицы в итоге
create index on ... using rum (to_tsvector('lang', text_column) rum_tsvector_addon_ops, ts)
with (attach = 'ts', to = 'to_tsvector(''lang'', text_column)');