rum icon indicating copy to clipboard operation
rum copied to clipboard

RUM Index WITH rum_tsvector_addon_ops does not work for jsonb columns

Open ngigiwaithaka opened this issue 5 years ago • 7 comments

Suppose you have this table

CREATE TABLE articles (id int, object_data JSONB);

If object_data has two fields (article & timeCreatedAsFromServer)

The following returns the error ERROR: attribute "((object_data -> 'timeCreatedAsFromServer')::bigint)" is not found in table

CREATE INDEX article_rum_idx ON articles
  USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops,	 ((object_data -> 'timeCreatedAsFromServer')::bigint))
  WITH (attach = '((object_data -> ''timeCreatedAsFromServer'')::bigint)', to = 'to_tsvector(''English'', object_data->>''article'')')

If you omit the WITH clause, it works as below!

CREATE INDEX article_rum_idx ON articles
  USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops,	 ((object_data -> 'timeCreatedAsFromServer')::bigint))

ngigiwaithaka avatar May 22 '19 12:05 ngigiwaithaka

Sorry, on running this

CREATE INDEX article_rum_idx ON articles
  USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops, ((object_data -> 'timeCreatedAsFromServer')::bigint))

You get...

ERROR: additional information attribute "to_tsvector" is not found in index

ngigiwaithaka avatar May 22 '19 12:05 ngigiwaithaka

Hello @ngigiwaithaka ,

Thank you for the issue! Unfortunately it isn't possible to attach exression's result as a additional information nowadays.

attach and to options in WITH expression requires column names of a tables to be passed. That is why you get ERROR: attribute "((object_data -> 'timeCreatedAsFromServer')::bigint)" is not found in table.

Also it is necessary to pass attach and to options if you use rum_tsvector_addon_ops opclass. If you don't pass it you get additional information attribute "to_tsvector" is not found in index (your second case).

za-arthur avatar May 23 '19 11:05 za-arthur

@za-arthur Thanks for prompt reply.

Any plans to have this feature included anytime soon?

ngigiwaithaka avatar May 23 '19 11:05 ngigiwaithaka

There was no such plan until today :) It depends not only from me. We will discuss implementation of this feature.

za-arthur avatar May 23 '19 11:05 za-arthur

:-) my bad....

I have also posting something else I have come across, hope they wouldn't be too many for you today... :-)

ngigiwaithaka avatar May 23 '19 14:05 ngigiwaithaka

Of course not! You are welcome to create issues.

za-arthur avatar May 23 '19 14:05 za-arthur

This problem occurs not only with jsonb, but also in a simpler case - if you try to index not by the [additional] tsvector field, but by the text field, which is sad, since you actually have to duplicate a huge field, which doubles the size of an already rather big table. And the index for this redundant tsvector field is almost a threefold increase in the table as a result


Эта проблема встречается не только с jsonb, но и в более простом случае - если пытаться индексировать не по [дополнительному] полю tsvector, а по текстовому полю, что печально, так как приходится фактически дублировать огромное поле, что увеличивает размер и так немаленькой таблицы вдвое. А ещё индекс по этому избыточному полю tsvector - почти трёхкратное увеличение таблицы в итоге

create index on ... using rum (to_tsvector('lang', text_column) rum_tsvector_addon_ops, ts)
  with (attach = 'ts', to = 'to_tsvector(''lang'', text_column)');

zilzila avatar Oct 16 '23 08:10 zilzila