postgrest icon indicating copy to clipboard operation
postgrest copied to clipboard

Combine array contains with pattern matching

Open ewan-escience opened this issue 9 months ago • 8 comments

Problem

A table could have a column that is an array, e.g.

CREATE TABLE my_table (
	some_array VARCHAR[]
);

To find all rows that contain the string myString in this array, I could call the endpoint /my_table?some_array=cs.{myString}.

However, I need to specify the string myString exactly, i.e. with the correct capitalisation and without any wildcards.

Solution

It would be helpful if I could do an ILIKE on the array, which could look like

/my_table?some_array=any.ilike.*ystr*

I am not aware of an existing solution like this yet.

ewan-escience avatar Feb 24 '25 11:02 ewan-escience

How would the SQL query for that look like?

wolfgangwalther avatar Feb 24 '25 11:02 wolfgangwalther

I don't think that could be a one-liner, due to the order of pattern and text in ILIKE and the order of expression and array for ANY.

I would hope for PostgREST to have a function under the hood that does this in a for loop or something comparable, and that the syntax I proposed above would invoke that function.

ewan-escience avatar Feb 24 '25 13:02 ewan-escience

How would the SQL query for that look like?

SELECT * FROM my_table t WHERE EXISTS (SELECT 1 FROM unnest(t.some_array) a(elem) WHERE elem ILIKE $1)

mkleczek avatar Feb 25 '25 19:02 mkleczek

/my_table?some_array=any.ilike.ystr

We already have any support but for the right value, not for the left value (the column):

/people?last_name=like(any).{O*,P*}" https://docs.postgrest.org/en/v12/references/api/tables_views.html#operator-modifiers

So I don't think we can use any or it would be confusing.

I'm thinking for this case we should add a new operator: cs_ilike ("contains ilike"). Also cs_like for completion.

SELECT * FROM my_table t WHERE EXISTS (SELECT 1 FROM unnest(t.some_array) a(elem) WHERE elem ILIKE $1)

Can an index help to speed up the above query?

Also are there other cases where unnesting the array is useful?

steve-chavez avatar Apr 09 '25 02:04 steve-chavez

SELECT * FROM my_table t WHERE EXISTS (SELECT 1 FROM unnest(t.some_array) a(elem) WHERE elem ILIKE $1)

Can an index help to speed up the above query?

I don't think so, I'm afraid.

Also are there other cases where unnesting the array is useful?

TBH I only find arrays useful mainly (only?) as function arguments (and unnesting is very useful in these cases).

mkleczek avatar Apr 09 '25 06:04 mkleczek

I don't think so, I'm afraid.

Looks like the only way is to speed this up is to concat the array elements and use an index like:

CREATE INDEX idx_some_array_trgm 
  ON my_table 
  USING gin ( array_to_string(some_array, ' ') gin_trgm_ops );

Then the query we generate should be like:

SELECT *
FROM my_table
WHERE array_to_string(some_array, ' ') ILIKE $1;

I think this is better than to generate a query that cannot be optimized. But at the same time, it's "opinionated", so not sure if it's the best way.

steve-chavez avatar Apr 09 '25 21:04 steve-chavez

Once we're dealing with operators which can't be supported by index, we should consider that "custom operators". We discussed a lot about that in #2028. I don't think we should provide something built-in for this case.

As we're talking "custom solutions" now, this...

SELECT *
FROM my_table
WHERE array_to_string(some_array, ' ') ILIKE $1;

... could be possible right now with a computed field. It's actually very similar to our fulltext filtering example: https://docs.postgrest.org/en/v12/references/api/computed_fields.html#horizontal-filtering-on-computed-fields

wolfgangwalther avatar Apr 10 '25 13:04 wolfgangwalther

I don't think we should provide something built-in for this case.

Agree, I think a how-to with the computed field would be good for this case.

steve-chavez avatar Apr 10 '25 19:04 steve-chavez