oban
oban copied to clipboard
improve Oban Web full text search speed
Environment
- Oban.Pro v0.9.2
- Oban.Web v2.7.4
- PostgreSQL 13
- Elixir 1.13 and OTP 24
Current Behavior
With a large number of jobs (in our case 400k) we encounter the following query to have 10+ seconds execution time:
SELECT
o0."id",
o0."state",
o0."queue",
o0."worker",
o0."args",
o0."meta",
o0."tags",
o0."errors",
o0."attempt",
o0."attempted_by",
o0."max_attempts",
o0."priority",
o0."attempted_at",
o0."cancelled_at",
o0."completed_at",
o0."discarded_at",
o0."inserted_at",
o0."scheduled_at"
FROM
"public"."oban_jobs" AS o0
WHERE
(o0."state" = $1)
AND ($2
AND (((($3
OR jsonb_to_tsvector(o0."args",
$9) @@ websearch_to_tsquery($4) )
OR jsonb_to_tsvector(o0."meta",
$10) @@ websearch_to_tsquery($5) )
OR array_to_tsvector(o0."tags") @@ websearch_to_tsquery($6) )
OR (o0."worker" ILIKE $7)))
ORDER BY
o0."attempted_at" DESC
LIMIT
$8
Expected Behavior
This query should not take this long with appropriate indices.
The above query (I assume, I don't have logs to what $9 an d$10 above actually are) boils down to:
SELECT *
FROM OBAN_JOBS O0
WHERE (
JSONB_TO_TSVECTOR(O0."args", '["string", "numeric"]') @@ WEBSEARCH_TO_TSQUERY('16372')
OR JSONB_TO_TSVECTOR(O0."meta", '["string", "numeric"]') @@ WEBSEARCH_TO_TSQUERY('16372')
OR ARRAY_TO_TSVECTOR(O0."tags") @@ WEBSEARCH_TO_TSQUERY('16372')
);
In this example we are searching for '16372' and this simplified query currently takes about 8 seconds.
If I add
CREATE INDEX i0 ON OBAN_JOBS USING GIN (JSONB_TO_TSVECTOR('english', "args", '["string", "numeric"]'));
CREATE INDEX i1 ON OBAN_JOBS USING GIN (JSONB_TO_TSVECTOR('english', "meta", '["string", "numeric"]'));
CREATE INDEX i2 ON OBAN_JOBS USING GIN (ARRAY_TO_TSVECTOR("tags"));
and run the same query but with 'english' as regconfig (you have to specify this otherwise the expression is not immutable, which is required for indices),
SELECT *
FROM OBAN_JOBS O0
WHERE (
JSONB_TO_TSVECTOR('english', O0."args", '["string", "numeric"]') @@ WEBSEARCH_TO_TSQUERY('16372')
OR JSONB_TO_TSVECTOR('english', O0."meta", '["string", "numeric"]') @@ WEBSEARCH_TO_TSQUERY('16372')
OR ARRAY_TO_TSVECTOR(O0."tags") @@ WEBSEARCH_TO_TSQUERY('16372')
);
the query completes in 115ms, almost a 70x improvement 🙂
@dvic Great suggestion! Improving Web search speed out of the box would be fantastic, but I can't force that many extra indexes on apps by default.
What do you think about a guide on when and how to add the indexes? Perhaps a hint of some kind in Web when it detects a slow full-text search?
@dvic Great suggestion! Improving Web search speed out of the box would be fantastic, but I can't force that many extra indexes on apps by default.
What do you think about a guide on when and how to add the indexes?
A guide sounds like a good idea. The only problem is that you would still have to change all JSONB_TO_TSVECTOR
expressions to JSONB_TO_TSVECTOR('english', ...)
, otherwise we can't make an index for it. Not sure if it makes sense here, but since this is already passed as a positional argument, you could make this parameter configurable.
Perhaps a hint of some kind in Web when it detects a slow full-text search?
This would definitely be a good feature, in the beginning it could just show the execution time of the search somewhere next to the search input. Or only show it if it gets above a certain threshold? With a link to the docs how to improve the speed?