posthog
posthog copied to clipboard
Groups query in decide endpoint is slow (and possibly incorrect)
In what situation are you experiencing subpar performance?
Decide endpoint queries groups for a user. This query is slow according to pganalyze:
https://app.pganalyze.com/databases/-675880137/queries/5344885692?t=24h
Example SQL
SELECT
(
("posthog_group"."group_properties" -> $ 1) IN (?)
AND "posthog_group"."group_key" IS NULL
AND "posthog_group"."group_type_index" = $ 3
) AS "flag_2434_condition_0",
(
("posthog_group"."group_properties" -> $ 4) IN (?)
AND "posthog_group"."group_key" IS NULL
AND "posthog_group"."group_type_index" = $ 6
) AS "flag_2434_condition_1",
(
"posthog_group"."group_key" IS NULL
AND "posthog_group"."group_type_index" = $ 7
) AS "flag_2434_condition_2",
(
"posthog_group"."group_key" IS NULL
AND "posthog_group"."group_type_index" = $ 8
) AS "flag_2252_condition_0",
(
("posthog_group"."group_properties" -> $ 9) IN (?)
AND "posthog_group"."group_key" IS NULL
AND "posthog_group"."group_type_index" = $ 11
) AS "flag_2252_condition_1",
(
"posthog_group"."group_key" IS NULL
AND "posthog_group"."group_type_index" = $ 12
) AS "flag_2251_condition_0"
FROM
"posthog_group"
WHERE
"posthog_group"."team_id" = $ 13
We're selecting all groups for a given team here and checking the condition for each. We should also be likely filtering on the groups query.
This in turn means we're likely returning incorrect results from the endpoint in addition to it being slow
Environment
- [x] PostHog Cloud
- [ ] self-hosted PostHog, version/commit: please provide
Additional context
cc @timgl who did the change in https://github.com/PostHog/posthog/pull/10804, @neilkakkar as feature owner
Thank you for your performance issue report – we want PostHog to go supersonic!
We should also be likely filtering on the groups query.
What does this mean?
The query should have a WHERE clause with a bunch of ORs for different filters - it does not right now. In essence, I think only the first group for a team is being checked.
I see what you mean, yes, this is wrong! Also seems to be doing a sequence scan vs using the index, which is the main slowness cause. Will fix, thanks :)