citus icon indicating copy to clipboard operation
citus copied to clipboard

Query broken after PG15.4 and PG14.9 minor upgrade

Open naisila opened this issue 2 years ago • 4 comments

https://app.circleci.com/jobs/github/citusdata/citus/1208304

SELECT user_id, array_length(events_table, 1)
FROM (
  SELECT user_id, array_agg(event ORDER BY time) AS events_table
  FROM (
    SELECT
    	u.user_id, e.event_type::text AS event, e.time
    FROM
    	users_table AS u,
        events_table AS e
    WHERE u.user_id = e.user_id AND
    		u.user_id IN
    		(
    			SELECT
    				user_id
    			FROM
    				users_table
    			WHERE value_2 >= 5
			    AND  EXISTS (SELECT user_id FROM events_table_local WHERE event_type > 1 AND event_type <= 3 AND value_3 > 1)
				AND  NOT EXISTS (SELECT user_id FROM events_table WHERE event_type > 3 AND event_type <= 4  AND value_3 > 1 AND user_id = users_table.user_id)
				LIMIT 5
    		)
  ) t
  GROUP BY user_id
) q
ORDER BY 2 DESC, 1;
DEBUG:  generating subplan XXX_1 for subquery SELECT user_id FROM subquery_local_tables.events_table_local WHERE ((event_type OPERATOR(pg_catalog.>) 1) AND (event_type OPERATOR(pg_catalog.<=) 3) AND (value_3 OPERATOR(pg_catalog.>) (1)::double precision))
-DEBUG:  push down of limit count: 5
-DEBUG:  generating subplan XXX_2 for subquery SELECT user_id FROM public.users_table WHERE ((value_2 OPERATOR(pg_catalog.>=) 5) AND (EXISTS (SELECT intermediate_result.user_id FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer))) AND (NOT (EXISTS (SELECT events_table.user_id FROM public.events_table WHERE ((events_table.event_type OPERATOR(pg_catalog.>) 3) AND (events_table.event_type OPERATOR(pg_catalog.<=) 4) AND (events_table.value_3 OPERATOR(pg_catalog.>) (1)::double precision) AND (events_table.user_id OPERATOR(pg_catalog.=) users_table.user_id)))))) LIMIT 5
-DEBUG:  Plan XXX query after replacing subqueries and CTEs: SELECT user_id, array_length(events_table, 1) AS array_length FROM (SELECT t.user_id, array_agg(t.event ORDER BY t."time") AS events_table FROM (SELECT u.user_id, (e.event_type)::text AS event, e."time" FROM public.users_table u, public.events_table e WHERE ((u.user_id OPERATOR(pg_catalog.=) e.user_id) AND (u.user_id OPERATOR(pg_catalog.=) ANY (SELECT intermediate_result.user_id FROM read_intermediate_result('XXX_2'::text, 'binary'::citus_copy_format) intermediate_result(user_id integer))))) t GROUP BY t.user_id) q ORDER BY (array_length(events_table, 1)) DESC, user_id
- user_id | array_length
----------------------------------------------------------------------
-       5 |          364
-(1 row)
-
+ERROR:  complex joins are only supported when all distributed tables are co-located and joined on their distribution columns

naisila avatar Aug 15 '23 11:08 naisila

broken by https://github.com/postgres/postgres/commit/b0e390e6d1d68b92e9983840941f8f6d9e083fe0

onderkalaci avatar Aug 15 '23 12:08 onderkalaci

@naisila or @onderkalaci can one of you report this regression an the mail thread related to that commit? https://www.postgresql.org/message-id/flat/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com

It seems they decided to not fix it because of ABI breakage, but there's still behavioural breakage because of it.

JelteF avatar Aug 15 '23 12:08 JelteF

@naisila or @onderkalaci can one of you report this regression an the mail thread related to that commit? https://www.postgresql.org/message-id/flat/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com

It seems they decided to not fix it because of ABI breakage, but there's still behavioural breakage because of it.

sure, trying to understand a bit better, then I'll send the mail.

onderkalaci avatar Aug 15 '23 13:08 onderkalaci

Hopefully this issue will be resolved in PG17 Re-allow FDWs and custom scan providers to replace joins with pseudoconstant quals

naisila avatar Sep 14 '23 13:09 naisila