Query with window function, hstore condition and outer join causes segmentation fault
Running the following SQL query always causes SEGFAULT in Citus 13.1.0/13.1.1 with PG17.5. Tthis issue seems similar to #7705, but SEGFAULT with both of HSTORE condition and OUTER JOIN.
Testing environment:
- Rocky Linux 9.5
- Postgres 17.5 (src)
- citus 13.1.0, 13.1.1
- 1 coordinator, 3 workers
Reproduce queries:
CREATE SCHEMA segfault_reproduce;
SET search_path TO segfault_reproduce,"$user",public;
DROP TABLE IF EXISTS segfault_reproduce.users CASCADE;
CREATE TABLE segfault_reproduce.users (
tenant_id VARCHAR(10) NOT NULL,
session_id TEXT NOT NULL,
customer_id TEXT,
personal HSTORE
);
SELECT create_distributed_table('users', 'session_id');
SELECT truncate_local_data_after_distributing_table('users');
DROP TABLE IF EXISTS segfault_reproduce.user_sessions CASCADE;
CREATE TABLE segfault_reproduce.user_sessions (
tenant_id VARCHAR(10) NOT NULL,
ts TIMESTAMP NOT NULL,
session_id TEXT NOT NULL
);
SELECT create_distributed_table('user_sessions', 'session_id');
SELECT truncate_local_data_after_distributing_table('user_sessions');
--
-- With the window function and no hstore condition and a subquery including inner join
-- OK
--
SELECT
userlist.tenant_id,
COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS "all_count"
FROM
users userlist
INNER JOIN
(
SELECT
user_sessions.TENANT_ID,
user_sessions.SESSION_ID
FROM
user_sessions
INNER JOIN
users
ON (user_sessions.TENANT_ID=users.TENANT_ID
AND user_sessions.SESSION_ID=users.SESSION_ID)) search_subquery
ON (userlist.tenant_id=search_subquery.tenant_id
AND search_subquery.session_id=userlist.session_id)
WHERE userlist.tenant_id='1234567890'
GROUP BY
userlist.tenant_id,
userlist.customer_id;
--
-- With the window function and no hstore condition and a subquery including outer join
-- OK
--
SELECT
userlist.tenant_id,
COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS "all_count"
FROM
users userlist
INNER JOIN
(
SELECT
user_sessions.TENANT_ID,
user_sessions.SESSION_ID
FROM
user_sessions
LEFT OUTER JOIN
users
ON (user_sessions.TENANT_ID=users.TENANT_ID
AND user_sessions.SESSION_ID=users.SESSION_ID)) search_subquery
ON (userlist.tenant_id=search_subquery.tenant_id
AND search_subquery.session_id=userlist.session_id)
WHERE userlist.tenant_id='1234567890'
GROUP BY
userlist.tenant_id,
userlist.customer_id;
--
-- With the window function and a hstore condition and a subquery including inner join
-- OK
--
SELECT
userlist.tenant_id,
COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS "all_count"
FROM
users userlist
INNER JOIN
(
SELECT
user_sessions.TENANT_ID,
user_sessions.SESSION_ID
FROM
user_sessions
INNER JOIN
users
ON (user_sessions.TENANT_ID=users.TENANT_ID
AND user_sessions.SESSION_ID=users.SESSION_ID)
WHERE users.PERSONAL->'foo'='bar') search_subquery
ON (userlist.tenant_id=search_subquery.tenant_id
AND search_subquery.session_id=userlist.session_id)
WHERE userlist.tenant_id='1234567890'
GROUP BY
userlist.tenant_id,
userlist.customer_id;
--
-- With no window function and a hstore condition and a subquery including outer join
-- OK
--
SELECT
userlist.tenant_id,
COUNT(userlist.customer_id) AS "all_count"
FROM
users userlist
INNER JOIN
(
SELECT
user_sessions.TENANT_ID,
user_sessions.SESSION_ID
FROM
user_sessions
LEFT OUTER JOIN
users
ON (user_sessions.TENANT_ID=users.TENANT_ID
AND user_sessions.SESSION_ID=users.SESSION_ID)
WHERE users.PERSONAL->'foo'='bar') search_subquery
ON (userlist.tenant_id=search_subquery.tenant_id
AND search_subquery.session_id=userlist.session_id)
WHERE userlist.tenant_id='1234567890'
GROUP BY
userlist.tenant_id,
userlist.customer_id;
--
-- With the window function and a hstore condition and a subquery including outer join
-- Segmentation Fault
--
SELECT
userlist.tenant_id,
COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS "all_count"
FROM
users userlist
INNER JOIN
(
SELECT
user_sessions.TENANT_ID,
user_sessions.SESSION_ID
FROM
user_sessions
LEFT OUTER JOIN
users
ON (user_sessions.TENANT_ID=users.TENANT_ID
AND user_sessions.SESSION_ID=users.SESSION_ID)
WHERE users.PERSONAL->'foo'='bar') search_subquery
ON (userlist.tenant_id=search_subquery.tenant_id
AND search_subquery.session_id=userlist.session_id)
WHERE userlist.tenant_id='1234567890'
GROUP BY
userlist.tenant_id,
userlist.customer_id;
DROP SCHEMA IF EXISTS segfault_reproduce CASCADE;
Logs:
2025-11-07 14:10:09 JST [633279]: [181-1] user=,db=,app=,client= LOG: server process (PID 571678) was terminated by signal 11: Segmentation fault
2025-11-07 14:10:09 JST [633279]: [182-1] user=,db=,app=,client= DETAIL: Failed process was running: SELECT
userlist.tenant_id,
COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS "all_count"
FROM
users userlist
INNER JOIN
(
SELECT
user_sessions.TENANT_ID,
user_sessions.SESSION_ID
FROM
user_sessions
LEFT OUTER JOIN
users
ON (user_sessions.TENANT_ID=users.TENANT_ID
AND user_sessions.SESSION_ID=users.SESSION_ID)
WHERE users.PERSONAL->'foo'='bar') search_subquery
ON (userlist.tenant_id=search_subquery.tenant_id
AND search_subquery.session_id=userlist.session_id)
WHERE userlist.tenant_id='1234567890'
GROUP BY
userlist.tenant_id,
userlist.customer_id;
2025-11-07 14:10:09 JST [633279]: [183-1] user=,db=,app=,client= LOG: terminating any other active server processes
2025-11-07 14:10:09 JST [572369]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=[local] LOG: connection received: host=[local]
2025-11-07 14:10:09 JST [572369]: [2-1] user=xdata,db=xdata_dev_db,app=[unknown],client=[local] FATAL: the database system is in recovery mode
2025-11-07 14:10:09 JST [633279]: [184-1] user=,db=,app=,client= LOG: all server processes terminated; reinitializing
Also explain command.
Logs:
2025-11-07 15:29:39 JST [608480]: [21-1] user=,db=,app=,client= LOG: server process (PID 623865) was terminated by signal 11: Segmentation fault
2025-11-07 15:29:39 JST [608480]: [22-1] user=,db=,app=,client= DETAIL: Failed process was running: explain SELECT
userlist.tenant_id,
COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS "all_count"
FROM
users userlist
INNER JOIN
(
SELECT
user_sessions.TENANT_ID,
user_sessions.SESSION_ID
FROM
user_sessions
LEFT OUTER JOIN
users
ON (user_sessions.TENANT_ID=users.TENANT_ID
AND user_sessions.SESSION_ID=users.SESSION_ID)
WHERE users.PERSONAL->'foo'='bar') search_subquery
ON (userlist.tenant_id=search_subquery.tenant_id
AND search_subquery.session_id=userlist.session_id)
WHERE userlist.tenant_id='1234567890'
GROUP BY
userlist.tenant_id,
userlist.customer_id;
2025-11-07 15:29:39 JST [608480]: [23-1] user=,db=,app=,client= LOG: terminating any other active server processes
2025-11-07 15:29:39 JST [623913]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=[local] LOG: connection received: host=[local]
2025-11-07 15:29:39 JST [623913]: [2-1] user=xdata,db=xdata_dev_db,app=[unknown],client=[local] FATAL: the database system is in recovery mode
2025-11-07 15:29:39 JST [608480]: [24-1] user=,db=,app=,client= LOG: all server processes terminated; reinitializing
Reproduced on 13.2.0 as well.
@xrossdata thanks for reporting. This looks like an execution-time problem, as the query runs ok on empty tables (below). Can you provide some INSERTs that help in reproducing ? We can try with synthetic data ourselves, but if you have some DML you can share it would be appreciated.
SELECT
userlist.tenant_id,
COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS "all_count"
FROM
users userlist
INNER JOIN
(
SELECT
user_sessions.TENANT_ID,
user_sessions.SESSION_ID
FROM
user_sessions
LEFT OUTER JOIN
users
ON (user_sessions.TENANT_ID=users.TENANT_ID
AND user_sessions.SESSION_ID=users.SESSION_ID)
WHERE users.PERSONAL->'foo'='bar') search_subquery
ON (userlist.tenant_id=search_subquery.tenant_id
AND search_subquery.session_id=userlist.session_id)
WHERE userlist.tenant_id='1234567890'
GROUP BY
userlist.tenant_id,
userlist.customer_id;
┌───────────┬───────────┐
│ tenant_id │ all_count │
├───────────┼───────────┤
└───────────┴───────────┘
Thanks for reply.
In our environment, SEGFAULT with or without data, so we don't believe it occurs with any particular data or combination of data.
I built PG17.5 in debug mode, re-build citus-13.2.0, and got the stack trace .
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
GroupTargetEntryList (groupClauseList=0x24ea850, targetEntryList=0x22e33a0) at planner/multi_logical_optimizer.c:4389
4389 TargetEntry *groupTargetEntry =
(gdb) bt
#0 GroupTargetEntryList (groupClauseList=0x24ea850, targetEntryList=0x22e33a0) at planner/multi_logical_optimizer.c:4389
#1 0x00007f938bd44413 in WindowPartitionOnDistributionColumn (query=<optimized out>) at planner/query_pushdown_planning.c:490
#2 SafeToPushdownWindowFunction (query=query@entry=0x23cf5e8, errorDetail=errorDetail@entry=0x0) at planner/query_pushdown_planning.c:452
#3 0x00007f938bd3855a in MultiExtendedOpNode (queryTree=queryTree@entry=0x23cf5e8, originalQuery=originalQuery@entry=0x23cf5e8) at planner/multi_logical_planner.c:1766
#4 0x00007f938bd389c9 in MultiNodeTree (queryTree=queryTree@entry=0x23cf5e8) at planner/multi_logical_planner.c:713
#5 0x00007f938bd38d98 in MultiLogicalPlanCreate (originalQuery=originalQuery@entry=0x24ebf18, queryTree=queryTree@entry=0x23cf5e8, plannerRestrictionContext=plannerRestrictionContext@entry=0x22e0888) at planner/multi_logical_planner.c:161
#6 0x00007f938bd25e3e in CreateDistributedPlan (planId=planId@entry=5, allowRecursivePlanning=allowRecursivePlanning@entry=true, originalQuery=0x24ebf18, query=0x23cf5e8, boundParams=0x0, hasUnresolvedParams=hasUnresolvedParams@entry=false, plannerRestrictionContext=0x22e0888)
at planner/distributed_planner.c:1185
#7 0x00007f938bd265c3 in CreateDistributedPlannedStmt (planContext=planContext@entry=0x7ffdf6a06f70) at planner/distributed_planner.c:737
#8 0x00007f938bd26ca0 in PlanDistributedStmt (rteIdCounter=<optimized out>, planContext=0x7ffdf6a06f70) at planner/distributed_planner.c:672
#9 distributed_planner (parse=0x23cf5e8,
query_string=0x22273d8 "SELECT\n userlist.tenant_id,\n COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS \"all_count\"\nFROM\n users userlist\nINNER JOIN\n (\nSELECT\n user_sessions.TENANT_ID,\n us"..., cursorOptions=<optimized out>,
boundParams=<optimized out>) at planner/distributed_planner.c:273
#10 0x00007f938eb1e55c in pgss_planner () from /usr/local/pgsql17.5-2/lib/pg_stat_statements.so
#11 0x000000000082cdab in pg_plan_query (querytree=0x23cf5e8,
query_string=query_string@entry=0x22273d8 "SELECT\n userlist.tenant_id,\n COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS \"all_count\"\nFROM\n users userlist\nINNER JOIN\n (\nSELECT\n user_sessions.TENANT_ID,\n us"...,
cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:908
#12 0x000000000082ce91 in pg_plan_queries (querytrees=0x22ddb30,
query_string=query_string@entry=0x22273d8 "SELECT\n userlist.tenant_id,\n COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS \"all_count\"\nFROM\n users userlist\nINNER JOIN\n (\nSELECT\n user_sessions.TENANT_ID,\n us"...,
cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:1000
#13 0x000000000082d146 in exec_simple_query (
query_string=0x22273d8 "SELECT\n userlist.tenant_id,\n COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS \"all_count\"\nFROM\n users userlist\nINNER JOIN\n (\nSELECT\n user_sessions.TENANT_ID,\n us"...) at postgres.c:1197
#14 0x000000000082ed6a in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4767
#15 0x0000000000829daf in BackendMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at backend_startup.c:105
#16 0x00000000007a4aaa in postmaster_child_launch (child_type=child_type@entry=B_BACKEND, startup_data=startup_data@entry=0x7ffdf6a078bc "", startup_data_len=startup_data_len@entry=4, client_sock=client_sock@entry=0x7ffdf6a078c0) at launch_backend.c:277
#17 0x00000000007a7fe1 in BackendStartup (client_sock=0x7ffdf6a078c0) at postmaster.c:3594
#18 ServerLoop () at postmaster.c:1676
#19 0x00000000007a9bea in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x21e2740) at postmaster.c:1374
#20 0x000000000050d77b in main (argc=3, argv=0x21e2740) at main.c:199
Here are some configure options and other details:
- PG configure options :
--prefix=/usr/local/pgsql17.5-2 --with-openssl --with-perl --with-llvm --with-systemd --enable-debug - gcc-11.5.0-5.el9_5.x86_64
- openssl-libs-3.2.2-6.el9_5.1.x86_64
- llvm-18.1.8-3.el9.x86_64
- citus settings in postgresql.conf
shared_preload_libraries = 'citus,pg_stat_statements'
wal_level = logical
PG17.5(debug) with citus-13.1.1.
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
GroupTargetEntryList (groupClauseList=0x19ab650, targetEntryList=0x17f33a0) at planner/multi_logical_optimizer.c:4389
4389 TargetEntry *groupTargetEntry =
(gdb) bt
#0 GroupTargetEntryList (groupClauseList=0x19ab650, targetEntryList=0x17f33a0) at planner/multi_logical_optimizer.c:4389
#1 0x00007f480d349373 in WindowPartitionOnDistributionColumn (query=<optimized out>) at planner/query_pushdown_planning.c:489
#2 SafeToPushdownWindowFunction (query=query@entry=0x186cff8, errorDetail=errorDetail@entry=0x0) at planner/query_pushdown_planning.c:451
#3 0x00007f480d33d9aa in MultiExtendedOpNode (queryTree=queryTree@entry=0x186cff8, originalQuery=originalQuery@entry=0x186cff8) at planner/multi_logical_planner.c:1762
#4 0x00007f480d33de19 in MultiNodeTree (queryTree=queryTree@entry=0x186cff8) at planner/multi_logical_planner.c:709
#5 0x00007f480d33e1e8 in MultiLogicalPlanCreate (originalQuery=originalQuery@entry=0x19acd18, queryTree=queryTree@entry=0x186cff8, plannerRestrictionContext=plannerRestrictionContext@entry=0x17f0888) at planner/multi_logical_planner.c:157
#6 0x00007f480d32bbde in CreateDistributedPlan (planId=planId@entry=5, allowRecursivePlanning=allowRecursivePlanning@entry=true, originalQuery=0x19acd18, query=0x186cff8, boundParams=0x0, hasUnresolvedParams=hasUnresolvedParams@entry=false, plannerRestrictionContext=0x17f0888)
at planner/distributed_planner.c:1204
#7 0x00007f480d32c353 in CreateDistributedPlannedStmt (planContext=planContext@entry=0x7fff7d104b10) at planner/distributed_planner.c:759
#8 0x00007f480d32ca28 in PlanDistributedStmt (rteIdCounter=<optimized out>, planContext=0x7fff7d104b10) at planner/distributed_planner.c:694
#9 distributed_planner (parse=0x186cff8,
query_string=0x17373d8 "SELECT\n userlist.tenant_id,\n COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS \"all_count\"\nFROM\n users userlist\nINNER JOIN\n (\nSELECT\n user_sessions.TENANT_ID,\n us"..., cursorOptions=<optimized out>,
boundParams=<optimized out>) at planner/distributed_planner.c:286
#10 0x00007f481072a55c in pgss_planner () from /usr/local/pgsql17.5-2/lib/pg_stat_statements.so
#11 0x000000000082cdab in pg_plan_query (querytree=0x186cff8,
query_string=query_string@entry=0x17373d8 "SELECT\n userlist.tenant_id,\n COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS \"all_count\"\nFROM\n users userlist\nINNER JOIN\n (\nSELECT\n user_sessions.TENANT_ID,\n us"...,
cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:908
#12 0x000000000082ce91 in pg_plan_queries (querytrees=0x17edb30,
query_string=query_string@entry=0x17373d8 "SELECT\n userlist.tenant_id,\n COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS \"all_count\"\nFROM\n users userlist\nINNER JOIN\n (\nSELECT\n user_sessions.TENANT_ID,\n us"...,
cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:1000
#13 0x000000000082d146 in exec_simple_query (
query_string=0x17373d8 "SELECT\n userlist.tenant_id,\n COUNT(userlist.customer_id) OVER (PARTITION BY userlist.tenant_id) AS \"all_count\"\nFROM\n users userlist\nINNER JOIN\n (\nSELECT\n user_sessions.TENANT_ID,\n us"...) at postgres.c:1197
#14 0x000000000082ed6a in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4767
#15 0x0000000000829daf in BackendMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at backend_startup.c:105
#16 0x00000000007a4aaa in postmaster_child_launch (child_type=child_type@entry=B_BACKEND, startup_data=startup_data@entry=0x7fff7d10545c "", startup_data_len=startup_data_len@entry=4, client_sock=client_sock@entry=0x7fff7d105460) at launch_backend.c:277
#17 0x00000000007a7fe1 in BackendStartup (client_sock=0x7fff7d105460) at postmaster.c:3594
#18 ServerLoop () at postmaster.c:1676
#19 0x00000000007a9bea in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x16f2740) at postmaster.c:1374
#20 0x000000000050d77b in main (argc=3, argv=0x16f2740) at main.c:199
(gdb)
Thanks @xrossdata we were able to reproduce the issue with this information. The problem is caused by Citus not handling an optimization that Postgres 17 applies to the window clause of the query; its partition clause can be changed from a SortGroupClause to a Path node by the Postgres planner, and Citus needs to handle this when its trying to determine if the window partition is on a distribution column. It currently assumes all partition clauses are sort group references. The issue does not happen when LEFT JOIN is changed to INNER JOIN because Postgres does not alter the window's partition clause. So something for us to investigate further and fix.
It looks like the problem may be caused by a dangling pointer; debugging the Postgres planner shows that optimization c651020 is applied to the windows clause, so that it has a zero-length (or NIL) partitionClause. However, after stepping out of function set_plan_references() , the paritionClause appears to be corrupted:
-exec p * (List* ) node->partitionClause
$3 = {type = T_List, length = 16724, max_length = 6, elements = 0x415b00000005, initial_elements = 0x55ab26d69b70}
Its not (yet) clear how this happens, but it looks like a memory overwrite, leading to a corrupted address. It does not cause a problem for Postgres, probably because the partitionClause field is not dereferenced by the executor. But Citus examines it so hits a SEGFAULT. For reference 9d9c02c is a recent(ish) PG commit that may be relevant.