citus
citus copied to clipboard
Potential Memory Leak with `jsonb_build_object()` on Columnar Tables
PosgreSQL's jsonb_build_object() over columnar tables consumes excessive memory and leads to OOM crash.
Hi team, I'm running into what appears to be a memory leak when executing a GROUP BY query using jsonb_build_object() over a few columnar tables on:
- PostgreSQL: 16
- Citus: 13
- Hosted on a single EC2 instance with ~250 GB RAM.
Symptoms:
The query runs for ~3 minutes, consuming:
- 155GB RES
- 290GB VIRT
- 63%+ of total memory
Once memory usage crosses 206GB+, the server eventually crashes with an OOM error if not manually killed. The query itself is a fairly typical OLAP-style aggregation.
Query Summary:
The pattern (simplified) looks like:
WITH filtered_data AS (
SELECT ...
FROM columnar_table_1 AS a
JOIN columnar_table_2 AS b ON ...
LEFT JOIN columnar_table_3 AS c ON ...
WHERE c.some_json_field ->> 'some_key' = 'some_value'
)
SELECT
jsonb_build_object(...),
some_keys,
SUM(...),
VARIANCE(...)
FROM filtered_data
GROUP BY ...;
- All input tables are columnar.
- Two of the inputs are joined on multiple fields, and the main aggregation is done after a
jsonb_build_object()call in theSELECT.
Observations:
- The problem only occurs when
jsonb_build_object()is used. - No UDFs or temporary tables are involved.
- Setting
max_parallel_workers_per_gather = 0and disablingcolumnar.enable_custom_scandoesn't resolve the issue.
Question:
- Is this a known limitation or a bug in how Citus handles
jsonb_build_object()over columnar data with parallel execution or row-materialization? - Any recommendations on safely constructing JSON from large columnar datasets?