Segmentation fault issues in pglogical version 2.4.6
I keep hitting the same crash reported in https://github.com/2ndQuadrant/pglogical/issues/446 and https://github.com/2ndQuadrant/pglogical/issues/367 when syncing non-partitioned tables to partitioned tables, including on the latest pglogical version, 2.4.6.
2025-11-02 07:25:56.842 UTC [2651842] LOG: background worker "pglogical apply 16593:2528659118" (PID 1158871) was terminated by signal 11: Segmentation fault
I already tried the workaround of re-adding the non-partitioned table on the provider with synchronize_data := false, but this doesn't fix the issue.
SELECT pglogical.replication_set_add_table(
    set_name := 'repset',
    relation := 'public.part',
    synchronize_data := false
);
SELECT pglogical.replication_set_add_all_tables(
    set_name := 'repset',
    schema_names := ARRAY['public'],
    synchronize_data := false
);
I understand that a schema mismatch can cause problems, but there should be a better way to handle this kind of inconsistency error: it causes a continuous crash loop on the subscriber node, which makes it impossible to run any query or command there. The only way we found to stop the node from crashing is to stop the subscriber node and then drop the slot on the publisher.
source=# SELECT slot_name, active, restart_lsn
FROM pg_replication_slots
WHERE slot_name LIKE '%s_rep%';
slot_name | active | restart_lsn
------------------------+--------+-------------
pgl_target__ode1_s_rep | t | 0/1B7D230
(1 row)
SELECT pg_drop_replication_slot('pgl_target__ode1_s_rep');
I tested this scenario on multiple versions of both pglogical and PostgreSQL, and the outcome was the same crash every time.
pglogical versions: 2.4.4, 2.4.5, 2.4.6
PostgreSQL versions: 16.x, 17.x
Root Cause Analysis: Generated Columns Crash
After extensive debugging, I've identified a root cause for crashes involving tables with filtered column lists (using replication_set_add_table with a columns parameter).
The Problem
When a table has columns excluded from replication (e.g., a generated column), fill_missing_defaults() in pglogical_apply_heap.c attempts to evaluate "defaults" for those missing columns. For generated columns (PostgreSQL 12+), this causes a segfault because:
- Generated columns don't have traditional defaults - they have generation expressions
- The generation expression depends on other column values that may not be properly accessible in this context
- build_column_default() returns the generation expression, and ExecEvalExpr() crashes when evaluating it
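To check whether a subscriber has tables that could hit this path, one option is to list the generated columns from the catalog. This is only a sketch: the `public` schema is an assumption taken from the examples earlier in this thread, and the query does not tell you which columns are actually excluded from a replication set.

```sql
-- List generated columns in the public schema (schema name is an assumption).
-- information_schema.columns.is_generated is 'ALWAYS' for generated columns.
SELECT table_schema,
       table_name,
       column_name,
       generation_expression
FROM information_schema.columns
WHERE table_schema = 'public'
  AND is_generated = 'ALWAYS'
ORDER BY table_name, ordinal_position;
```

Any table returned here that is also replicated with a column list omitting the generated column is a candidate for the crash described above.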
Reproduction
- Table with a GENERATED ALWAYS AS ... STORED column
- Exclude the generated column via replication_set_add_table(..., columns := ARRAY[...])
- Perform an UPDATE on the provider
- Subscriber crashes with SIGSEGV in the apply worker
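The steps above can be sketched roughly as follows. The table name `gen_test`, its columns, and the generation expression are hypothetical and chosen only for illustration; the sketch also assumes a replication set named 'repset' already exists on the provider and that the subscriber is subscribed to it.

```sql
-- On BOTH provider and subscriber: hypothetical table with a stored generated column.
CREATE TABLE public.gen_test (
    id    integer PRIMARY KEY,
    price numeric NOT NULL,
    qty   integer NOT NULL,
    total numeric GENERATED ALWAYS AS (price * qty) STORED
);

-- On the provider: replicate every column except the generated one.
-- The column list has to include the primary key (replica identity).
SELECT pglogical.replication_set_add_table(
    set_name := 'repset',
    relation := 'public.gen_test',
    synchronize_data := false,
    columns := ARRAY['id', 'price', 'qty']
);

-- On the provider: create a row, then update it.
INSERT INTO public.gen_test (id, price, qty) VALUES (1, 10, 2);
UPDATE public.gen_test SET qty = 3 WHERE id = 1;
```

If the analysis above is right, the UPDATE is the step after which the subscriber's apply worker should be terminated by signal 11.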
Proposed Fix
Skip generated columns in fill_missing_defaults() since PostgreSQL automatically computes them during INSERT/UPDATE:
// pglogical_apply_heap.c, in fill_missing_defaults(), inside the for loop:
Form_pg_attribute att = TupleDescAttr(desc, attnum);

if (att->attisdropped)
    continue;
if (physatt_in_attmap(rel, attnum))
    continue;

/* Skip generated columns - computed automatically by PostgreSQL */
#if PG_VERSION_NUM >= 120000
if (att->attgenerated)
    continue;
#endif

defexpr = (Expr *) build_column_default(rel->rel, attnum + 1);
This fix is minimal, backwards-compatible (guarded for PG12+), and resolves the crash in my testing. Happy to submit a PR if this approach looks reasonable.
Environment
- PostgreSQL 18.1
- pglogical 2.4.6
- macOS