postgres[495645]: segfault at 0 ip 00007f318b17e1f4 sp 00007ffc7f1b15d8 error 4 in citus.so[7f318b0a4000+ee000] likely on CPU 93 (core 1, socket 1)
Rocky Linux 9.3, 4-socket machine.
The scenario was:
- A few months ago: installed PG 16.2 + citus_16-12.1.3-1PGDG.rhel9.x86_64 and deployed user data.
- 2 days ago: updated to PG 16.3.
"server process (PID 487952) was terminated by signal 11: Segmentation fault","Failed process was running: select ct.conname as constraint_name, a.attname as column_name, fc.relname as foreign_table_name, fns.nspname as foreign_table_schema, fa.attname as foreign_column_name from (SELECT ct.conname, ct.conrelid, ct.confrelid, ct.conkey, ct.contype, ct.confkey, generate_subscripts(ct.conkey, 1) AS s FROM pg_constraint ct ) AS ct inner join pg_class c on c.oid=ct.conrelid inner join pg_namespace ns on c.relnamespace=ns.oid inner join pg_attribute a on a.attrelid=ct.conrelid and a.attnum = ct.conkey[ct.s] left join pg_class fc on fc.oid=ct.confrelid left join pg_namespace fns on fc.relnamespace=fns.oid left join pg_attribute fa on fa.attrelid=ct.confrelid and fa.attnum = ct.confkey[ct.s] where ct.contype='f' and c.relname='table1' and ns.nspname='schemauser' order by fns.nspname, fc.relname, a.attnum ;
"terminating any other active server processes",,,,,,,,,"","postmaster",,0
OS log said: postgres[495645]: segfault at 0 ip 00007f318b17e1f4 sp 00007ffc7f1b15d8 error 4 in citus.so[7f318b0a4000+ee000] likely on CPU 93 (core 1, socket 1)
After dropping the citus extension, the segfault is gone. Please let me know: am I missing something after the minor PG update?
I tried dropping and re-creating the extension; the result was the same.
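For anyone triaging a similar crash, the kernel segfault line quoted above can be decoded mechanically. The following is a minimal Python sketch, not part of the original report. It assumes the x86 Linux message format, in which the fault address, instruction pointer, error code, mapping base and mapping size are all printed in hexadecimal. The function name `decode_segfault` is hypothetical.

```python
import re

# Assumed format of the x86 Linux kernel message (all numbers are hex):
#   segfault at <addr> ip <ip> sp <sp> error <code> in <object>[<base>+<size>]
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one kernel segfault log line into a dict of named fields."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    err = int(m["err"], 16)
    offset = ip - base  # position of the faulting instruction inside the mapping
    return {
        "object": m["obj"],
        "fault_address": int(m["addr"], 16),
        "offset": offset,
        "in_mapping": 0 <= offset < size,
        # x86 page-fault error code bits
        "page_present": bool(err & 1),  # 0 means the page was not present
        "write": bool(err & 2),         # 0 means the access was a read
        "user_mode": bool(err & 4),     # 1 means the fault happened in user mode
    }


if __name__ == "__main__":
    line = (
        "postgres[495645]: segfault at 0 ip 00007f318b17e1f4 sp 00007ffc7f1b15d8 "
        "error 4 in citus.so[7f318b0a4000+ee000] likely on CPU 93 (core 1, socket 1)"
    )
    info = decode_segfault(line)
    print(info["object"], hex(info["offset"]), info)
```

For the line in this report, the fault address is 0 and the error code is 4, meaning a user-mode read of a page that is not present, which is consistent with a NULL pointer dereference inside citus.so. The instruction sits at offset 0xda1f4 from the start of the mapping the kernel reports. That mapping does not necessarily begin at the start of the file, so the offset may need adjusting before it is passed to a symbolizer.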
I have been seeing something similar on PG 16 on WSL Ubuntu 24.04 and 22.04. Introspection crashes the DB at the select query. CPU: i9-13980HX, latest Win11, WSL 2, etc.
Yes, this was a regression in 16.1.3; we're working on releasing a fix. See #7604
The mentioned pull request is merged and the version is bumped, but this issue is still open and the 12.1.4 release is listed neither in https://github.com/citusdata/citus/releases nor in https://www.citusdata.com/updates/v12-1 . That's confusing.
Thanks @Green-Chan for the heads up. I added the release. https://github.com/citusdata/citus/releases/tag/v12.1.4 Closing the issue
Yesterday I updated to citus 12.1.4 and the problem still exists. However, I think the root cause is not citus; my guess is the NUMA or transparent_hugepage settings.
I changed the Rocky 9 default boot parameters:
- disable transparent_hugepage
- numa=off
Now the error is gone.
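To confirm that boot parameters like these actually took effect after a reboot, the running kernel's state can be read back. This is a minimal Python sketch under some assumptions: the standard Linux paths `/sys/kernel/mm/transparent_hugepage/enabled` and `/proc/cmdline` are available, and transparent hugepages were disabled with `transparent_hugepage=never` (the exact value used is not stated above). The helper names are hypothetical.

```python
from pathlib import Path


def active_thp_mode(text):
    """Return the bracketed (active) mode from a string like 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def cmdline_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


if __name__ == "__main__":
    thp_path = Path("/sys/kernel/mm/transparent_hugepage/enabled")
    cmdline_path = Path("/proc/cmdline")
    if thp_path.exists():
        print("active THP mode:", active_thp_mode(thp_path.read_text()))
    if cmdline_path.exists():
        params = cmdline_params(cmdline_path.read_text())
        print("numa =", params.get("numa"))
        print("transparent_hugepage =", params.get("transparent_hugepage"))
```

If the settings are in place, the active THP mode should read `never` and the command line should contain `numa=off`.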