
Segmentation fault when running pgx schema under CentOS 7

Open sumerman opened this issue 3 years ago • 11 comments

Having a weird segfault when building our extension under CentOS 7:

cargo pgx schema pg14 --out sql/promscale--0.5.0.sql --release fails right after it outputs

Discovered 32 SQL entities: 8 schemas (1 unique), 15 functions, 1 types, 0 enums, 8 sqls, 0 ords, 0 hashes, 0 aggregates

Stacktrace:

#0  0x0000556360e83f59 in _$LT$hashbrown..raw..RawTable$LT$T$C$A$GT$$u20$as$u20$core..clone..Clone$GT$::clone::hf1a3b5f5c3938bdd ()
#1  0x0000556360e60019 in cargo_pgx::command::schema::generate_schema::hf859e0c62c7b6ffe ()
#2  0x0000556360e4c855 in _$LT$cargo_pgx..command..schema..Schema$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::h2b5fc229f7ddfe8f ()
#3  0x0000556360e81333 in _$LT$cargo_pgx..command..pgx..Pgx$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::h357ae0436882b1ea ()
#4  0x0000556360e9a713 in cargo_pgx::main::hd41f23561fe6f6ed ()
#5  0x0000556360e00033 in std::sys_common::backtrace::__rust_begin_short_backtrace::h19ae7bf5592d47e6 ()
#6  0x0000556360e2e35d in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h4fa81625e68f042a ()
#7  0x000055636130a65b in call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> () at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/ops/function.rs:259
#8  do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> ()
    at library/std/src/panicking.rs:403
#9  try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> ()
    at library/std/src/panicking.rs:367
#10 catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> ()
    at library/std/src/panic.rs:133
#11 {closure#2} () at library/std/src/rt.rs:128
#12 do_call<std::rt::lang_start_internal::{closure#2}, isize> () at library/std/src/panicking.rs:403
#13 try<isize, std::rt::lang_start_internal::{closure#2}> () at library/std/src/panicking.rs:367
#14 catch_unwind<std::rt::lang_start_internal::{closure#2}, isize> () at library/std/src/panic.rs:133
#15 std::rt::lang_start_internal::hd15a47be08101c28 () at library/std/src/rt.rs:128
#16 0x0000556360e9c392 in main ()

It does not seem to be related to #371

To reproduce the issue you could use dist/rpm.dockerfile from this PR:

docker build --target builder-cache -t debug-build --build-arg OS_NAME=centos --build-arg OS_VERSION=7 --build-arg PG_VERSION=14 --build-arg RUST_VERSION=1.57.0 --build-arg RELEASE_FILE_NAME=promscale-extension-0.5.0.pg14.centos7.x86_64.rpm -f dist/rpm.dockerfile .
docker run --rm -it debug-build bash

then you can run cargo pgx schema pg14 --out sql/promscale--0.5.0.sql --release inside

sumerman avatar May 23 '22 18:05 sumerman

do you also get a segfault without --release? That might make the stack trace a lot better...

eeeebbbbrrrr avatar May 23 '22 18:05 eeeebbbbrrrr

> do you also get a segfault without --release? That might make the stack trace a lot better...

Not much better, unfortunately (looks exactly the same to me):

#0  0x0000559798683f59 in _$LT$hashbrown..raw..RawTable$LT$T$C$A$GT$$u20$as$u20$core..clone..Clone$GT$::clone::hf1a3b5f5c3938bdd ()
#1  0x0000559798660019 in cargo_pgx::command::schema::generate_schema::hf859e0c62c7b6ffe ()
#2  0x000055979864c855 in _$LT$cargo_pgx..command..schema..Schema$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::h2b5fc229f7ddfe8f ()
#3  0x0000559798681333 in _$LT$cargo_pgx..command..pgx..Pgx$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::h357ae0436882b1ea ()
#4  0x000055979869a713 in cargo_pgx::main::hd41f23561fe6f6ed ()
#5  0x0000559798600033 in std::sys_common::backtrace::__rust_begin_short_backtrace::h19ae7bf5592d47e6 ()
#6  0x000055979862e35d in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h4fa81625e68f042a ()
#7  0x0000559798b0a65b in call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> () at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/ops/function.rs:259
#8  do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> ()
    at library/std/src/panicking.rs:403
#9  try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> ()
    at library/std/src/panicking.rs:367
#10 catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> ()
    at library/std/src/panic.rs:133
#11 {closure#2} () at library/std/src/rt.rs:128
#12 do_call<std::rt::lang_start_internal::{closure#2}, isize> () at library/std/src/panicking.rs:403
#13 try<isize, std::rt::lang_start_internal::{closure#2}> () at library/std/src/panicking.rs:367
#14 catch_unwind<std::rt::lang_start_internal::{closure#2}, isize> () at library/std/src/panic.rs:133
#15 std::rt::lang_start_internal::hd15a47be08101c28 () at library/std/src/rt.rs:128
#16 0x000055979869c392 in main ()

sumerman avatar May 23 '22 18:05 sumerman

I had to re-install cargo-pgx with --debug to get a better stacktrace:

#0  core::clone::impls::_$LT$impl$u20$core..clone..Clone$u20$for$u20$u64$GT$::clone::ha646e36ad2dd08c1 (self=0x7fac2f4bd048)
    at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/clone.rs:185
#1  0x000055a4793f22b7 in _$LT$std..collections..hash..map..RandomState$u20$as$u20$core..clone..Clone$GT$::clone::h3c0527dfc1747d34 (
    self=0x7fac2f4bd048) at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/collections/hash/map.rs:2882
#2  0x000055a4794b2e75 in _$LT$hashbrown..map..HashMap$LT$K$C$V$C$S$GT$$u20$as$u20$core..clone..Clone$GT$::clone::hb2878e43a3667026 (
    self=0x7fac2f4bd048) at /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.11.0/src/map.rs:200
#3  0x000055a4793f211d in _$LT$hashbrown..set..HashSet$LT$T$C$S$GT$$u20$as$u20$core..clone..Clone$GT$::clone::h528a62b8b892c18d (
    self=0x7fac2f4bd048) at /cargo/registry/src/github.com-1ecc6299db9ec823/hashbrown-0.11.0/src/set.rs:122
#4  0x000055a479459ced in _$LT$std..collections..hash..set..HashSet$LT$T$C$S$GT$$u20$as$u20$core..clone..Clone$GT$::clone::h8a480bd586feca8b (
    self=0x7fac2f4bd048) at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/collections/hash/set.rs:942
#5  0x000055a478f9c4a9 in cargo_pgx::command::schema::generate_schema::h15c6f324857bfc6a (pg_config=0x7fffe8a433c0, user_manifest_path=...,
    user_package=..., package_manifest_path=..., is_release=true, is_test=false, features=0x7fffe8a43a30, path=..., dot=..., log_level=...,
    skip_build=false) at /home/builder/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-pgx-0.4.4/src/command/schema.rs:403
#6  0x000055a478f8c36b in _$LT$cargo_pgx..command..schema..Schema$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::hb3b79c6321f6619c (self=...)
    at /home/builder/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-pgx-0.4.4/src/command/schema.rs:116
#7  0x000055a478fd5c6a in _$LT$cargo_pgx..command..pgx..CargoPgxSubCommands$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::h948df9d8f6fd95da (
    self=...) at /home/builder/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-pgx-0.4.4/src/command/pgx.rs:56
#8  0x000055a478fd54b3 in _$LT$cargo_pgx..command..pgx..Pgx$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::h5828bf6f780956cb (self=...)
    at /home/builder/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-pgx-0.4.4/src/command/pgx.rs:24
#9  0x000055a4790002bd in _$LT$cargo_pgx..CargoSubcommands$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::h6e306bad8eea3ad7 (self=...)
    at /home/builder/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-pgx-0.4.4/src/main.rs:49
#10 0x000055a479000263 in _$LT$cargo_pgx..CargoCommand$u20$as$u20$cargo_pgx..CommandExecute$GT$::execute::hf8e2912a976c3113 (self=...)
    at /home/builder/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-pgx-0.4.4/src/main.rs:36
#11 0x000055a479002074 in cargo_pgx::main::h34a91cb92aaf6bc9 ()
    at /home/builder/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-pgx-0.4.4/src/main.rs:101
#12 0x000055a478eab9db in core::ops::function::FnOnce::call_once::hc233d67e1da49bbe ()
    at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/ops/function.rs:227
#13 0x000055a478e4d10e in std::sys_common::backtrace::__rust_begin_short_backtrace::h3325f6e8898159f7 (
    f=0x55a4790002d0 <cargo_pgx::main::h34a91cb92aaf6bc9>)
    at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/sys_common/backtrace.rs:123
#14 0x000055a478df5f91 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h8716a32f8d64fc97 ()
    at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/rt.rs:146
#15 0x000055a479d1831b in call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> () at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/ops/function.rs:259
#16 do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> ()
    at library/std/src/panicking.rs:403
#17 try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> ()
    at library/std/src/panicking.rs:367
#18 catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> ()
    at library/std/src/panic.rs:133
#19 {closure#2} () at library/std/src/rt.rs:128
#20 do_call<std::rt::lang_start_internal::{closure#2}, isize> () at library/std/src/panicking.rs:403
#21 try<isize, std::rt::lang_start_internal::{closure#2}> () at library/std/src/panicking.rs:367
#22 catch_unwind<std::rt::lang_start_internal::{closure#2}, isize> () at library/std/src/panic.rs:133
#23 std::rt::lang_start_internal::hd15a47be08101c28 () at library/std/src/rt.rs:128
#24 0x000055a478df5f60 in std::rt::lang_start::hb6b67c9c64bebf02 (main=0x55a4790002d0 <cargo_pgx::main::h34a91cb92aaf6bc9>, argc=7,
    argv=0x7fffe8a4f8a8) at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/rt.rs:145
#25 0x000055a479011c5c in main ()

sumerman avatar May 24 '22 08:05 sumerman

I managed to reproduce this on our aggregate example using a fresh CentOS 7 VM. It's... interesting.

It seems to originate from here, though I haven't hazarded a guess as to why: https://github.com/tcdi/pgx/blob/1a89778046547c52c714fe6885069eab5638f965/pgx-utils/src/sql_entity_graph/pgx_sql.rs#L221-L222

https://github.com/tcdi/pgx/blob/1a89778046547c52c714fe6885069eab5638f965/cargo-pgx/src/command/schema.rs#L403-L405

It also renders my terminal unusable... [screenshot]

I was noodling around and setting those to Default::default() and started to get different errors about memcmp and the like. This made me suspicious because those symbols should most definitely be loaded.

So I moved the drop guard for the libraries and indeed I got an entirely different symbol error. It seems we might have bumped into a glibc 2.17 bug around dlclose taking with it more than is needed.

Using std::mem::forget on the libraries solves the issue, so I'm pondering what is the best way to actually do this.
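
A minimal sketch of why `std::mem::forget` sidesteps the problem, assuming the library handle closes itself in its `Drop` impl (as wrappers around `dlclose` typically do). `FakeLibrary` and `close_count` are hypothetical stand-ins, not pgx code, and no real shared library is loaded here:

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for a loaded shared library handle.
/// A real handle would call `dlclose` in `Drop`; this one only counts the call.
struct FakeLibrary {
    closes: Rc<Cell<usize>>,
}

impl Drop for FakeLibrary {
    fn drop(&mut self) {
        // Stand-in for the `dlclose` call that unloads the library.
        self.closes.set(self.closes.get() + 1);
    }
}

/// Returns how many times the stand-in "dlclose" ran for one handle.
fn close_count(leak: bool) -> usize {
    let closes = Rc::new(Cell::new(0));
    let lib = FakeLibrary { closes: Rc::clone(&closes) };
    if leak {
        // `forget` skips `Drop`, so the library is never closed (and is leaked).
        std::mem::forget(lib);
    } else {
        // Normal path: `Drop` runs and the library is closed.
        drop(lib);
    }
    closes.get()
}

fn main() {
    println!("dropped normally: {} close call(s)", close_count(false));
    println!("forgotten: {} close call(s)", close_count(true));
}
```

The trade-off is that the handle is deliberately leaked for the rest of the process, which is probably acceptable for a short-lived CLI run but would need more thought in a long-running program.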

Hoverbear avatar May 24 '22 18:05 Hoverbear

@sumerman could you give that fix a shot?

Hoverbear avatar May 24 '22 19:05 Hoverbear

@Hoverbear I'm not able to upgrade pgx to 0.4.5 within that container, I keep getting

error: there is no argument named `message`
   --> /home/builder/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-pg-sys-0.4.5/src/submodules/guard.rs:275:21
    |
275 |         Ok(format!("{message} at {filename}:{lineno}:{colno}"))
    |                     ^^^^^^^^^

sumerman avatar May 25 '22 15:05 sumerman

Please upgrade your Rust version. Looks like we used format!("{val}"), which I believe requires Rust 1.58.
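
For reference, a small sketch of the difference. Captured identifiers inside format strings were stabilized in Rust 1.58, so the first form fails on 1.57 with the error above, while the positional form compiles on both. The function names here are hypothetical and are not taken from pgx-pg-sys:

```rust
/// Uses inline captured identifiers; needs Rust 1.58 or newer.
fn describe_inline(message: &str, filename: &str, lineno: u32, colno: u32) -> String {
    format!("{message} at {filename}:{lineno}:{colno}")
}

/// Uses positional arguments; also compiles on older toolchains such as 1.57.
fn describe_positional(message: &str, filename: &str, lineno: u32, colno: u32) -> String {
    format!("{} at {}:{}:{}", message, filename, lineno, colno)
}

fn main() {
    // Both forms produce the same string.
    println!("{}", describe_inline("boom", "guard.rs", 275, 21));
    println!("{}", describe_positional("boom", "guard.rs", 275, 21));
}
```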

Hoverbear avatar May 25 '22 16:05 Hoverbear

I hate myself a little bit as I accidentally pushed our MSRV bump to develop, but thanks for catching it: https://github.com/tcdi/pgx/commit/0da3b3f5f53751c4ce5f4aeaac3a24ef34036958

Hoverbear avatar May 25 '22 16:05 Hoverbear

Thank you! Seems like it did the trick locally in Docker. Trying to figure out what's up with our CI.

sumerman avatar May 25 '22 17:05 sumerman

Never mind, I missed passing the pg-version feature in one of the places. It's all good 👍🏼

sumerman avatar May 25 '22 18:05 sumerman

(Noting this is still open as we haven't released #573...)

Hoverbear avatar Jun 28 '22 20:06 Hoverbear