Server crashes when the "scan_postgres_tables" test is run in parallel
What happens?
If the "scan_postgres_tables" test is run in parallel as shown below, the server crashes.
PostgreSQL build:
CPPFLAGS="-Og -fsanitize=address -fsanitize=undefined -fno-sanitize-recover=all -fno-sanitize=nonnull-attribute -fstack-protector" \
LDFLAGS='-fsanitize=address -fsanitize=undefined -static-libasan' \
./configure --enable-crash-info --enable-tap-tests --with-openssl --enable-debug --enable-cassert --with-icu --with-lz4 --with-libxml
export ASAN_OPTIONS=detect_stack_use_after_return=0:detect_leaks=0:abort_on_error=1:disable_coredump=0:strict_string_checks=1:check_initialization_order=1:strict_init_order=1:detect_odr_violation=0
To Reproduce
Patch test/regression/schedule:
test: scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables
Then execute
make installcheck
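For anyone who wants a different number of parallel copies, the schedule line above can be generated rather than typed by hand. This is only a sketch: `make_parallel_group` is a hypothetical helper that is not part of pg_duckdb, and it only prints the line so it can be reviewed before being added to test/regression/schedule.

```shell
# Build a pg_regress schedule line that repeats one test N times,
# so that all copies land in a single parallel group.
# make_parallel_group is a hypothetical helper, not part of pg_duckdb.
make_parallel_group() {
    test_name=$1
    count=$2
    line="test:"
    i=0
    while [ "$i" -lt "$count" ]; do
        line="$line $test_name"
        i=$((i + 1))
    done
    printf '%s\n' "$line"
}

# Print the line used in this report (15 copies of the same test).
make_parallel_group scan_postgres_tables 15
```

The printed line should match the one patched into the schedule in the steps above.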
regression.out:
# parallel group (15 tests): scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables scan_postgres_tables
not ok 1 + scan_postgres_tables 22236 ms
# (test process exited with exit code 2)
not ok 2 + scan_postgres_tables 22045 ms
# (test process exited with exit code 2)
not ok 3 + scan_postgres_tables 22260 ms
# (test process exited with exit code 2)
not ok 4 + scan_postgres_tables 22274 ms
# (test process exited with exit code 2)
not ok 5 + scan_postgres_tables 22153 ms
# (test process exited with exit code 2)
not ok 6 + scan_postgres_tables 22256 ms
# (test process exited with exit code 2)
not ok 7 + scan_postgres_tables 22215 ms
# (test process exited with exit code 2)
not ok 8 + scan_postgres_tables 22263 ms
# (test process exited with exit code 2)
not ok 9 + scan_postgres_tables 22115 ms
# (test process exited with exit code 2)
not ok 10 + scan_postgres_tables 25378 ms
# (test process exited with exit code 2)
not ok 11 + scan_postgres_tables 22264 ms
# (test process exited with exit code 2)
not ok 12 + scan_postgres_tables 22265 ms
# (test process exited with exit code 2)
not ok 13 + scan_postgres_tables 22258 ms
# (test process exited with exit code 2)
not ok 14 + scan_postgres_tables 22240 ms
# (test process exited with exit code 2)
not ok 15 + scan_postgres_tables 22246 ms
# (test process exited with exit code 2)
1..15
# 15 of 15 tests failed.
backtrace:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007f77b41c9f4f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 0x00007f77b417afb2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f77b4165472 in __GI_abort () at ./stdlib/abort.c:79
#4 0x0000558b2551c51f in __sanitizer::Abort() ()
#5 0x0000558b25528bb1 in __sanitizer::Die() ()
#6 0x0000558b25507f6e in __asan::ScopedInErrorReport::~ScopedInErrorReport() ()
#7 0x0000558b255074d6 in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) ()
#8 0x0000558b255085bc in __asan_report_load8 ()
#9 0x00007f77af5013c9 in pgduckdb::PostgresTableReader::GetNextTuple (this=this@entry=0x607000090b70) at src/scan/postgres_table_reader.cpp:271
#10 0x00007f77af4e4ce0 in pgduckdb::PostgresScanTableFunction::PostgresScanFunction (data=..., output=...) at src/scan/postgres_scan.cpp:261
#11 0x00007f77ad47903c in duckdb::PhysicalTableScan::GetData(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSourceInput&) const () from /tmp/pgsql/lib/libduckdb.so
#12 0x00007f77ad5fb9ab in duckdb::PipelineExecutor::FetchFromSource(duckdb::DataChunk&) () from /tmp/pgsql/lib/libduckdb.so
#13 0x00007f77ad605de7 in duckdb::PipelineExecutor::Execute(unsigned long) () from /tmp/pgsql/lib/libduckdb.so
#14 0x00007f77ad60611f in duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) () from /tmp/pgsql/lib/libduckdb.so
#15 0x00007f77ad5fd1a1 in duckdb::ExecutorTask::Execute(duckdb::TaskExecutionMode) () from /tmp/pgsql/lib/libduckdb.so
#16 0x00007f77ad604f52 in duckdb::TaskScheduler::ExecuteForever(std::atomic<bool>*) () from /tmp/pgsql/lib/libduckdb.so
#17 0x00007f77b30d44a3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#18 0x00007f77b41c81f5 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#19 0x00007f77b424889c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
OS:
Debian 12 x86_64
pg_duckdb Version (if built from source use commit hash):
e99ab5d50e59717a73d6a29dbc364e26ebfb83ea
Postgres Version (if built from source use commit hash):
b19893b94bdea3b206cb544619d84cea6276f648
Hardware:
No response
Full Name:
Egor Chindyaskin
Affiliation:
Postgres Professional
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a source build
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
- [ ] Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?
- [ ] Yes, I have
I tried reproducing this issue myself, but I was unable to. Does this still reproduce for you?
@JelteF, I am still able to reproduce this crash.
I ran make installcheck in a loop until a core dump appeared:
for i in `seq 10000`;do make installcheck;if coredumpctl;then break;fi;done
#0 0x00007fc1bf53fdd7 in _dl_fixup (l=0x7fc1b9fb6460, reloc_arg=1144) at ./elf/dl-runtime.c:48
#1 0x00007fc1bf5422ba in _dl_runtime_resolve_xsavec () at ../sysdeps/x86_64/dl-trampoline.h:130
#2 0x0000557c4a40e894 in CopyErrorData () at elog.c:1762
#3 0x00007fc1b9f1f228 in pgduckdb::__PostgresFunctionGuard__<TupleTableSlot* (*)(PlanState*), ExecProcNode, PlanState*> (
func_name=func_name@entry=0x7fc1b9fb6d80 "ExecProcNode") at /usr/include/c++/12/bits/new_allocator.h:80
#4 0x00007fc1b9f20d39 in pgduckdb::PostgresTableReader::GetNextTuple (this=this@entry=0x6070000cb420) at src/scan/postgres_table_reader.cpp:272
#5 0x00007fc1b9f04566 in pgduckdb::PostgresScanTableFunction::PostgresScanFunction (data=..., output=...) at src/scan/postgres_scan.cpp:261
#6 0x00007fc1b7e73d2a in duckdb::PhysicalTableScan::GetData(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSourceInput&) const ()
from /tmp/pgsql/lib/libduckdb.so
#7 0x00007fc1b7ff731b in duckdb::PipelineExecutor::FetchFromSource(duckdb::DataChunk&) () from /tmp/pgsql/lib/libduckdb.so
#8 0x00007fc1b8001657 in duckdb::PipelineExecutor::Execute(unsigned long) () from /tmp/pgsql/lib/libduckdb.so
#9 0x00007fc1b800198f in duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) () from /tmp/pgsql/lib/libduckdb.so
#10 0x00007fc1b7ff8a09 in duckdb::ExecutorTask::Execute(duckdb::TaskExecutionMode) () from /tmp/pgsql/lib/libduckdb.so
#11 0x00007fc1b80007c2 in duckdb::TaskScheduler::ExecuteForever(std::atomic<bool>*) () from /tmp/pgsql/lib/libduckdb.so
#12 0x00007fc1bdad44a3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#13 0x00007fc1bdca81f5 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#14 0x00007fc1bdd2889c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
not ok 1 + scan_postgres_tables 43534 ms
# (test process exited with exit code 2)
not ok 2 + scan_postgres_tables 43356 ms
# (test process exited with exit code 2)
not ok 3 + scan_postgres_tables 43494 ms
# (test process exited with exit code 2)
not ok 4 + scan_postgres_tables 43268 ms
# (test process exited with exit code 2)
not ok 5 + scan_postgres_tables 43540 ms
# (test process exited with exit code 2)
not ok 6 + scan_postgres_tables 43469 ms
# (test process exited with exit code 2)
not ok 7 + scan_postgres_tables 43469 ms
# (test process exited with exit code 2)
not ok 8 + scan_postgres_tables 46470 ms
# (test process exited with exit code 2)
not ok 9 + scan_postgres_tables 43533 ms
# (test process exited with exit code 2)
not ok 10 + scan_postgres_tables 43529 ms
# (test process exited with exit code 2)
not ok 11 + scan_postgres_tables 43532 ms
# (test process exited with exit code 2)
not ok 12 + scan_postgres_tables 43513 ms
# (test process exited with exit code 2)
not ok 13 + scan_postgres_tables 43537 ms
# (test process exited with exit code 2)
not ok 14 + scan_postgres_tables 43522 ms
# (test process exited with exit code 2)
not ok 15 + scan_postgres_tables 43451 ms
# (test process exited with exit code 2)
1..15
# 15 of 15 tests failed.
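A variant of the loop from this comment, for setups where coredumpctl is unavailable or already lists older dumps. It is a sketch under one assumption: that `make installcheck` exits non-zero when a test fails, so the exit status is used as the stop condition instead of coredumpctl. `run_until_failure` is a hypothetical helper name.

```shell
# Re-run a command until it fails or the attempt limit is reached.
# run_until_failure is a hypothetical helper used only for this sketch.
# Returns 0 if a failure was observed, 1 if every attempt succeeded.
run_until_failure() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if ! "$@"; then
            echo "command failed on attempt $attempt"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "no failure in $max_attempts attempts"
    return 1
}

# Usage (assumes it is run from the pg_duckdb source tree):
#   run_until_failure 10000 make installcheck
```

Unlike the original loop, this stops on any test failure, not only on a crash, so the backtrace still has to be collected separately from the core dump.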
Not reproducible on my side either. By the way, how do you run scan_postgres_tables in parallel within a single installcheck command? The tests might conflict due to the same table name and setting the same GUC.
Hi @saygoodbyye, I'm looking into this one. I tried to compile PG with the same options you provided, but got:
configure: WARNING: unrecognized options: --enable-crash-info
I've checked the PG source code and I don't see any reference to this option. Am I missing something?
Thanks!
@Y--, this option is unnecessary; please ignore it.
I'm hopeful that this has been fixed by #877, so I'm closing this. If you can still reproduce it, please reopen (or open a new issue).