yugabyte-db
[YSQL] AbortTransaction PG core dump occurs in various scenarios
Jira Link: DB-6447
Description
We are already observing this issue in various scenarios across the stress suites.
(lldb) target create "/home/yugabyte/yb-software/yugabyte-2.19.0.0-b114-centos-x86_64/postgres/bin/postgres" --core "/home/yugabyte/cores/core_14000_1682836085_!home!yugabyte!yb-software!yugabyte-2.19.0.0-b114-centos-x86_64!postgres!bin!postgres"
Core file '/home/yugabyte/cores/core_14000_1682836085_!home!yugabyte!yb-software!yugabyte-2.19.0.0-b114-centos-x86_64!postgres!bin!postgres' (x86_64) was loaded.
(lldb) bt all
warning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
* thread #1, name = 'postgres', stop reason = signal SIGABRT
* frame #0: 0x00007f66a1c310a7 libc.so.6`__GI_raise(sig=6) at raise.c:54
frame #1: 0x00007f66a1c324aa libc.so.6`__GI_abort at abort.c:89
frame #2: 0x000055e9f60772ba postgres`errfinish(dummy=<unavailable>) at elog.c:815:3
frame #3: 0x000055e9f607f729 postgres`elog_start(filename="", lineno=2744, funcname=<unavailable>) at elog.c:1698:3
frame #4: 0x000055e9f5a866c3 postgres`AbortTransaction at xact.c:2743:3
frame #5: 0x000055e9f5a89b2a postgres`AbortCurrentTransaction at xact.c:0:4
frame #6: 0x000055e9f5ecca8e postgres`PostgresMain(argc=<unavailable>, argv=<unavailable>, dbname=<unavailable>, username=<unavailable>) at postgres.c:5024:3
frame #7: 0x000055e9f5e0f5de postgres`BackendRun(port=0x000055e9f91121e0) at postmaster.c:4676:2
frame #8: 0x000055e9f5e0e6a0 postgres`ServerLoop [inlined] BackendStartup(port=0x000055e9f91121e0) at postmaster.c:4314:3
frame #9: 0x000055e9f5e0e61a postgres`ServerLoop at postmaster.c:1774:7
frame #10: 0x000055e9f5e09b15 postgres`PostmasterMain(argc=23, argv=0x000055e9f91283c0) at postmaster.c:1430:11
frame #11: 0x000055e9f5d0ef0f postgres`PostgresServerProcessMain(argc=23, argv=0x000055e9f91283c0) at main.c:234:3
frame #12: 0x000055e9f59d60b2 postgres`main + 34
frame #13: 0x00007f66a1c1e825 libc.so.6`__libc_start_main(main=(postgres`main), argc=23, argv=0x00007ffe00acad88, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffe00acad78) at libc-start.c:289
frame #14: 0x000055e9f59d5fc9 postgres`_start at start.S:108
thread #2, stop reason = signal 0
frame #0: 0x00007f66a25ac3b8 libpthread.so.0`pthread_cond_timedwait@@GLIBC_2.3.2 at pthread_cond_timedwait.S:225
frame #1: 0x00007f66a288e29b libc++.so.1`std::__1::condition_variable::__do_timed_wait(std::__1::unique_lock<std::__1::mutex>&, std::__1::chrono::time_point<std::__1::chrono::system_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>) + 91
frame #2: 0x00007f669df1f840 libyb_util.so`yb::(anonymous namespace)::LongOperationTrackerHelper::Execute() [inlined] std::__1::cv_status std::__1::condition_variable::wait_for<long long, std::__1::ratio<1l, 1000000000l>>(this=0x00007f669e0386f8, __lk=0x00007f6690015298, __d=<unavailable>) at __mutex_base:0:72
frame #3: 0x00007f669df1f7b9 libyb_util.so`yb::(anonymous namespace)::LongOperationTrackerHelper::Execute(this=0x00007f669e0386b0) at long_operation_tracker.cc:111:19
frame #4: 0x00007f669e00440c libyb_util.so`yb::Thread::SuperviseThread(void*) [inlined] std::__1::__function::__value_func<void ()>::operator(this=0x000055e9f911f5c0)[abi:v15007]() const at function.h:512:16
frame #5: 0x00007f669e0043f6 libyb_util.so`yb::Thread::SuperviseThread(void*) [inlined] std::__1::function<void ()>::operator(this=0x000055e9f911f5c0)() const at function.h:1197:12
frame #6: 0x00007f669e0043f6 libyb_util.so`yb::Thread::SuperviseThread(arg=0x000055e9f911f560) at thread.cc:842:3
frame #7: 0x00007f66a25a7694 libpthread.so.0`start_thread(arg=0x00007f669001d700) at pthread_create.c:333
frame #8: 0x00007f66a1ce441d libc.so.6`__clone at clone.S:109
Warning: Please confirm that this issue does not contain any sensitive information
- [X] I confirm this issue does not contain any sensitive information.
Observing this issue more frequently on master runs (2.19.1) for different test cases:
- test_intensive_multi_tenancy_workload for version 2.19.1.0-b363
- test_ysql_bank_operations_pessimistic_lock for version 2.19.1.0-b363
- test_create_alter_delete_tables_vm_restarts on versions 2.19.1.0-b379 and 2.19.1.0-b389
cc: @robertsami
Observed this in my packed columns toggle on/off case as well: test_sql_packed_columns_toggle_on_and_off
Observing this issue with other workloads as well.
Recent failure on 2.19.3.0-b53 in the test_intensive_multi_tenancy_workload test case, which runs the SqlIntensiveConsistencyDDL workload.
Now observing this core dump for the test_create_alter_delete_tables_vm_restarts test case on 2.14.14, 2.16.8, 2.18.4 and 2.20.0 runs. The test fails consistently with this core dump on all runs.
Possibly a duplicate of https://github.com/yugabyte/yugabyte-db/issues/18192
Also observed this issue in 2.14.16.0-b3, in test test_ysql_tablet_split_ps_restarts.