Transactions stuck in tx_errc::coordinator_not_available state
Version & Environment
The issue is persistent: the cluster was first running 22.3 and was then upgraded to 23.2.1-rc5, but the issue was still present.
INFO 2023-07-06 10:48:37,050 [shard 0] main - application.cc:347 - Redpanda v23.2.1-rc5 - b6e63b8af38ade6c059439d8fb0eee619b5410f6
INFO 2023-07-06 10:48:37,050 [shard 0] main - application.cc:355 - kernel=5.15.0-75-generic, nodename=redpanda-1, machine=aarch64
What went wrong?
A Kafka client that starts a transaction is rejected:
WARN 2023-07-06 10:56:54,059 [shard 0] kafka - server.cc:1132 - failed to allocate pid, ec: tx_errc::coordinator_not_available
Redpanda was not in a recoverable state; a restart, for example, did not resolve the issue.
The only other visible logs are:
TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate
TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate
TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:966 - waiting for {ns: {kafka_internal}, topic: {tx}} to fill metadata cache, retries left: 0
TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate
TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate
TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate
TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate
TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
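Most of the lines above are repeated gate enter/leave pairs, so grouping the log by source location makes the less frequent lines (such as the `retries left: 0` message at tx_gateway_frontend.cc:966) easier to spot. The following is a minimal sketch, assuming the log layout shown above (`LEVEL date time [shard N] logger - file:line - message`); `summarize` is a hypothetical helper and not part of any Redpanda tooling.

```python
import re
from collections import Counter

# Assumed layout, inferred from the log lines in this report:
# LEVEL YYYY-MM-DD HH:MM:SS,mmm [shard N] logger - file.cc:LINE - message
LOG_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[shard (?P<shard>\d+)\]\s+"
    r"(?P<logger>\S+) - (?P<src>\S+:\d+) - (?P<msg>.*)$"
)


def summarize(lines):
    """Count log lines per (logger, source location).

    Lines that do not match the assumed layout (for example the
    'Throw exception at:' lines, which carry no file:line) are skipped.
    """
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m:
            counts[(m["logger"], m["src"])] += 1
    return counts


# A few lines copied from the excerpt above.
sample = [
    "TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate",
    "TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate",
    "TRACE 2023-07-06 10:56:23,557 [shard 0] tx - tx_gateway_frontend.cc:966 - waiting for {ns: {kafka_internal}, topic: {tx}} to fill metadata cache, retries left: 0",
    "TRACE 2023-07-06 10:56:23,882 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate",
    "TRACE 2023-07-06 10:56:53,905 [shard 0] exception - Throw exception at:",
]

counts = summarize(sample)
# List the rarest source locations first.
for (logger, src), n in sorted(counts.items(), key=lambda kv: kv[1]):
    print(f"{n:5d}  {logger}  {src}")
```

Run against the full log, this would show how often the metadata cache wait occurs relative to the gate messages.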
There are several exceptions, but it is not clear what they are about:
TRACE 2023-07-06 10:56:53,720 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate
TRACE 2023-07-06 10:56:53,720 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
TRACE 2023-07-06 10:56:53,720 [shard 0] tx - tx_gateway_frontend.cc:163 - entered tm_stm's gate
TRACE 2023-07-06 10:56:53,720 [shard 0] tx - tx_gateway_frontend.cc:165 - leaving tm_stm's gate
TRACE 2023-07-06 10:56:53,720 [shard 0] tx - tx_gateway_frontend.cc:966 - waiting for {ns: {kafka_internal}, topic: {tx}} to fill metadata cache, retries left: 0
TRACE 2023-07-06 10:56:53,905 [shard 0] exception - Throw exception at:
0x5e61067 0x5b59cef /opt/redpanda/lib/libc++abi.so.1+0x2f45b 0x21cefa3 0x279beef 0x5c25d8b 0x5ca425b 0x5ca573b 0x5c280cb 0x5c25fcf 0x5b4d50f 0x5b4c063 0x20e1913 0x5e98f13 /opt/redpanda/lib/libc.so.6+0x2b1c7 /opt/redpanda/lib/libc.so.6+0x2b29f 0x20dc2af
TRACE 2023-07-06 10:56:53,905 [shard 0] exception - Throw exception at:
0x5e61067 0x5b59cef /opt/redpanda/lib/libc++abi.so.1+0x2f45b 0x279c43f 0x21d7787 0x279bf9f 0x5c25d8b 0x5ca425b 0x5ca573b 0x5c280cb 0x5c25fcf 0x5b4d50f 0x5b4c063 0x20e1913 0x5e98f13 /opt/redpanda/lib/libc.so.6+0x2b1c7 /opt/redpanda/lib/libc.so.6+0x2b29f 0x20dc2af
TRACE 2023-07-06 10:56:53,905 [shard 0] exception - Throw exception at:
0x5e61067 0x5b59cef /opt/redpanda/lib/libc++abi.so.1+0x2f8a7 /opt/redpanda/lib/libc++.so.1+0x52cf7 0x5b69907 0x5383537 0x21c46ff 0x5c257d3 0x5c27e2b 0x5c25fcf 0x5b4d50f 0x5b4c063 0x20e1913 0x5e98f13 /opt/redpanda/lib/libc.so.6+0x2b1c7 /opt/redpanda/lib/libc.so.6+0x2b29f 0x20dc2af
--------
seastar::internal::coroutine_traits_base<void>::promise_type
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<auto seastar::internal::invoke_func_with_gate<storage::log_manager::start()::$_0>(seastar::gate&, storage::log_manager::start()::$_0&&)::'lambda'(), false>, seastar::futurize<storage::log_manager::start()::$_0>::type seastar::future<void>::then_wrapped
_nrvo<seastar::future<void>, seastar::future<void>::finally_body<auto seastar::internal::invoke_func_with_gate<storage::log_manager::start()::$_0>(seastar::gate&, storage::log_manager::start()::$_0&&)::'lambda'(), false>>(seastar::future<void>::finally_body<auto seastar::internal::invoke_func_with_gate<storage::log_manager::start()::$_0>(seastar::gate&, storag
e::log_manager::start()::$_0&&)::'lambda'(), false>&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<auto seastar::internal::invoke_func_with_gate<storage::log_manager::start()::$_0>(seastar::gate&, storage::log_manager::start()::$_0&&)::'lambda'(), false>&, seastar::future_state<seastar::internal::monostate>&
&), void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)&&)::
'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)&&), seastar::futurize<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)>::type seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void> seastar::future<void>:
:handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)&&
)>(seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambd
a'(seastar::abort_requested_exception const&)&&)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar:
:abort_requested_exception const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::abort_requested_exception const&)&&)&, seastar::future_state<seastar::internal::monostate>&&), void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)&&)::'lambda'
(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)&&), seastar::futurize<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)>::type seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void> seastar::future<void>::handle_exceptio
n_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)&&)>(seastar::future<void> sea
star::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception co
nst&)&&)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)&&)::'lambda'(
ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::gate_closed_exception const&)&&)&, seastar::future_state<seastar::internal::monostate>&&), void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)&&)::'lambda'(ssx::igno
re_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)&&), seastar::futurize<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)>::type seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_s
hutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)&&)>(seastar::future<void> seastar::future<void>::handle_exceptio
n_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)&&)&&)::'lambda'(seastar::internal::promise_b
ase_with_type<void>&&, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_semaphore const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lam
bda'(seastar::broken_semaphore const&)&&)&, seastar::future_state<seastar::internal::monostate>&&), void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)&&)::'lambda'(ssx::ignore_s
hutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)&&), seastar::futurize<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)>::type seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_
exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)&&)>(seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ig
nore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)&&)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&
&, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_promise const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_pro
mise const&)&&)&, seastar::future_state<seastar::internal::monostate>&&), void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)&&)::
'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)&&), seastar::futurize<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)>::type seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void> seastar::future<void>:
:handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)&&
)>(seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambd
a'(seastar::broken_condition_variable const&)&&)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::future<void> seastar::future<void>::handle_exception_type<ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)>(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar:
:broken_condition_variable const&)&&)::'lambda'(ssx::ignore_shutdown_exceptions(seastar::future<void>)::'lambda'(seastar::broken_condition_variable const&)&&)&, seastar::future_state<seastar::internal::monostate>&&), void>
TRACE 2023-07-06 10:56:53,905 [shard 0] storage-gc - disk_log_impl.cc:762 - [{redpanda/controller/0}] house keeping with configuration from manager: {compact:{max_collectible_offset:9223372036854775807, should_sanitize:{nullopt}}, gc:{eviction_time:{timestamp: 1688036213905}, max_bytes:18446744073709551615}}
TRACE 2023-07-06 10:56:53,905 [shard 0] storage-gc - disk_log_impl.cc:823 - [{redpanda/controller/0}] applying 'deletion' log cleanup policy with config: {eviction_time:{timestamp: 1688036213905}, max_bytes:18446744073709551615}
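The `Throw exception at:` backtraces above are raw addresses, so they would need to be symbolized against the exact Redpanda binary (build b6e63b8a on aarch64) before they say anything about the cause. As a preparatory step, the frames can be split into addresses inside the main binary and offsets inside shared libraries. This is only a sketch: `split_frames` is a hypothetical helper, and the choice of symbolizer (for example `addr2line`) is an assumption, not something confirmed in this report.

```python
import re

# A frame is either a bare address ("0x5e61067") or a
# shared-library path plus offset ("/path/lib.so+0x2f45b").
FRAME_RE = re.compile(r"^(?:(?P<lib>/\S+)\+)?(?P<addr>0x[0-9a-fA-F]+)$")


def split_frames(backtrace_line):
    """Return (main_binary_addresses, [(library, offset), ...])."""
    main, libs = [], []
    for token in backtrace_line.split():
        m = FRAME_RE.match(token)
        if not m:
            continue  # ignore anything that is not a frame
        if m["lib"]:
            libs.append((m["lib"], m["addr"]))
        else:
            main.append(m["addr"])
    return main, libs


# First frames of the first backtrace in this report.
line = "0x5e61067 0x5b59cef /opt/redpanda/lib/libc++abi.so.1+0x2f45b 0x21cefa3"
main_addrs, lib_frames = split_frames(line)
print(" ".join(main_addrs))
print(lib_frames)
```

The addresses in `main_addrs` could then be passed to a symbolizer together with the matching binary; the library frames need the corresponding shared objects instead.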
The raw /var/lib/redpanda/data directory can be provided if needed.
This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.
This issue was closed due to lack of activity. Feel free to reopen if it's still relevant.