redpanda icon indicating copy to clipboard operation
redpanda copied to clipboard

CI Failure (key symptom) in `PartitionBalancerScaleTest.test_node_operations_at_scale`

Open vbotbuildovich opened this issue 1 year ago • 2 comments

https://buildkite.com/redpanda/vtools/builds/13710

Module: rptest.scale_tests.partition_balancer_scale_test
Class: PartitionBalancerScaleTest
Method: test_node_operations_at_scale
Arguments: {
    "type": "many_partitions"
}
test_id:    PartitionBalancerScaleTest.test_node_operations_at_scale
status:     FAIL
run time:   942.224 seconds

<BadLogLines nodes=ip-172-31-15-130(4) example="WARN  2024-05-14 13:09:23,777 [shard 3:main] seastar_memory - oversized allocation: 458752 bytes. This is non-fatal, but could lead to latency and/or fragmentation issues. Please report: at 0x98db003 0x9526f0d 0x95337c8 0x972d0f8 0x9735f62 0x972a8fe 0x96a3af6 0x96172c8 0x96707b0 0x9585fcf /opt/redpanda/lib/libc.so.6+0x91016 /opt/redpanda/lib/libc.so.6+0x1166cf">
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/mark/_mark.py", line 535, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 188, in wrapped
    redpanda.raise_on_bad_logs(
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1478, in raise_on_bad_logs
    lsearcher.search_logs(_searchable_nodes)
  File "/home/ubuntu/redpanda/tests/rptest/services/utils.py", line 197, in search_logs
    raise BadLogLines(bad_loglines)
rptest.services.utils.BadLogLines: <BadLogLines nodes=ip-172-31-15-130(4) example="WARN  2024-05-14 13:09:23,777 [shard 3:main] seastar_memory - oversized allocation: 458752 bytes. This is non-fatal, but could lead to latency and/or fragmentation issues. Please report: at 0x98db003 0x9526f0d 0x95337c8 0x972d0f8 0x9735f62 0x972a8fe 0x96a3af6 0x96172c8 0x96707b0 0x9585fcf /opt/redpanda/lib/libc.so.6+0x91016 /opt/redpanda/lib/libc.so.6+0x1166cf">

JIRA Link: CORE-2956

vbotbuildovich avatar May 14 '24 21:05 vbotbuildovich

Multiple oversized allocations in nightly build, all with same backtrace, decoded backtrace is:

[Backtrace #0]
void seastar::backtrace<seastar::current_backtrace_tasklocal()::$_0>(seastar::current_backtrace_tasklocal()::$_0&&) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/util/backtrace.hh:68
 (inlined by) seastar::current_backtrace_tasklocal() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/util/backtrace.cc:97
 (inlined by) seastar::current_tasktrace() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/util/backtrace.cc:148
 (inlined by) seastar::current_backtrace() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/util/backtrace.cc:181
seastar::memory::cpu_pages::warn_large_allocation(unsigned long) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/memory.cc:849
 (inlined by) seastar::memory::cpu_pages::check_large_allocation(unsigned long) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/memory.cc:912
 (inlined by) seastar::memory::cpu_pages::allocate_large(unsigned int, bool) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/memory.cc:919
 (inlined by) seastar::memory::allocate_large(unsigned long, bool) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/memory.cc:1542
 (inlined by) seastar::memory::allocate_slowpath(unsigned long) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/memory.cc:1688
seastar::memory::allocate(unsigned long) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/memory.cc:1707
 (inlined by) operator new(unsigned long) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/memory.cc:2429
void* std::__1::__libcpp_operator_new[abi:v160004]<unsigned long>(unsigned long) at /vectorized/llvm/bin/../include/c++/v1/new:266
 (inlined by) std::__1::__libcpp_allocate[abi:v160004](unsigned long, unsigned long) at /vectorized/llvm/bin/../include/c++/v1/new:292
 (inlined by) std::__1::allocator<seastar::internal::pending_io_request>::allocate[abi:v160004](unsigned long) at /vectorized/llvm/bin/../include/c++/v1/__memory/allocator.h:115
 (inlined by) seastar::circular_buffer<seastar::internal::pending_io_request, std::__1::allocator<seastar::internal::pending_io_request> >::expand(unsigned long) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/core/circular_buffer.hh:329
 (inlined by) seastar::circular_buffer<seastar::internal::pending_io_request, std::__1::allocator<seastar::internal::pending_io_request> >::expand() at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/core/circular_buffer.hh:323
 (inlined by) seastar::circular_buffer<seastar::internal::pending_io_request, std::__1::allocator<seastar::internal::pending_io_request> >::maybe_expand(unsigned long) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/core/circular_buffer.hh:359
 (inlined by) void seastar::circular_buffer<seastar::internal::pending_io_request, std::__1::allocator<seastar::internal::pending_io_request> >::emplace_back<seastar::internal::io_request, seastar::io_completion*&>(seastar::internal::io_request&&, seastar::io_completion*&) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/core/circular_buffer.hh:419
 (inlined by) seastar::internal::io_sink::submit(seastar::io_completion*, seastar::internal::io_request) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/io_queue.cc:416
 (inlined by) seastar::io_queue::submit_request(seastar::io_desc_read_write*, seastar::internal::io_request) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/io_queue.cc:1048
 (inlined by) seastar::queued_io_request::dispatch() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/io_queue.cc:308
 (inlined by) operator() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/io_queue.cc:1039
 (inlined by) decltype (((std::declval<seastar::io_queue::poll_io_queue()::$_0&>)())((std::declval<seastar::fair_queue_entry&>)())) std::__1::__invoke[abi:v160004]<seastar::io_queue::poll_io_queue()::$_0&, seastar::fair_queue_entry&>(seastar::io_queue::poll_io_queue()::$_0&, seastar::fair_queue_entry&) at /vectorized/llvm/bin/../include/c++/v1/__functional/invoke.h:394
 (inlined by) void std::__1::__invoke_void_return_wrapper<void, true>::__call<seastar::io_queue::poll_io_queue()::$_0&, seastar::fair_queue_entry&>(seastar::io_queue::poll_io_queue()::$_0&, seastar::fair_queue_entry&) at /vectorized/llvm/bin/../include/c++/v1/__functional/invoke.h:487
 (inlined by) std::__1::__function::__alloc_func<seastar::io_queue::poll_io_queue()::$_0, std::__1::allocator<seastar::io_queue::poll_io_queue()::$_0>, void (seastar::fair_queue_entry&)>::operator()[abi:v160004](seastar::fair_queue_entry&) at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:185
 (inlined by) std::__1::__function::__func<seastar::io_queue::poll_io_queue()::$_0, std::__1::allocator<seastar::io_queue::poll_io_queue()::$_0>, void (seastar::fair_queue_entry&)>::operator()(seastar::fair_queue_entry&) at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:356
std::__1::__function::__value_func<void (seastar::fair_queue_entry&)>::operator()[abi:v160004](seastar::fair_queue_entry&) const at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:510
 (inlined by) std::__1::function<void (seastar::fair_queue_entry&)>::operator()(seastar::fair_queue_entry&) const at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:1156
 (inlined by) seastar::fair_queue::dispatch_requests(std::__1::function<void (seastar::fair_queue_entry&)>) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/fair_queue.cc:383
seastar::io_queue::poll_io_queue() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/io_queue.cc:1038
seastar::reactor::flush_pending_aio() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1897
 (inlined by) seastar::reactor::io_queue_submission_pollfn::poll() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2963
seastar::reactor::poll_once() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3537
 (inlined by) operator() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3430
 (inlined by) seastar::reactor::do_run() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3458
operator() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:4696
 (inlined by) decltype (((std::declval<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&>)())()) std::__1::__invoke[abi:v160004]<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&>(seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&) at /vectorized/llvm/bin/../include/c++/v1/__functional/invoke.h:394
 (inlined by) void std::__1::__invoke_void_return_wrapper<void, true>::__call<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&>(seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&) at /vectorized/llvm/bin/../include/c++/v1/__functional/invoke.h:487
 (inlined by) std::__1::__function::__alloc_func<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2, std::__1::allocator<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2>, void ()>::operator()[abi:v160004]() at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:185
 (inlined by) std::__1::__function::__func<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2, std::__1::allocator<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2>, void ()>::operator()() at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:356
std::__1::__function::__value_func<void ()>::operator()[abi:v160004]() const at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:510
 (inlined by) std::__1::function<void ()>::operator()() const at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:1156
 (inlined by) seastar::posix_thread::start_routine(void*) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/posix.cc:90
opt/redpanda/lib/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=6e7b96dfb83f0bdcb6a410469b82f86415e5ada3, for GNU/Linux 3.2.0, stripped

pthread_condattr_setpshared at ??:?
__xmknodat at ??:?

WillemKauf avatar May 16 '24 15:05 WillemKauf

This is an internal seastar oversized allocation, in the io_sink::_pending_io object (which is a ss::circular_buffer<pending_io_request>).

WillemKauf avatar May 16 '24 15:05 WillemKauf

seastar::internal::pending_io_request is 56 bytes, so this means that the iosink queue of resizing to 458752 can hold 8192 of them, i.e., there are > 4096 ios in flight in the iosink (somewhere between 4096 and 8192 because I assume it resizes via doubling).

This probably indicates that there are no either no io-properties deployed for this test, or the io is small enough that this many ios are still allowed into the sink. Maybe this could be a chunked_fifo instead of a circular_buffer.

travisdowns avatar Jun 23 '24 04:06 travisdowns

@travisdowns did you see my comment on slack the other day about this: https://redpandadata.slack.com/archives/C05FCNFRG1E/p1718702628858319?thread_ts=1718686896.486769&cid=C05FCNFRG1E

StephanDollberg avatar Jun 24 '24 08:06 StephanDollberg

@StephanDollberg I didn't but nice to see our analyses line up. I'll take a shot at swapping in chunked_fifo.

travisdowns avatar Jun 25 '24 01:06 travisdowns

This swap is not totally trivial because:

  • _pending_ios has operator[] used on it, which chunked_fifo can't support efficiently (but the use can be moved, it is actually being used in a forward iteration pattern, so there is no need for random access)
  • _pending_ios erases large chunks of itself (often all of itself), a pattern which is not supported in chunked_fifo, and the equivalent of "erase front" could be easily added but is not efficient currently and the gap is very large when the contained type T has a trivial destructor (as it does here, AFAICT), in that case it's O(N) vs O(1) cost

So I think I'll check with upstream to see if they'd accept this change before I do it (I'd improve the implementation of erase/clear to erase chunks at a time).

travisdowns avatar Jul 18 '24 21:07 travisdowns

Floating solutions upstream: https://github.com/scylladb/seastar/issues/2355

travisdowns avatar Jul 18 '24 22:07 travisdowns

How about something like this (or similar):

diff --git a/include/seastar/core/internal/io_sink.hh b/include/seastar/core/internal/io_sink.hh
index 2d0b8c67..711bd1c1 100644
--- a/include/seastar/core/internal/io_sink.hh
+++ b/include/seastar/core/internal/io_sink.hh
@@ -21,6 +21,9 @@

 #pragma once

+#include <boost/container/deque.hpp>
+#include <boost/container/options.hpp>
+
 #include <seastar/core/circular_buffer.hh>
 #include <seastar/core/internal/io_request.hh>

@@ -49,7 +52,9 @@ class pending_io_request : private internal::io_request {
 };

 class io_sink {
-    circular_buffer<pending_io_request> _pending_io;
+    using deque_opts = boost::container::deque_options_t<boost::container::block_bytes<16384u>>;
+    boost::container::deque<pending_io_request, void, deque_opts> _pending_io;
+
 public:
     void submit(io_completion* desc, internal::io_request req) noexcept;

Deque options were introduced in boost 1.71 (2019) so this shouldn't be an issue I think.

StephanDollberg avatar Jul 19 '24 09:07 StephanDollberg

Yeah that's a good idea, let me do that.

travisdowns avatar Jul 19 '24 15:07 travisdowns

I ended up going with the chunked_fifo approach as operator[] on boost::deque has terrible performance (as discussed elsewhere).

The result is: https://github.com/scylladb/seastar/pull/2357

travisdowns avatar Jul 19 '24 20:07 travisdowns

Recent x86 failure: https://buildkite.com/redpanda/vtools/builds/15600

travisdowns avatar Jul 26 '24 15:07 travisdowns

Per the discussion on the seastar issue above, it is possible that io properties would mitigate this issue in some cases, since the scheduler may limit the per-tick movement of ops from the io_queue into the io_sink, which could reduce the maximum number of ops in the io_sink.

Currently we are not passing any io-properties to Redpanda in the test_suite_ec2 tests that are failing, because we launch Redpanda directly, rather than via rpk start, and rpk start is the thing that detects the cloud provider and passes along the appropriate io properties. So we are running in "unlimited" mode.

I'm going to see what happens when we run using rpk instead.

travisdowns avatar Jul 26 '24 16:07 travisdowns

FWIW I did 20 runs of

and did not reproduce the 458K allocation but did get 2 instance of this half-sized:

WARN  2024-07-26 19:30:44,649 [shard 3:main] seastar_memory - oversized allocation: 229376 bytes. This is non-fatal, but could lead to latency and/or fragmentation issues. Please report: at 0xa4f4533 0xa139ddd 0xa146698 0xa3403a8 0xa349212 0xa33dbae 0xa2b6856 0xa22a038 0xa283520 0xa198e3f /opt/redpanda/lib/libc.so.6+0x961b6 /opt/redpanda/lib/libc.so.6+0x11839b

This is the same structure, just one resize earlier. Easier to reproduce.

travisdowns avatar Jul 26 '24 21:07 travisdowns

Closing older-bot-filed CI issues as we transition to a more reliable system.

piyushredpanda avatar Sep 24 '24 03:09 piyushredpanda