yugabyte-db
[CDCSDK] CDC + CQL: Investigate Out-of-Memory (OOM) Issue and tserver crash, Check MemTracker for CDC Records and Overhead Memory
Jira Link: DB-9225
Description
The Slack conversation and stress links are in the Jira ticket.
The tserver hit an Out-of-Memory (OOM) condition and crashed. The development team suggested investigating the MemTracker for CDC (Change Data Capture) records. The MemTracker should also be examined more broadly for any tracked memory whose overhead could be contributing to the problem. This ticket tracks that investigation and the fixes needed to prevent a recurrence. Backtrace of the crashing thread:
* thread #1, name = 'yb-tserver', stop reason = signal SIGABRT
* frame #0: 0x00007f8e1ef2d0a7 libc.so.6`__GI_raise(sig=6) at raise.c:54
frame #1: 0x00007f8e1ef2e4aa libc.so.6`__GI_abort at abort.c:89
frame #2: 0x000056164dd026c8 yb-tserver`tcmalloc::tcmalloc_internal::Crash(tcmalloc::tcmalloc_internal::CrashMode, char const*, int, tcmalloc::tcmalloc_internal::LogItem, tcmalloc::tcmalloc_internal::LogItem, tcmalloc::tcmalloc_internal::LogItem, tcmalloc::tcmalloc_internal::LogItem, tcmalloc::tcmalloc_internal::LogItem, tcmalloc::tcmalloc_internal::LogItem) + 312
frame #3: 0x000056164dd0863b yb-tserver`tcmalloc::tcmalloc_internal::CppOomPolicy::handle_oom(unsigned long) + 123
frame #4: 0x000056164dcc7eaa yb-tserver`void* slow_alloc<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, std::nullptr_t>(tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, unsigned long, std::nullptr_t) + 794
frame #5: 0x000056164dcc799e yb-tserver`TCMallocInternalNew + 414
frame #6: 0x000056164bad7685 yb-tserver`std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::push_back(char) + 165
frame #7: 0x000056164bac1e9d yb-tserver`std::__1::basic_stringbuf<char, std::__1::char_traits<char>, std::__1::allocator<char>>::overflow(int) + 109
frame #8: 0x000056164badd8ef yb-tserver`std::__1::basic_ostream<char, std::__1::char_traits<char>>::put(char) + 111
frame #9: 0x000056164d409b6e yb-tserver`rapidjson::Writer<yb::UTF8StringStreamBuffer, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u>::WriteString(this=0x000034f8b5fc42b0, str="[599bbf91-482c-4a31-a1d1-c905ccc0efeb:51121:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw:seed, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRR"..., length=4096) at writer.h:0
frame #10: 0x000056164d40b1fe yb-tserver`yb::JsonWriter::Protobuf(google::protobuf::Message const&) [inlined] yb::JsonWriter::String(this=0x00007f8e11890910, str="[599bbf91-482c-4a31-a1d1-c905ccc0efeb:51121:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw:seed, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRRuuyRJFMQKvFMtLJFtJwJFvRINvvvIxyxuOSSvvzSKGNRLwGNuMKGuKxKGwSJOwwwJyzyvPTTwwATLHOSMxHOvNLHvLyLHxTKPxxxKzAzw, 599bbf91-482c-4a31-a1d1-c905ccc0efeb:val:51121:seed:NRR"...) at jsonwriter.cc:200:53
frame #11: 0x000056164d40b1f0 yb-tserver`yb::JsonWriter::Protobuf(google::protobuf::Message const&) [inlined] yb::JsonWriter::ProtobufField(this=0x00007f8e11890910, pb=0x000034f89de21840, field=0x000034f779d3bd50) at jsonwriter.cc:285:7
frame #12: 0x000056164d40b00c yb-tserver`yb::JsonWriter::Protobuf(this=0x00007f8e11890910, pb=0x000034f89de21840) at jsonwriter.cc:251:7
frame #13: 0x000056164d40ae60 yb-tserver`yb::JsonWriter::ProtobufRepeatedField(this=0x00007f8e11890910, pb=0x000034f89ddc8140, field=0x000034f779d37be8, index=141) at jsonwriter.cc:326:7
frame #14: 0x000056164d40b002 yb-tserver`yb::JsonWriter::Protobuf(this=0x00007f8e11890910, pb=0x000034f89ddc8140) at jsonwriter.cc:247:9
frame #15: 0x000056164d40b236 yb-tserver`yb::JsonWriter::Protobuf(google::protobuf::Message const&) [inlined] yb::JsonWriter::ProtobufField(this=0x00007f8e11890910, pb=0x000034f77b2854a0, field=0x000034f77e755978) at jsonwriter.cc:288:7
frame #16: 0x000056164d40b00c yb-tserver`yb::JsonWriter::Protobuf(this=0x00007f8e11890910, pb=0x000034f77b2854a0) at jsonwriter.cc:251:7
frame #17: 0x000056164d40ae60 yb-tserver`yb::JsonWriter::ProtobufRepeatedField(this=0x00007f8e11890910, pb=0x000034f8b5fc5e60, field=0x000034f77d8eaaf8, index=11) at jsonwriter.cc:326:7
frame #18: 0x000056164d40b002 yb-tserver`yb::JsonWriter::Protobuf(this=0x00007f8e11890910, pb=0x000034f8b5fc5e60) at jsonwriter.cc:247:9
frame #19: 0x000056164d40ae60 yb-tserver`yb::JsonWriter::ProtobufRepeatedField(this=0x00007f8e11890910, pb=0x00007f8e11890860, field=0x000034f779d37a20, index=43) at jsonwriter.cc:326:7
frame #20: 0x000056164d40b002 yb-tserver`yb::JsonWriter::Protobuf(this=0x00007f8e11890910, pb=0x00007f8e11890860) at jsonwriter.cc:247:9
frame #21: 0x000056164cd5117b yb-tserver`yb::(anonymous namespace)::RpczPathHandler(messenger=<unavailable>, req=<unavailable>, resp=0x00007f8e11890960) at rpcz-path-handler.cc:77:10
frame #22: 0x000056164cd724c8 yb-tserver`yb::Webserver::Impl::BeginRequestCallbackStatic(sq_connection*) [inlined] std::__1::__function::__value_func<void (yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*)>::operator(this=0x000034f77ed78080, __args=0x00007f8e11892a88, __args=0x00007f8e11892970)[abi:v160006](yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*&&) const at function.h:510:16
frame #23: 0x000056164cd724ae yb-tserver`yb::Webserver::Impl::BeginRequestCallbackStatic(sq_connection*) [inlined] std::__1::function<void (yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*)>::operator(this=0x000034f77ed78080, __arg=0x00007f8e11892a88, __arg=0x00007f8e11890960)(yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*) const at function.h:1156:12
frame #24: 0x000056164cd724ae yb-tserver`yb::Webserver::Impl::BeginRequestCallbackStatic(sq_connection*) at webserver.cc:625:5
frame #25: 0x000056164cd72198 yb-tserver`yb::Webserver::Impl::BeginRequestCallbackStatic(sq_connection*) [inlined] yb::Webserver::Impl::BeginRequestCallback(this=0x000034f77efd5400, connection=0x000034f778b61000, request_info=0x000034f778b61000) at webserver.cc:557:10
frame #26: 0x000056164cd7173b yb-tserver`yb::Webserver::Impl::BeginRequestCallbackStatic(connection=0x000034f778b61000) at webserver.cc:532:20
frame #27: 0x000056164cd78897 yb-tserver`process_new_connection + 6119
frame #28: 0x000056164cd76f4a yb-tserver`worker_thread + 906
frame #29: 0x00007f8e1ece3694 libpthread.so.0`start_thread(arg=0x00007f8e118a2700) at pthread_create.c:333
frame #30: 0x00007f8e1efe041d libc.so.6`__clone at clone.S:109
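Read bottom-up, the backtrace suggests that the failing allocation happened on a web server worker thread: `RpczPathHandler` was serializing protobuf messages to JSON through `yb::JsonWriter` when tcmalloc's `handle_oom` aborted the process. The following is a minimal Python sketch, not part of the original report, for pulling those landmark frames out of an lldb backtrace. The helper name `first_frames` and the marker list are illustrative assumptions, and the sample is a subset of the frames above, so the indices it reports are for the subset only.

```python
import re

# Matches lldb frame lines of the form:
#   frame #N: 0xADDRESS module`symbol text...
FRAME_RE = re.compile(r"frame #(\d+): 0x[0-9a-fA-F]+ ([^`]+)`(.*)")


def first_frames(backtrace, markers):
    """Map each marker substring to the first frame index whose symbol text contains it.

    `first_frames` is a hypothetical helper written for this illustration.
    """
    hits = {}
    for line in backtrace.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            continue  # Skip thread headers and any non-frame lines.
        index = int(match.group(1))
        symbol_text = match.group(3)
        for marker in markers:
            if marker in symbol_text and marker not in hits:
                hits[marker] = index
    return hits


# A few frames copied from the backtrace in this report (not the full trace).
sample = """\
* frame #0: 0x00007f8e1ef2d0a7 libc.so.6`__GI_raise(sig=6) at raise.c:54
frame #3: 0x000056164dd0863b yb-tserver`tcmalloc::tcmalloc_internal::CppOomPolicy::handle_oom(unsigned long) + 123
frame #12: 0x000056164d40b00c yb-tserver`yb::JsonWriter::Protobuf(this=0x00007f8e11890910, pb=0x000034f89de21840) at jsonwriter.cc:251:7
frame #21: 0x000056164cd5117b yb-tserver`yb::(anonymous namespace)::RpczPathHandler(messenger=<unavailable>, req=<unavailable>, resp=0x00007f8e11890960) at rpcz-path-handler.cc:77:10
frame #24: 0x000056164cd724ae yb-tserver`yb::Webserver::Impl::BeginRequestCallbackStatic(sq_connection*) at webserver.cc:625:5
"""

landmarks = first_frames(sample, ["handle_oom", "JsonWriter", "RpczPathHandler", "Webserver"])
for name, index in landmarks.items():
    print(f"frame #{index}: first frame mentioning {name}")
```

This only locates where the allocation failed. It does not show which component held most of the memory at that moment, which is what the MemTracker investigation is meant to answer.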
Source connector version
1.9.5.y.34-SNAPSHOT
Connector configuration
add connector connector_name='ybconnector_cdc_854a31_test_cdc_fde45b_test_cdc_284a34' stream_id='cc16250f4ef347fa95d5c9739694821d' db_name='cdc_854a31' connector_host='172.151.31.22' table_list=['test_cdc_fde45b', 'test_cdc_284a34']
{'name': 'ybconnector_cdc_854a31_test_cdc_fde45b_test_cdc_284a34', 'config': {'connector.class': 'io.debezium.connector.yugabytedb.YugabyteDBConnector', 'database.hostname': '172.151.23.98:5433,172.151.19.84:5433,172.151.24.15:5433', 'database.master.addresses': '172.151.23.98:7100,172.151.19.84:7100,172.151.24.15:7100', 'database.port': 9042, 'database.masterhost': '172.151.23.98', 'database.masterport': '7100', 'database.user': 'cassandra', 'database.password': 'cassandra', 'database.dbname': 'cdc_854a31', 'database.server.name': 'db_cdc', 'database.streamid': 'cc16250f4ef347fa95d5c9739694821d', 'snapshot.mode': 'never', 'admin.operation.timeout.ms': 600000, 'socket.read.timeout.ms': 300000, 'max.connector.retries': '10', 'operation.timeout.ms': 600000, 'topic.creation.default.compression.type':
YugabyteDB version
2.20.2.0-b10
Issue Type
kind/bug
Warning: Please confirm that this issue does not contain any sensitive information
- [X] I confirm this issue does not contain any sensitive information.