TSeer
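TSeerServer is unstable and core dumps
During our trial we still see TSeerServer crash with a core dump from time to time. We have not yet found conditions that reproduce it reliably and will keep trying. The core information is posted below in the hope of getting some useful input and eventually resolving the problem:
Core file 1: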
#0 0x00007f95ce5ccec5 in _IO_vfscanf_internal () from /lib64/libc.so.6
#1 0x00007f95ce5e1685 in vsscanf () from /lib64/libc.so.6
#2 0x00007f95ce5db6e8 in sscanf () from /lib64/libc.so.6
#3 0x00007f95ce615152 in __tzset_parse_tz () from /lib64/libc.so.6
#4 0x00007f95ce61632e in __tzfile_compute () from /lib64/libc.so.6
#5 0x00007f95ce615cb7 in __tz_convert () from /lib64/libc.so.6
#6 0x000000000046eba1 in tars::TC_Logger<tars::RollWriteT, tars::TC_RollBySize>::stream(int) ()
at /home/tcheng/tools/TSeer/thirdparty/tars/include/util/tc_logger.h:727
#7 0x000000000046ed3e in tars::TC_Logger<tars::RollWriteT, tars::TC_RollBySize>::debug() ()
at /home/tcheng/tools/TSeer/thirdparty/tars/include/util/tc_logger.h:689
#8 0x00000000004957ec in RequestEtcdCallback::responseClient(int, long, std::vector<Tseer::RouterData, std::allocator<Tseer::RouterData> > const&) () at /home/tcheng/tools/TSeer/TseerServer/src/RequestEtcdCallback.cpp:1299
#9 0x00000000004a1b8c in RequestEtcdCallback::onResponse(bool, tars::TC_HttpResponse&) ()
at /home/tcheng/tools/TSeer/TseerServer/src/RequestEtcdCallback.cpp:168
#10 0x0000000000609a81 in tars::TC_HttpAsync::AsyncRequest::doReceive() ()
at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_http_async.cpp:264
#11 0x0000000000609be1 in tars::TC_HttpAsync::process(tars::TC_AutoPtr<tars::TC_HttpAsync::AsyncRequest>&, int) ()
at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_http_async.cpp:462
#12 0x0000000000609d2d in tars::TC_HttpAsync::run() () at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_http_async.cpp:505
#13 0x0000000000610acf in tars::TC_ThreadPool::ThreadWorker::run() ()
at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_thread_pool.cpp:60
#14 0x00000000005f7e7a in tars::TC_Thread::threadEntry(tars::TC_Thread*) ()
at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_thread.cpp:93
#15 0x00007f95cf0b6aa1 in start_thread () from /lib64/libpthread.so.0
#16 0x00007f95ce660aad in clone () from /lib64/libc.so.6
Core file 2:
#0 std::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
at /usr/local/include/c++/4.8.3/bits/basic_string.h:716
#1 0x00000000004a314a in EtcdReqStr(EtcdReqestInfo const&) () at /home/tcheng/tools/TSeer/TseerServer/src/RequestEtcdCallback.h:284
#2 0x00000000004957af in RequestEtcdCallback::responseClient(int, long, std::vector<Tseer::RouterData, std::allocator<Tseer::RouterData> > const&) () at /home/tcheng/tools/TSeer/TseerServer/src/RequestEtcdCallback.cpp:1299
#3 0x0000000000499291 in RequestEtcdCallback::doGetSeerAgentResponse(rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator> const&) ()
at /home/tcheng/tools/TSeer/TseerServer/src/RequestEtcdCallback.cpp:316
#4 0x00000000004a207b in RequestEtcdCallback::onResponse(bool, tars::TC_HttpResponse&) ()
at /home/tcheng/tools/TSeer/TseerServer/src/RequestEtcdCallback.cpp:120
#5 0x0000000000609a81 in tars::TC_HttpAsync::AsyncRequest::doReceive() ()
at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_http_async.cpp:264
#6 0x0000000000609be1 in tars::TC_HttpAsync::process(tars::TC_AutoPtr<tars::TC_HttpAsync::AsyncRequest>&, int) ()
at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_http_async.cpp:462
#7 0x0000000000609d2d in tars::TC_HttpAsync::run() () at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_http_async.cpp:505
#8 0x0000000000610acf in tars::TC_ThreadPool::ThreadWorker::run() ()
at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_thread_pool.cpp:60
#9 0x00000000005f7e7a in tars::TC_Thread::threadEntry(tars::TC_Thread*) ()
at /home/tcheng/tools/TSeer/build/Tars/cpp/util/src/tc_thread.cpp:93
#10 0x00007f0febe71aa1 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f0feb41baad in clone () from /lib64/libc.so.6
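Both backtraces enter through tars::TC_HttpAsync::AsyncRequest::doReceive(), and the paths in between differ slightly. It looks like something goes wrong in string handling, but a quick debugging session did not reveal the cause.
/home/tcheng/tools/TSeer/TseerServer/src/RequestEtcdCallback.cpp:1299 is just a log output statement: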
ETCDPROC_LOG << ETCDFILE_FUN << "|response,ret= " << ret << "|retryTime= " << retryTime << endl;
Here ETCDFILE_FUN is a macro:
#define ETCDFILE_FUN FILE_FUN <<EtcdReqStr(_etcdReqInfo)<<"|"
And the EtcdReqStr function is nothing more than string concatenation:
273 inline string EtcdReqStr(const EtcdReqestInfo& etcdReqInfo)
274 {
275 string client;
276 if (etcdReqInfo.current)
277 {
278 client = etcdReqInfo.current->getIp();
279 }
280 else
281 {
282 client = "NULL";
283 }
284 return "client=" + client + "|" + \
285 ActionStr(etcdReqInfo.etcdAction) + "|" +\
286 etcdReqInfo.moduletype + "." + \
287 etcdReqInfo.application + "." + \
288 etcdReqInfo.service_name + "|" + \
289 etcdReqInfo.node_name + "_" + \
290 etcdReqInfo.container_name + "|etcdhost=" + \
291 etcdReqInfo.etcdHost + ":" + \
292 TC_Common::tostr(etcdReqInfo.etcdPort) + "|" + \
293 MSTIMEINSTR(etcdReqInfo.startTime);
294 }
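Here etcdReqInfo.current->getIp(); also just returns the string _ip, whose initial value is likewise "NULL", so nothing in the whole path looks wrong. For now I have rewritten this function and added some logging, hoping the next core will turn up something new.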
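As an illustration of what such a rewrite could look like (this is only a sketch, not the change actually made): the prefix is built one field per source line, so that the line number in a future backtrace points at the field being read. `ReqInfoSketch` and `etcdReqStrSketch` are hypothetical names, the field types are assumptions, and only a subset of the original fields is shown.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for EtcdReqestInfo. Field names follow the original
// function, but the types are assumptions made for this sketch.
struct ReqInfoSketch {
    const std::string* clientIp;  // stands in for current->getIp(); may be null
    std::string moduletype;
    std::string application;
    std::string service_name;
    std::string etcdHost;
    int etcdPort;
};

// Builds the log prefix one field at a time. Each append sits on its own
// source line, so a backtrace line number identifies the field involved.
inline std::string etcdReqStrSketch(const ReqInfoSketch& info) {
    std::ostringstream os;
    os << "client=";
    if (info.clientIp) {
        os << *info.clientIp;
    } else {
        os << "NULL";
    }
    os << '|' << info.moduletype;
    os << '.' << info.application;
    os << '.' << info.service_name;
    os << "|etcdhost=" << info.etcdHost;
    os << ':' << info.etcdPort;
    return os.str();
}
```

With a null `clientIp` and the fields "m", "app", "svc", "127.0.0.1", 2379, the function returns `client=NULL|m.app.svc|etcdhost=127.0.0.1:2379`.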
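Coming back to this issue: I analyzed the most recent cores. The code has been modified since and the location varies, but every one of them ends up crashing inside
ETCDPROC_LOG << ETCDFILE_FUN << "|response,ret= " << ret << "|retryTime= " << retryTime << endl;
Only the exact crash point within this statement differs each time. Since everything else in it is string concatenation, the logger is the more likely culprit. In one of the cores the crash point was in the LoggerStream destructor. The trigger conditions are still unclear, but the direction of investigation feels right: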
#0 0x00007fe20422aa1c in free () from /lib64/libc.so.6
#1 0x000000000063c7eb in std::ios_base::~ios_base() () at ../../../.././libstdc++-v3/src/c++98/ios.cc:93
#2 0x00000000004b6f56 in tars::LoggerStream::~LoggerStream() () at /mnt/homework/gcc/gcc/include/c++/4.8.2/bits/basic_ios.h:276
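Since no reliable trigger has been found and the first two backtraces run on thread-pool worker threads, one way to look for a reproduction is to hammer the suspect log statement from several threads at once. That a thread interaction is involved is only an assumption here, not something the cores confirm. The sketch below is a generic harness: `hammerLogger` is a hypothetical name, and `logOnce` is a placeholder for whatever wraps the real logging call.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Calls `logOnce` from `threads` threads, `iterations` times each, and
// returns the total number of calls made. `logOnce` is a placeholder for
// the logging statement under suspicion.
inline long hammerLogger(const std::function<void()>& logOnce,
                         int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    std::atomic<long> calls(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&]() {
            for (int i = 0; i < iterations; ++i) {
                logOnce();  // the statement being stress-tested
                ++calls;
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // wait for every worker before reporting the count
    }
    return calls.load();
}
```

A call such as `hammerLogger([&]() { /* suspect log statement */ }, 8, 100000)` could be run in a test build, ideally with a memory or thread sanitizer enabled if the toolchain supports one.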