brpc icon indicating copy to clipboard operation
brpc copied to clipboard

struct/class that require cacheline alignment may not work when using new (before c++17).

Open ehds opened this issue 2 years ago • 6 comments

Is your feature request related to a problem? (你需要的功能是否与某个问题有关?) 当前 brpc 代码默认使用的 C++ 标准为 11,不支持 align new (since c++17 https://en.cppreference.com/w/cpp/memory/new/operator_new) .

如果某个 class 指定了 alignment 要求(例如 BAIDU_CACHELINE_ALIGNMENT),代码中使用 new 的方式来分配其对象时,地址有可能并不是严格按照其对齐方式的。

要让一个变量或结构体按cacheline对齐,可以include <butil/macros.h>后使用BAIDU_CACHELINE_ALIGNMENT宏,请自行grep brpc的代码了解用法。

https://github.com/apache/brpc/blob/master/docs/cn/atomic_instructions.md#cacheline

例如以下代码,使用当前的编译选项,就可能会出现错误。


class BAIDU_CACHELINE_ALIGNMENT A {
    int i;
};

int main() {
    for(size_t i =0 ;i<100;i++) {
        A* a = new A();
        // maybe fail.
        assert((reinterpret_cast<uintptr_t>(a) & (63)) == 0);
     }

Describe the solution you'd like (描述你期望的解决方法)

使用 new 来分配指定对齐要求的类时,内存地址应满足对齐的要求。 升级为 C++17 标准,或者开启 -faligned_new (gcc 7.4+,clang 7.1.0+ 都已经支持) .

目前来看是强行关闭了该警告信息(不知道具体原因). https://github.com/apache/brpc/blob/f3fe5fc4ff315aeed1f1e4b6c43c2ebf470d4381/CMakeLists.txt#L70-L72

Describe alternatives you've considered (描述你想到的折衷方案) 对于需要对齐的类,在使用 new 分配内存时使用 aligned_alloc/posix_memalign 等函数申请 alignment 内存,再使用 Placement new 指定内存空间进行初始化.

Additional context/screenshots (更多上下文/截图) os: 20.04.1-Ubuntu compiler: clang version 10.0.0-4ubuntu1 cpu: x86_64, cache_alignment : 64 byte

例如对于 class BAIDU_CACHELINE_ALIGNMENT/*note*/ Socket 类:

image

socket 的地址为 0x00005555567f1530, 并不是 64 byte 对齐,违反了要求,可能引起 false-sharing.

ehds avatar Oct 19 '23 07:10 ehds

没开-faligned-new还用new来动态分配__attribute__((aligned()))标记的类, 真是aligned了个寂寞

是的,这里还有另外一个隐含的问题。 如果 brpc 作为第三方库,编译时未加 -faligned-new, 但项目中使用了其头文件中的声明的某个类(cache line alignment),并且项目编译使用了 aligned_new, 而该类的析构函数(在 brpc 三方库中定义)调用 delete 释放对象空间时也不会调用

void operator delete  ( void* ptr, std::size_t sz,  std::align_val_t al ) noexcept;

也就是说 new (aligned_new) 和 delete 不匹配,delete 的 size 可能会小于实际 new 的 size (为类对齐要求的 padding)。 这里可能会产生一些问题, 例如一些项目中覆盖了 malloc/free,new/delete 定制化了 allocator,会导致申请和释放的大小不一致。 或者在开启了 ASAN 检查时,会检查出 new/delete 类型不匹配。

ehds avatar Oct 22 '23 10:10 ehds

=================================================================
==155368==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x621000030d00 in thread T22 (brpc_timer):
  object passed to delete has wrong type:
  size of the allocated type:   4096 bytes;
  size of the deallocated type: 4096 bytes.
  alignment of the allocated type:   default-aligned;
  alignment of the deallocated type: 64 bytes.
    #0 0x7ff6e41a35d5 in operator delete(void*, unsigned long, std::align_val_t) (/usr/lib64/libasan.so.5+0x1105d5)
    #1 0x752bc9 in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::_destroy_tls_blocks() external/com_github_apache_brpc/src/bvar/detail/agent_group.h:166
    #2 0x752bc9 in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::_destroy_tls_blocks() external/com_github_apache_brpc/src/bvar/detail/agent_group.h:161
    #3 0x150d34f in butil::detail::ThreadExitHelper::~ThreadExitHelper() external/com_github_apache_brpc/src/butil/thread_local.cpp:41
    #4 0x150d34f in delete_thread_exit_helper external/com_github_apache_brpc/src/butil/thread_local.cpp:77
    #5 0x150d34f in delete_thread_exit_helper external/com_github_apache_brpc/src/butil/thread_local.cpp:76
    #6 0x7ff6e3c68ca1 in __nptl_deallocate_tsd /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:155
    #7 0x7ff6e3c68eb2 in start_thread /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:314
    #8 0x7ff6e2690b0c in clone (/lib64/libc.so.6+0xfeb0c)

0x621000030d00 is located 0 bytes inside of 4096-byte region [0x621000030d00,0x621000031d00)
allocated by thread T22 (brpc_timer) here:
    #0 0x7ff6e41a1767 in operator new(unsigned long, std::nothrow_t const&) (/usr/lib64/libasan.so.5+0x10e767)
    #1 0x1227afe in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::get_or_create_tls_agent(int) external/com_github_apache_brpc/src/bvar/detail/agent_group.h:150
    #2 0x1227afe in bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::get_or_create_tls_agent() external/com_github_apache_brpc/src/bvar/detail/combiner.h:297
    #3 0x1227afe in bvar::Reducer<long, bvar::detail::AddTo<long>, bvar::detail::MinusFrom<long> >::operator<<(long) external/com_github_apache_brpc/src/bvar/reducer.h:194
    #4 0x1227afe in int bthread::TaskGroup::start_background<true>(unsigned long*, bthread_attr_t const*, void* (*)(void*), void*) external/com_github_apache_brpc/src/bthread/task_group.cpp:453
    #5 0x11fa28f in bthread::start_from_non_worker(unsigned long*, bthread_attr_t const*, void* (*)(void*), void*) external/com_github_apache_brpc/src/bthread/bthread.cpp:150
    #6 0x11fa28f in bthread_start_background external/com_github_apache_brpc/src/bthread/bthread.cpp:230
    #7 0xaac90d in braft::RepeatedTimerTask::on_timedout(void*) external/com_meituan_mraft/src/braft/repeated_timer_task.cpp:120
    #8 0x1200a04 in bthread::TimerThread::Task::run_and_delete() external/com_github_apache_brpc/src/bthread/timer_thread.cpp:282
    #9 0x1202fb5 in bthread::TimerThread::run() external/com_github_apache_brpc/src/bthread/timer_thread.cpp:395
    #10 0x12075ef in bthread::TimerThread::run_this(void*) external/com_github_apache_brpc/src/bthread/timer_thread.cpp:122
    #11 0x7ff6e3c68ea4 in start_thread /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:307

I met the same problem. I compile my project in C++17, while the compilation option for BRPC is C++11, so ASAN detects the dismatch between new and delete.

uvletter avatar Feb 26 '24 10:02 uvletter

I met the same problem. I compile my project in C++17, while the compilation option for BRPC is C++11, so ASAN detects the dismatch between new and delete.

https://github.com/apache/brpc/pull/2421 Try this fix.

ehds avatar Feb 26 '24 14:02 ehds

Thx, my current workaround is compiling brpc with std=c++17 too. Just wanna report the unworkable cacheline alignment and underlying risk in mixed compilation units.

获取 Outlook for iOShttps://aka.ms/o0ukef


发件人: Dongsheng He @.> 发送时间: Monday, February 26, 2024 10:56:16 PM 收件人: apache/brpc @.> 抄送: Tian @.>; Comment @.> 主题: Re: [apache/brpc] struct/class that require cacheline alignment may not work when using new (before c++17). (Issue #2416)

================================================================= ==155368==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x621000030d00 in thread T22 (brpc_timer): object passed to delete has wrong type: size of the allocated type: 4096 bytes; size of the deallocated type: 4096 bytes. alignment of the allocated type: default-aligned; alignment of the deallocated type: 64 bytes. #0 0x7ff6e41a35d5 in operator delete(void*, unsigned long, std::align_val_t) (/usr/lib64/libasan.so.5+0x1105d5) #1 0x752bc9 in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo >::Agent>::_destroy_tls_blocks() external/com_github_apache_brpc/src/bvar/detail/agent_group.h:166 #2 0x752bc9 in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo >::Agent>::_destroy_tls_blocks() external/com_github_apache_brpc/src/bvar/detail/agent_group.h:161 #3 0x150d34f in butil::detail::ThreadExitHelper::~ThreadExitHelper() external/com_github_apache_brpc/src/butil/thread_local.cpp:41 #4 0x150d34f in delete_thread_exit_helper external/com_github_apache_brpc/src/butil/thread_local.cpp:77 #5 0x150d34f in delete_thread_exit_helper external/com_github_apache_brpc/src/butil/thread_local.cpp:76 #6 0x7ff6e3c68ca1 in __nptl_deallocate_tsd /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:155 #7 0x7ff6e3c68eb2 in start_thread /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:314 #8 0x7ff6e2690b0c in clone (/lib64/libc.so.6+0xfeb0c)

0x621000030d00 is located 0 bytes inside of 4096-byte region [0x621000030d00,0x621000031d00) allocated by thread T22 (brpc_timer) here: #0 0x7ff6e41a1767 in operator new(unsigned long, std::nothrow_t const&) (/usr/lib64/libasan.so.5+0x10e767) #1 0x1227afe in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo >::Agent>::get_or_create_tls_agent(int) external/com_github_apache_brpc/src/bvar/detail/agent_group.h:150 #2 0x1227afe in bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo >::get_or_create_tls_agent() external/com_github_apache_brpc/src/bvar/detail/combiner.h:297 #3 0x1227afe in bvar::Reducer<long, bvar::detail::AddTo, bvar::detail::MinusFrom >::operator<<(long) external/com_github_apache_brpc/src/bvar/reducer.h:194 #4 0x1227afe in int bthread::TaskGroup::start_background(unsigned long*, bthread_attr_t const*, void* ()(void), void*) external/com_github_apache_brpc/src/bthread/task_group.cpp:453 #5 0x11fa28f in bthread::start_from_non_worker(unsigned long*, bthread_attr_t const*, void* ()(void), void*) external/com_github_apache_brpc/src/bthread/bthread.cpp:150 #6 0x11fa28f in bthread_start_background external/com_github_apache_brpc/src/bthread/bthread.cpp:230 #7 0xaac90d in braft::RepeatedTimerTask::on_timedout(void*) external/com_meituan_mraft/src/braft/repeated_timer_task.cpp:120 #8 0x1200a04 in bthread::TimerThread::Task::run_and_delete() external/com_github_apache_brpc/src/bthread/timer_thread.cpp:282 #9 0x1202fb5 in bthread::TimerThread::run() external/com_github_apache_brpc/src/bthread/timer_thread.cpp:395 #10 0x12075ef in bthread::TimerThread::run_this(void*) external/com_github_apache_brpc/src/bthread/timer_thread.cpp:122 #11 0x7ff6e3c68ea4 in start_thread /usr/src/debug/glibc-2.17-c758a686/nptl/pthread_create.c:307

I met the same problem. I compile my project in C++17, while the compilation option for BRPC is C++11, so ASAN detects the dismatch between new and delete.

#2421https://github.com/apache/brpc/pull/2421 Try this fix.

― Reply to this email directly, view it on GitHubhttps://github.com/apache/brpc/issues/2416#issuecomment-1964341023, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AEMHU7GMT5HWBZ4A7IDOXYTYVSPBBAVCNFSM6AAAAAA6GV4RTGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRUGM2DCMBSGM. You are receiving this because you commented.Message ID: @.***>

uvletter avatar Feb 26 '24 15:02 uvletter

请问针对这种c++不同版本不匹配的问题,c++官方哪里有个统一的列表说明吗?想看看是否还有别的坑。

yanglimingcn avatar Feb 29 '24 22:02 yanglimingcn