googletest icon indicating copy to clipboard operation
googletest copied to clipboard

fix multi construct of g_gmock_implicit_sequence

Open CastleOnTheHill opened this issue 1 year ago • 1 comments

When libgmock.a is linked to a multitude of shared libraries, g_gmock_implicit_sequence will be constructed multiple times. For instance:

libgmock.a is linked to libA.so
libgmock.a is linked to libB.so
libA.so and libB.so are linked to BinaryC

When BinaryC executes, g_gmock_implicit_sequence is constructed and destroyed twice, and the private member const pthread_key_t key_ in ThreadLocal will be called in pthread_key_delete twice, resulting in the failure of GTEST_CHECK_POSIX_SUCCESS_(pthread_key_delete(key_)).

Furthermore, ThreadLocal<Sequence*> g_gmock_implicit_sequence is not trivially destructible, which violates the google code style.

CastleOnTheHill avatar May 13 '24 15:05 CastleOnTheHill

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

google-cla[bot] avatar May 13 '24 15:05 google-cla[bot]

Hi! I'm a little bit confused, how does moving the global into a function avoid the problem of multiple constructions? Wouldn't each shared library still get its own copy of that global? Perhaps the constructions would be avoided if the function isn't called from both libraries, but if/when it is, you should still get two constructions, no?

It seems to me that what you may really want is for libgmock to be shared rather than static?

higher-performance avatar May 21 '24 20:05 higher-performance

FWIW I am seeing this same issue when compiling against the googletest source, so I don't think the shared / static conversation has anything to do with it. Having a hard time reproducing an MRE but hope to be able to craft one soon.

Here's some backtraces from gdb to illustrate - the first and the last breakpoints are from an object at the same memory address, trying to delete the same pthread key twice and causing UB

Breakpoint 1, testing::internal::ThreadLocal<testing::Sequence*>::~ThreadLocal (this=0x555555790900 <testing::internal::g_gmock_implicit_sequence>, __in_chrg=<optimized out>) at ../subprojects/googletest-1.14.0/googletest/include/gtest/internal/gtest-port.h:1764
1764	  ~ThreadLocal() {
(gdb) bt
#0  testing::internal::ThreadLocal<testing::Sequence*>::~ThreadLocal (this=0x555555790900 <testing::internal::g_gmock_implicit_sequence>, 
    __in_chrg=<optimized out>) at ../subprojects/googletest-1.14.0/googletest/include/gtest/internal/gtest-port.h:1764
#1  0x00007ffff7045495 in __run_exit_handlers (status=0, listp=0x7ffff721a838 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, 
    run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:113
#2  0x00007ffff7045610 in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:143
#3  0x00007ffff7029d97 in __libc_start_call_main (main=main@entry=0x55555570216f <main(int, char**)>, argc=argc@entry=1, argv=argv@entry=0x7fffffffda28)
    at ../sysdeps/nptl/libc_start_call_main.h:74
#4  0x00007ffff7029e40 in __libc_start_main_impl (main=0x55555570216f <main(int, char**)>, argc=1, argv=0x7fffffffda28, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffda18) at ../csu/libc-start.c:392
#5  0x0000555555643725 in _start ()
(gdb) c
Continuing.

Breakpoint 1, testing::internal::ThreadLocal<std::vector<testing::internal::TraceInfo, std::allocator<testing::internal::TraceInfo> > >::~ThreadLocal (this=0x5555557a3d18, __in_chrg=<optimized out>) at ../subprojects/googletest-1.14.0/googletest/include/gtest/internal/gtest-port.h:1764
1764	  ~ThreadLocal() {
(gdb) bt
#0  testing::internal::ThreadLocal<std::vector<testing::internal::TraceInfo, std::allocator<testing::internal::TraceInfo> > >::~ThreadLocal (
    this=0x5555557a3d18, __in_chrg=<optimized out>) at ../subprojects/googletest-1.14.0/googletest/include/gtest/internal/gtest-port.h:1764
#1  0x00005555556cf74b in testing::internal::UnitTestImpl::~UnitTestImpl (this=0x5555557a3ad0, __in_chrg=<optimized out>)
    at ../subprojects/googletest-1.14.0/googletest/src/gtest.cc:5593
#2  0x00005555556cf870 in testing::internal::UnitTestImpl::~UnitTestImpl (this=0x5555557a3ad0, __in_chrg=<optimized out>)
    at ../subprojects/googletest-1.14.0/googletest/src/gtest.cc:5593
#3  0x00005555556cf12e in testing::UnitTest::~UnitTest (this=0x555555790860 <testing::UnitTest::GetInstance()::instance>, __in_chrg=<optimized out>)
    at ../subprojects/googletest-1.14.0/googletest/src/gtest.cc:5538
#4  0x00007ffff7045495 in __run_exit_handlers (status=0, listp=0x7ffff721a838 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, 
    run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:113
#5  0x00007ffff7045610 in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:143
#6  0x00007ffff7029d97 in __libc_start_call_main (main=main@entry=0x55555570216f <main(int, char**)>, argc=argc@entry=1, argv=argv@entry=0x7fffffffda28)
    at ../sysdeps/nptl/libc_start_call_main.h:74
#7  0x00007ffff7029e40 in __libc_start_main_impl (main=0x55555570216f <main(int, char**)>, argc=1, argv=0x7fffffffda28, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffda18) at ../csu/libc-start.c:392
#8  0x0000555555643725 in _start ()
(gdb) c
Continuing.

Breakpoint 1, testing::internal::ThreadLocal<testing::TestPartResultReporterInterface*>::~ThreadLocal (this=0x5555557a3b60, __in_chrg=<optimized out>) at ../subprojects/googletest-1.14.0/googletest/include/gtest/internal/gtest-port.h:1764
1764	  ~ThreadLocal() {
(gdb) bt
#0  testing::internal::ThreadLocal<testing::TestPartResultReporterInterface*>::~ThreadLocal (this=0x5555557a3b60, __in_chrg=<optimized out>)
    at ../subprojects/googletest-1.14.0/googletest/include/gtest/internal/gtest-port.h:1764
#1  0x00005555556cf811 in testing::internal::UnitTestImpl::~UnitTestImpl (this=0x5555557a3ad0, __in_chrg=<optimized out>)
    at ../subprojects/googletest-1.14.0/googletest/src/gtest.cc:5593
#2  0x00005555556cf870 in testing::internal::UnitTestImpl::~UnitTestImpl (this=0x5555557a3ad0, __in_chrg=<optimized out>)
    at ../subprojects/googletest-1.14.0/googletest/src/gtest.cc:5593
#3  0x00005555556cf12e in testing::UnitTest::~UnitTest (this=0x555555790860 <testing::UnitTest::GetInstance()::instance>, __in_chrg=<optimized out>)
    at ../subprojects/googletest-1.14.0/googletest/src/gtest.cc:5538
#4  0x00007ffff7045495 in __run_exit_handlers (status=0, listp=0x7ffff721a838 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, 
    run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:113
#5  0x00007ffff7045610 in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:143
#6  0x00007ffff7029d97 in __libc_start_call_main (main=main@entry=0x55555570216f <main(int, char**)>, argc=argc@entry=1, argv=argv@entry=0x7fffffffda28)
    at ../sysdeps/nptl/libc_start_call_main.h:74
#7  0x00007ffff7029e40 in __libc_start_main_impl (main=0x55555570216f <main(int, char**)>, argc=1, argv=0x7fffffffda28, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffda18) at ../csu/libc-start.c:392
#8  0x0000555555643725 in _start ()
(gdb) c
Continuing.

Breakpoint 1, testing::internal::ThreadLocal<testing::Sequence*>::~ThreadLocal (this=0x555555790900 <testing::internal::g_gmock_implicit_sequence>, __in_chrg=<optimized out>) at ../subprojects/googletest-1.14.0/googletest/include/gtest/internal/gtest-port.h:1764
1764	  ~ThreadLocal() {
(gdb) bt
#0  testing::internal::ThreadLocal<testing::Sequence*>::~ThreadLocal (this=0x555555790900 <testing::internal::g_gmock_implicit_sequence>, 
    __in_chrg=<optimized out>) at ../subprojects/googletest-1.14.0/googletest/include/gtest/internal/gtest-port.h:1764
#1  0x00007ffff7045a56 in __cxa_finalize (d=0x7ffff7de6350) at ./stdlib/cxa_finalize.c:83
#2  0x00007ffff7bb9e77 in __do_global_dtors_aux () from /home/willayd/clones/arrow-adbc/c/builddir/driver/sqlite/../../validation/libadbc_validation.so
#3  0x00007fffffffd8a0 in ?? ()
#4  0x00007ffff7fc924e in _dl_fini () at ./elf/dl-fini.c:142

WillAyd avatar Jun 04 '24 16:06 WillAyd

FWIW I am seeing this same issue when compiling against the googletest source, so I don't think the shared / static conversation has anything to do with it.

Not sure if I understand what you're doing correctly, but compiling each of 2 libraries against the source of the same 3rd is equivalent to static linking as far as this problem is concerned, and hence (as expected in my last comment) you'd expect this to come up there too. But yeah, an MRE would be helpful whenever you have one.

higher-performance avatar Jun 04 '24 17:06 higher-performance

Ah OK - thanks for that info.

I am using the Meson wrap for googletest which just provides the source file to other compilation targets. I'll see if I can work around that and get a shared library compiled instead, or try to get this down to something more debuggable. Thanks for the quick reply

WillAyd avatar Jun 04 '24 17:06 WillAyd

To be clear though - I don't have the same linkage situation as the OP, just am getting the same error. My linker command looks something like:

c++ -o driver/sqlite/adbc-driver-sqlite-test
subprojects_googletest-1.14.0_googletest_src_gtest-all.cc.o
subprojects_googletest-1.14.0_googlemock_src_gmock-all.cc.o
subprojects_googletest-1.14.0_googlemock_src_gmock_main.cc.o
...

WillAyd avatar Jun 04 '24 20:06 WillAyd

While not quite an MRE you can reasonably see this if you work with this project:

https://github.com/WillAyd/arrow-adbc/tree/aeccd962d530a61f31909c083a05412110e77f2b

With the following compilation steps:

cd c
meson setup builddir -Dtests=true -Dsqlite=true && cd builddir
meson compile

Then either meson test or ./driver/sqlite/adbc-driver-sqlite-test for an even more minimal interaction

WillAyd avatar Jun 04 '24 21:06 WillAyd

Thank you, but I suspect we won't have the bandwidth to to track down such an issue in a massive project like this. I'd really need something minimal that I can understand by eyeballing.

higher-performance avatar Jun 04 '24 21:06 higher-performance

Totally understood - no expectations for you to go through those steps. Just providing as reference for now

WillAyd avatar Jun 04 '24 21:06 WillAyd

I'm going to close this PR, since I think we need to see something indicating that (a) an issue actually exists inside GoogleTest (as opposed to the issue being that you should be using shared libraries instead of static), and that (b) this PR correctly fixes it. So far neither seems to be the case to me, but if anything comes up indicating otherwise, please do feel free to follow up.

higher-performance avatar Jun 17 '24 12:06 higher-performance

FWIW the issue I was posting about before become clearer when compiling with ASAN, which ultimately flagged a violation of ODR based off of how I was linking everything together. Sharing in case that is helpful to others

Thanks @higher-performance for your guidance and keeping this on the right track

WillAyd avatar Jul 01 '24 15:07 WillAyd

As @higher-performance @WillAyd said, compiling GoogleTest into a dynamic library can circumvent this problem. However, GoogleTest is compiled into a static library by DEFAULT. Compiling it into a dynamic library increases the difficulty of use. Why can't we solve this problem under the default behavior?

At present, I haven't figured out the specific reason yet, but this phenomenon is easy to reproduce and is bound to occur. Here is the very easy example I gave (only 4 files). As I mentioned before, this also violates the Google programming specification at the same time. Why not change it?

arvin@arvins:/mnt/d/code/googletest_issue/build$ ./testbin
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from T1
[ RUN      ] T1.T2
[       OK ] T1.T2 (0 ms)
[----------] 1 test from T1 (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[  PASSED  ] 1 test.

[ FATAL ] /mnt/d/code/googletest_issue/googletest-main/googletest/include/gtest/internal/gtest-port.h:1810:: pthread_key_delete(key_)failed with error 22
Aborted
├── CMakeLists.txt
├── build
│   ├── liblib1.so
│   ├── liblib2.so
│   └── testbin
├── googletest-main

└── src
    ├── lib1.cpp
    ├── lib2.cpp
    └── main.cpp
// lib1.cpp
#include "gtest/gtest.h"
#include "gmock/gmock.h"

using namespace testing;

void HelpFunc1(void)
{
    InSequence A;
    EXPECT_EQ(1, 1);
}

// lib2.cpp
#include "gtest/gtest.h"
#include "gmock/gmock.h"

using namespace testing;

void HelpFunc2(void)
{
    InSequence B;
    EXPECT_EQ(1, 1);
}

// main.cpp
#include <iostream>
#include "gtest/gtest.h"
#include "gmock/gmock.h"

using namespace std;
using namespace testing;

extern void HelpFunc1(void);
extern void HelpFunc2(void);

TEST(T1, T2)
{
    HelpFunc1();
    HelpFunc2();
}

int main(int argc, char *argv[])
{
    InitGoogleTest(&argc, argv);
    return RUN_ALL_TESTS();
}

// CMakeLists.txt
project(Test)

set(CMAKE_POSITION_INDEPENDENT_CODE ON)

add_subdirectory(googletest-main)

add_library(lib1 SHARED src/lib1.cpp)
target_link_libraries(lib1 gmock_main gtest_main)

add_library(lib2 SHARED src/lib2.cpp)
target_link_libraries(lib2 gmock_main gtest_main)

add_executable(testbin src/main.cpp)
target_link_libraries(testbin lib1 lib2)

CastleOnTheHill avatar Jul 23 '24 15:07 CastleOnTheHill

@CastleOnTheHill: Fundamentally it doesn't really make sense to have multiple copies of GoogleTest inside a single program to begin with. To give an analogy: it's like designing a car so that any passenger could also be a driver. (!) Sure, maybe it would be possible with enough effort, but it comes with a lot of effort, complexity, confusion, and fragility that just doesn't seem worth it.

higher-performance avatar Jul 23 '24 15:07 higher-performance

@higher-performance

Fundamentally it doesn't really make sense to have multiple copies of GoogleTest inside a single program to begin with.

Yes, but it is very common to use GoogleTest to encapsulate some public stub SOs in medium and large-scale tests. Then, under the default configuration (compiling into a static library), is it good to let everyone step into this pit?

Sure, maybe it would be possible with enough effort, but it comes with a lot of effort, complexity, confusion, and fragility that just doesn't seem worth it.

But in fact, the modification we made to improve usability is very small. We only changed a global variable into a static variable within a function, and only modified 11 lines in 2 files.

Still, you haven't responded to the matter of the Google programming specification I mentioned. Has this specification no longer been applicable within Google now?

CastleOnTheHill avatar Jul 23 '24 16:07 CastleOnTheHill

@CastleOnTheHill, if this was as simple as merging your PR, we would have done so.

I think you're under the (I believe mistaken) impression that your workaround actually fixes the problem. As I tried to explain above, the issue lies deeper. Your fix merely masks the underlying bug sufficiently in your scenario that it produces OK behavior. That is not a sufficient criterion for merging this PR. What we very much don't want is to mislead users into thinking something is supported, only for it to silently break on them in a different scenario, or the future. That's worse than having it break on them from the start.

Please understand that this means that in order for us to actually support multiple copies of GoogleTest within a binary, we would have to do most/all of the following:

  • Ensure/enforce that there is no way to actually trigger the wrong behavior with your workaround (as opposed to merely making sure it works in your specific scenario)
  • Audit the existing codebase to verify that other hidden bugs don't exist
  • Handle incompatibilities across different GoogleTest versions within a binary, because people will eventually try that too (and we don't want silent failures there)
  • Add a test suite for all the various mixed-static-linking scenarios of interest, to make sure future updates to GoogleTest don't break the guarantee of mixed-object-file compatibility
  • ...and possibly more steps I haven't thought of here.

This is a significantly larger amount of effort—not just at present, but also in the future—beyond simply merging your PR.

I would suggest that, if you believe your PR is indeed sufficient for your use case, you simply carry a local patch with that modification included.

higher-performance avatar Jul 23 '24 20:07 higher-performance

@higher-performance You have a deeper and broader perspective than me. Thank you for your answer, which has enabled me to see that so many things need to be considered in the iteration of a widely used project. Static compilation and construction is indeed complex, and I still haven't figured out the reason for the above problem. This simple modification is indeed far from enough to solve these problems you mentioned.

But in terms of this modification itself, I think the impact it brings is clear, and I didn't expect it to have any drawbacks. Transforming a global non-trivial object into a static object within a function, its construction order is determined and traceable, that is, it will be constructed only when it is called for the first time. This enables us to avoid some complex problems as stated here.

Can we merge after changing the submission message? From "bug fix" to "optimization of global non-trivial objects".

CastleOnTheHill avatar Jul 24 '24 16:07 CastleOnTheHill

I didn't expect it to have any drawbacks.

@CastleOnTheHill the drawback is now it hides a problem that still exists, and the lack of an error would potentially mislead people into thinking we support something that we don't. That's what I was trying to explain in my previous comment.

higher-performance avatar Jul 24 '24 16:07 higher-performance

@higher-performance OK, i see

CastleOnTheHill avatar Jul 25 '24 01:07 CastleOnTheHill