soci::row get throws bad_cast when compiled with clang and libc++abi

Open 530967061 opened this issue 4 years ago • 10 comments

When SOCI is compiled with clang and libc++abi and 'SOCI_VISIBILITY' is ON, many soci::row dynamic binding test cases fail with std::bad_cast, for example:

-------------------------------------------------------------------------------
Dynamic row binding
-------------------------------------------------------------------------------
/app/yacbs/softwear/source/soci-4.0.2/tests/common-tests.h:2321
...............................................................................

/app/yacbs/softwear/source/soci-4.0.2/tests/common-tests.h:2365: FAILED:
  CHECK( r.get<std::tm>(3).tm_year == 105 )
due to unexpected exception with message:
  std::bad_cast

/app/yacbs/softwear/source/soci-4.0.2/tests/common-tests.h:2321: FAILED:
  {Unknown expression after the reported line}
due to unexpected exception with message:
  std::bad_cast

-------------------------------------------------------------------------------
Reading rows from rowset
-------------------------------------------------------------------------------
/app/yacbs/softwear/source/soci-4.0.2/tests/common-tests.h:2862
...............................................................................

/app/yacbs/softwear/source/soci-4.0.2/tests/common-tests.h:2862: FAILED:
  {Unknown expression after the reported line}
due to unexpected exception with message:
  std::bad_cast

-------------------------------------------------------------------------------
This is caused by RTTI usage when sharing C++ objects across binary boundaries.

Here is a basic explanation. Please check it, thanks!

530967061 avatar Oct 10 '21 13:10 530967061

Sorry, I'm unlikely to have time to look at this in the near future. If you can check exactly which class the failing cast targets, it could help us fix the problem.

vadz avatar Oct 10 '21 15:10 vadz

The failing cast is the dynamic_cast from class holder to class type_holder<T> in type-holder.h:

class holder
{
public:
    holder() {}
    virtual ~holder() {}

    template<typename T>
    T get()
    {
        type_holder<T>* p = dynamic_cast<type_holder<T> *>(this);
        if (p)
        {
            return p->template value<T>();
        }
        else
        {
            throw std::bad_cast();
        }
    }

private:

    template<typename T>
    T value();
};

As the article "More RTTI, More Problems" says:

Having a class marked hidden in one binary and default in another

This can happen if you are using a macro system to mark your classes that breaks down somehow. I've also seen this when template class explicit instantiations are not marked with the correct visibility. This will result in the exact same sort of problems seen above, as the weak symbol loading will not coalesce the weak external version with the non-external version.

Similarly to the last case, to detect this, run nm -mo * | c++filt and ensure that for any typeinfos that have at least one non-external copies, there are no other copies of that typeinfo.

I think it would be better to use some other method instead of RTTI. Moreover, soci::row has column properties info, so I think we could do the type conversion automatically when no precision is lost.
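
To illustrate the "no precision lost" idea, here is a minimal sketch. The helper name checked_narrow is hypothetical and not part of SOCI; it assumes the stored value is a long long and converts it to a narrower signed integer type only when the value fits:

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not a SOCI API): convert a stored long long to a
// narrower signed integer type, refusing any conversion that would lose data.
template <typename To>
To checked_narrow(long long v)
{
    static_assert(std::numeric_limits<To>::is_integer &&
                  std::numeric_limits<To>::is_signed,
                  "checked_narrow only handles signed integer targets");

    if (v < static_cast<long long>(std::numeric_limits<To>::min()) ||
        v > static_cast<long long>(std::numeric_limits<To>::max()))
    {
        throw std::range_error("value does not fit in the requested type");
    }
    return static_cast<To>(v);
}

// Usage: a value that fits is converted unchanged.
inline void checked_narrow_example()
{
    assert(checked_narrow<int>(105) == 105);
}
```

A real implementation would also need rules for unsigned and floating-point targets; this sketch leaves those out.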

530967061 avatar Oct 10 '21 15:10 530967061

So it's the cast to type_holder<T> which fails? I don't really see what it could be replaced with, we do need a working typeinfo here.

Could marking type_holder (and/or holder itself?) as visible help? I admit I don't fully understand the problem, even after skimming the linked article. It seems like all the weak symbols ought to coalesce together in our case and if they don't, I don't see what prevents them from doing it.

vadz avatar Oct 10 '21 15:10 vadz

I used soci_sqlite3_test for testing; here is its ldd output:

ldd soci_sqlite3_test
	linux-vdso.so.1 =>  (0x00007fffaede6000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f41158a5000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f41156a1000)
	libsqlite3.so.0 => /lib64/libsqlite3.so.0 (0x00007f41153ec000)
	libsoci_sqlite3.so.4.0 => /app/yacbs/softwear/source/soci-4.0.2/build/lib/libsoci_sqlite3.so.4.0 (0x00007f4115c9e000)
	libsoci_core.so.4.0 => /app/yacbs/softwear/source/soci-4.0.2/build/lib/libsoci_core.so.4.0 (0x00007f4115bfb000)
	libboost_date_time-clang12-mt-x64-1_77.so.1.77.0 => /app/yacbs/softwear/boost_1_77_0/lib/libboost_date_time-clang12-mt-x64-1_77.so.1.77.0 (0x00007f4115bf7000)
	libc++.so.1 => /opt/clang-12.0.1/lib/x86_64-unknown-linux-gnu/c++/libc++.so.1 (0x00007f4115b25000)
	libunwind.so.1 => /opt/clang-12.0.1/lib/x86_64-unknown-linux-gnu/c++/libunwind.so.1 (0x00007f4115b13000)
	libc++abi.so.1 => /opt/clang-12.0.1/lib/x86_64-unknown-linux-gnu/c++/libc++abi.so.1 (0x00007f41153aa000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f41150a8000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f4114cda000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f4115ac1000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f4114ad2000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f41148bc000)

Results:

- With 'SOCI_VISIBILITY' turned OFF, all test cases passed.
- With SOCI_DECL added to type_holder<T> and holder, 2 test cases failed.
- Otherwise, 10 test cases failed.
- soci_sqlite3_test_static has no problem.

530967061 avatar Oct 10 '21 15:10 530967061

Add SOCI_DECL for type_holder and holder, 2 test cases failed

OK, so it looks like it's a good idea to at least do this. What are the remaining failures?

soci_sqlite3_test_static has no problem.

Yes, of course, this problem only happens when using dynamic linking.

vadz avatar Oct 10 '21 15:10 vadz

The remaining failures are also bad_cast from holder to type_holder<T>. Then I tried __attribute__((visibility("default"))) instead of SOCI_DECL for type_holder<T> and holder; 2 test cases still failed. There must be some other RTTI-related error. Here is another discussion about RTTI with clang and libc++abi. So I think RTTI may not be a good choice.

530967061 avatar Oct 10 '21 16:10 530967061

Here is another sample of an RTTI failure with clang. If I compile it with clang and pass -DCMAKE_CXX_FLAGS='-stdlib=libc++' to cmake, then ./main prints:

pBase=0x2176010, pInherit=(nil)

This may help us understand what "More RTTI, More Problems" says.

530967061 avatar Oct 10 '21 16:10 530967061

Sorry, as I said, I don't really see how we can avoid using RTTI here. I do realize that there is a problem here, but I can't work on it now, so all I can recommend is to either use static linking or to avoid using the row-based API as a workaround for now.

vadz avatar Oct 10 '21 16:10 vadz

OK, I will turn 'SOCI_VISIBILITY' OFF for now. I think something like std::variant<int, long long, unsigned long long, std::string, ...> or std::any as value holders may be helpful.
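
A minimal sketch of the std::variant idea, assuming C++17 is available. The names value_holder and variant_get are hypothetical, and the list of alternatives is only an example. std::variant records which alternative it holds as an index, so the type check in std::get does not rely on typeinfo objects being shared between binaries:

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical value holder: a closed set of column types, with no virtual
// functions and no dynamic_cast.
using value_holder =
    std::variant<int, long long, unsigned long long, double, std::string>;

// Hypothetical accessor: returns the stored value if it has exactly type T.
// std::get throws std::bad_variant_access when another alternative is held.
template <typename T>
T variant_get(value_holder const& h)
{
    return std::get<T>(h);
}

// Usage: store a string, then read it back with the matching type.
inline void variant_get_example()
{
    value_holder h = std::string("abc");
    assert(std::holds_alternative<std::string>(h));
    assert(variant_get<std::string>(h) == "abc");
}
```

The trade-off is that the set of supported types becomes fixed at compile time, so user-defined types would need a separate mechanism.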

530967061 avatar Oct 10 '21 16:10 530967061

I have opened a pull request to fix this problem; please check whether it can be merged. Thanks.

530967061 avatar Nov 21 '21 15:11 530967061

The real question for me here is where the key function is. That is, do the holder classes have a non-inline member function that can be used to anchor the typeinfo? That member should be in SOCI with explicit visibility and it should trigger the correct visibility of the typeinfo class as well. Without it, -rdynamic is necessary so that the main binary actually has an export in its dynamic symbol table to override the library.
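
A minimal sketch of what a key function looks like. The names EXAMPLE_DECL, holder_base and int_holder are hypothetical stand-ins and not SOCI's actual declarations. Under the Itanium C++ ABI, which clang and GCC follow on Linux, the key function is the first non-pure virtual function that is not defined inline, and the vtable and typeinfo are emitted only in the translation unit that defines it:

```cpp
#include <cassert>

// Hypothetical export macro, standing in for something like SOCI_DECL.
#if defined(__GNUC__)
#  define EXAMPLE_DECL __attribute__((visibility("default")))
#else
#  define EXAMPLE_DECL
#endif

// ---- would live in a public header ----
class EXAMPLE_DECL holder_base
{
public:
    holder_base() {}
    virtual ~holder_base();   // declared only: this is the key function
};

class EXAMPLE_DECL int_holder : public holder_base
{
public:
    explicit int_holder(int v) : value_(v) {}
    ~int_holder() override;   // key function of int_holder
    int value() const { return value_; }
private:
    int value_;
};

// ---- would live in exactly one .cpp file of the shared library ----
holder_base::~holder_base() {}
int_holder::~int_holder() {}

// Usage: with a single exported typeinfo per class, the cast has one
// definition to compare against.
inline int read_int(holder_base& h)
{
    int_holder* p = dynamic_cast<int_holder*>(&h);
    assert(p != nullptr);
    return p->value();
}
```

Class templates such as type_holder<T> are instantiated in every binary that uses them, so this pattern alone may not cover them.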

jsonn avatar Sep 18 '22 14:09 jsonn

I'm not sure what the question is, but in practice it's just impossible to use dynamic_cast<> with clang/libc++ and shared libraries reliably (FWIW I think their decision to optimize it by comparing type info objects by address is completely wrong, but it's clear that they're not going to change it), so doing what I did is the only solution, IMO.
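
To illustrate the address comparison issue, here is a minimal sketch of a cast helper that falls back to comparing type names when the type info objects differ. The helper names checked_ptr_cast and get_value and the simplified holder classes are assumptions for illustration, not necessarily the code used in SOCI:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <typeinfo>

// Simplified, hypothetical stand-ins for the holder classes.
class holder
{
public:
    virtual ~holder() {}
};

template <typename T>
class type_holder : public holder
{
public:
    explicit type_holder(T const& v) : value_(v) {}
    T const& value() const { return value_; }
private:
    T value_;
};

// Returns ptr as T* if the dynamic type of *ptr is exactly T, else nullptr.
// Unlike dynamic_cast, this does not accept classes derived from T.
template <typename T, typename U>
T* checked_ptr_cast(U* ptr)
{
    std::type_info const& actual = typeid(*ptr);
    std::type_info const& wanted = typeid(T);

    // Fast path: same type_info object. Fallback: compare the names, which
    // still match when two binaries each carry their own copy of the typeinfo.
    if (&actual != &wanted &&
        std::strcmp(actual.name(), wanted.name()) != 0)
    {
        return nullptr;
    }
    return static_cast<T*>(ptr);
}

// Usage: same contract as the original get(), throwing on a type mismatch.
template <typename T>
T get_value(holder& h)
{
    type_holder<T>* p = checked_ptr_cast<type_holder<T> >(&h);
    if (!p)
    {
        throw std::bad_cast();
    }
    return p->value();
}
```

The name comparison costs a string compare only when the addresses differ, so the common case inside a single binary stays cheap.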

vadz avatar Sep 18 '22 15:09 vadz