LTO breaks StaticObject on Fedora 37
Code built with g++ with link time optimization (LTO) fails with a "Trying to save an unregistered polymorphic type" exception. The same code works fine without LTO. This is on a stock Fedora 37 machine (with gcc 12.2.1, cereal 1.3.2).
My code is a large mixed C++ and Python project, but I boiled it down to a minimal reproducer here: cereal_test.zip
A.h defines and registers a polymorphic type Wrapped and a class Container that stores a shared_ptr<Wrapped>. B.h registers a Wrapped subclass BWrapped. We build two dynamic libraries, libA.so and libB.so, and wrap each with SWIG so they can be used from Python as A.py and B.py (note that we build only the A wrapper with -flto):
g++ -fPIC -Wall -shared A.cpp -o libA.so
g++ -fPIC -Wall -shared B.cpp -o libB.so
swig -python -c++ A.i
swig -python -c++ B.i
g++ -flto -fPIC -shared A_wrap.cxx -I/usr/include/python3.11 -o _A.so -L. -lA
g++ -fPIC -shared B_wrap.cxx -I/usr/include/python3.11 -o _B.so -L. -lA -lB
If we then try to serialize a Container object that contains a BWrapped in Python (the _get_as_binary method uses cereal to write Container to a BinaryOutputArchive and then returns the resulting data), it fails:
$ cat test.py
import A, B
w = B.BWrapped()
c = A.Container(w)
print(c._get_as_binary())
$ python3 test.py
terminate called after throwing an instance of 'cereal::Exception'
what(): Trying to save an unregistered polymorphic type (BWrapped).
If we rebuild A without LTO though, it works fine:
$ g++ -fPIC -shared A_wrap.cxx -I/usr/include/python3.11 -o _A.so -L. -lA
$ python3 test.py
b'\x01\x00\x00\x80\x08\x00\x00\x00\x00\x00\x00\x00BWrapped\x01\x00\x00\x80'
It looks like the problem is that LTO causes StaticObject to not work correctly. If we add to A.h a function
void show_a_output_binding_map() {
    auto const & bindingMap = cereal::detail::StaticObject<cereal::detail::OutputBindingMap<cereal::BinaryOutputArchive>>::getInstance().map;
    std::cerr << "A map is at " << &bindingMap << std::endl;
}
and a similar function to B.h, then with LTO we see
$ cat test.py
import A, B
A.show_a_output_binding_map()
B.show_b_output_binding_map()
w = B.BWrapped()
c = A.Container(w)
print(c._get_as_binary())
$ python3 test.py
A map is at 0x7f029b6fec40
B map is at 0x7f029b3ff540
terminate called after throwing an instance of 'cereal::Exception'
what(): Trying to save an unregistered polymorphic type (BWrapped).
i.e. StaticObject is not a singleton, so when B registers BWrapped, A cannot see it. (Without LTO, the address printed for A map and B map is the same.)
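To make the failure mode concrete, here is a minimal standalone sketch that does not use cereal at all. BindingMap, StaticObjectSketch, ModuleA and ModuleB are hypothetical names for illustration only. The ModuleTag parameter is just a device to force two separate instances inside one program, mimicking the two different addresses printed above; it is not how cereal or the linker actually produces the duplicate.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-in for a per-archive binding map (hypothetical, not cereal's type).
struct BindingMap {
    std::map<std::string, int> map;  // type name -> dummy serializer id
};

// Function-local static singleton, similar in spirit to StaticObject<T>::getInstance().
// ModuleTag exists only in this sketch: each distinct tag yields a distinct instance,
// which imitates each module ending up with its own copy of the map.
template <class T, class ModuleTag = void>
struct StaticObjectSketch {
    static T& getInstance() {
        static T instance;
        return instance;
    }
};

struct ModuleA {};  // hypothetical tag standing in for the A wrapper
struct ModuleB {};  // hypothetical tag standing in for the B wrapper

// Registers "BWrapped" through RegistryForB, then looks it up through RegistryForA.
template <class RegistryForB, class RegistryForA>
bool registered_in_b_visible_from_a() {
    RegistryForB::getInstance().map["BWrapped"] = 1;
    auto const& m = RegistryForA::getInstance().map;
    return m.find("BWrapped") != m.end();
}

// Exercises both cases; call this from main() to try it out.
void demo_shared_vs_duplicated() {
    using Shared = StaticObjectSketch<BindingMap>;
    // One shared instance: the registration is visible (the non-LTO behaviour).
    assert((registered_in_b_visible_from_a<Shared, Shared>()));
    // Two instances: the lookup misses, as in the "unregistered polymorphic type" error.
    assert(!(registered_in_b_visible_from_a<StaticObjectSketch<BindingMap, ModuleB>,
                                            StaticObjectSketch<BindingMap, ModuleA>>()));
}
```

The point is only that registration and lookup have to go through the same object; once each module gets its own copy, the lookup on A's side has no way to find what B registered.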
I see cereal has specific code (in detail/static_object.hpp) to try to prevent link optimization from breaking StaticObject, but it seems not to be working here. Obviously an easy workaround is "don't use LTO" but I'd like to find a better solution. I can modify the SWIG interface, so perhaps I can add some code to the generated modules that explicitly references StaticObject and so persuades the linker not to mangle the code?
FWIW, I see the exact same issue when building for Windows (I use MSVS 2015, for 64-bit). (The reproducer code is similar, except that functions need the usual dllexport/import tags so that DLLs work.)
Our workaround for now, linked above, adds a map of serialize/deserialize functions to our application itself, so we can be sure they're stored only in one place. Works for us but it is definitely not as general as cereal's polymorphic machinery.
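For anyone reading along, the general shape of such a workaround might look like the sketch below. This is not our actual code; every name here (save_registry, register_saver, save, type_name, and the toy Wrapped/BWrapped classes) is made up for illustration, and the "name:value" output format is invented. The assumption is that save_registry() is declared in a header but defined in exactly one shared library, so every module calls into the same map.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-ins for the classes in the reproducer (hypothetical definitions).
struct Wrapped {
    virtual ~Wrapped() = default;
    virtual std::string type_name() const { return "Wrapped"; }
};
struct BWrapped : Wrapped {
    int value = 7;
    std::string type_name() const override { return "BWrapped"; }
};

using SaveFn = std::function<void(std::ostream&, Wrapped const&)>;

// Application-owned registry. In a multi-library build, only the declaration would
// go in a header; the definition would live in one library so there is one map.
std::map<std::string, SaveFn>& save_registry() {
    static std::map<std::string, SaveFn> registry;
    return registry;
}

// Called by each module to register a saver for one of its subclasses.
void register_saver(std::string const& name, SaveFn fn) {
    save_registry()[name] = std::move(fn);
}

// Looks up the saver by the object's dynamic type name and writes "name:payload".
std::string save(Wrapped const& obj) {
    auto it = save_registry().find(obj.type_name());
    if (it == save_registry().end())
        throw std::runtime_error("no saver registered for " + obj.type_name());
    std::ostringstream os;
    os << obj.type_name() << ':';
    it->second(os, obj);
    return os.str();
}

// Small usage example; call this from main() to try it out.
void demo_app_registry() {
    register_saver("BWrapped", [](std::ostream& os, Wrapped const& w) {
        os << static_cast<BWrapped const&>(w).value;
    });
    BWrapped b;
    assert(save(b) == "BWrapped:7");
}
```

Because the map lives behind an ordinary exported function rather than a template static instantiated in every module, there is only one place it can end up, which is why this is less sensitive to how each wrapper is linked. The cost is that every subclass has to be registered by hand.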