botan icon indicating copy to clipboard operation
botan copied to clipboard

"pure virtual method called" abort when using Botan from statically allocated object

Open apartridge opened this issue 2 years ago • 5 comments

This is similar to https://github.com/randombit/botan/issues/761.

When using Botan from a class that itself is stored in a static variable, I get a "pure virtual method called" abort during shutdown. We use Botan to encrypt logging. From GDB I can see that ~System_RNG_Impl is executed before the destruction of my static class, which does some logging during destruction. Thus the system_rng() call below will abort after this point.

RandomNumberGenerator& system_rng()
   {
   static System_RNG_Impl g_system_rng;
   return g_system_rng;
   }
class BOTAN_PUBLIC_API(2,0) System_RNG final : public RandomNumberGenerator
   {
   public:
      ...

      void randomize(uint8_t out[], size_t len) override { system_rng().randomize(out, len); }
      
      ..
   };

I have tested different Botan versions:

2.14.0 Works as expected 2.15.0 Works as expected 2.16.0 Aborts 2.17.0 Aborts 2.18.1 Aborts

I don't know why it starts failing at 2.16, since it seems the usage of static System_RNG_Impl existed even in 2.15.

When testing I need to use BOTAN_MLOCK_POOL_SIZE=0 to not hit https://github.com/randombit/botan/issues/761.

This is the stack trace (note that this is a logger worker thread, not the main thread). It aborts after the System_RNG_Impl has been destroyed in the main thread.

__GI_raise 0x00007f7c4b9da18b
__GI_abort 0x00007f7c4b9b9859
<unknown> 0x00007f7c4bc42a31
<unknown> 0x00007f7c4bc4e5dc
std::terminate() 0x00007f7c4bc4e647
__cxa_pure_virtual 0x00007f7c4bc4f3c5
Botan::System_RNG::randomize system_rng.h:30
Botan::Stateful_RNG::randomize_with_ts_input 0x00000000007b2e7b
Botan::AutoSeeded_RNG::randomize 0x00000000007b035c
Botan::RandomNumberGenerator::random_vec<Botan::secure_allocator<unsigned char> > rng.h:154
Botan::RandomNumberGenerator::random_vec rng.h:146
LogEncryptor::encrypt LogEncryptor.cpp:115
LogEncryptor::encrypt LogEncryptor.cpp:82
Log::RotatingEncryptedFileSinkMt::log(spdlog::details::log_msg const&) Logger.cpp:55
spdlog::details::async_log_helper::process_next_msg async_log_helper.h:321
spdlog::details::async_log_helper::worker_loop async_log_helper.h:277
std::__invoke_impl<void, void (spdlog::details::async_log_helper::*)(), spdlog::details::async_log_helper*> invoke.h:73
std::__invoke<void (spdlog::details::async_log_helper::*)(), spdlog::details::async_log_helper*> invoke.h:95
std::thread::_Invoker<std::tuple<void (spdlog::details::async_log_helper::*)(), spdlog::details::async_log_helper*> >::_M_invoke<0ul, 1ul> thread:244
std::thread::_Invoker<std::tuple<void (spdlog::details::async_log_helper::*)(), spdlog::details::async_log_helper*> >::operator() thread:251
std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (spdlog::details::async_log_helper::*)(), spdlog::details::async_log_helper*> > >::_M_run thread:195
<unknown> 0x00007f7c4bc7b6b4
start_thread 0x00007f7c4bdd7609
clone 0x00007f7c4bab6293

apartridge avatar Aug 12 '21 09:08 apartridge

I think it is just a coincidence of shutdown order that you don't hit it with certain versions; nothing has changed here in the last releases.

One option that would work would be to hold a unique_ptr of some other RNG type (such as HMAC_DRBG) which was seeded using the system RNG. Then you would control the lifetime of the object instead of it being subject to static destruction.

One option we could consider is making each System_RNG open a unique handle to the system RNG, instead of all referring ot a single static instance. However that would make creating such objects rather more expensive.

randombit avatar Aug 13 '21 12:08 randombit

One option that would work would be to hold a unique_ptr of some other RNG type (such as HMAC_DRBG) which was seeded using the system RNG. Then you would control the lifetime of the object instead of it being subject to static destruction.

I don't fully follow this. The System_RNG_Impl is allocated whenever someone calls this the first time. When our own Application is allocated statically, it would always be allocated before System_RNG_Impl because our application is what invokes system_rng(). Then they are destroyed in the opposite order, and System_RNG_Impl is thus destroyed before our own Application. Then our own Application crashes during its destructor because System_RNG_Impl is gone. If we could initialize something prior to our own Appliction class, then it would work. I guess that is what you are suggesting above with HMAC_DRBG. Unfortunately in our case we don't control this, this is controlled by our users again.

One option we could consider is making each System_RNG open a unique handle to the system RNG, instead of all referring ot a single static instance. However that would make creating such objects rather more expensive.

That may work for us, we only make one Botan::AutoSeeded_RNG instance in our process .

Another option may be to keep System_RNG_Impl in a shared_ptr which is constructed on first use. Then subsequent RNG instances could re-use that one, and the last instance cleans it up. Something like that may also work for #761.

apartridge avatar Aug 13 '21 15:08 apartridge

To clarify my suggestion only works for something like HMAC_DRBG where the state isn't a singleton. It wouldn't help for system RNG as it exists today.

Having System_RNG refer to a shared_ptr of the internal state might be a good option. Also thanks for suggesting this in context of #761

randombit avatar Aug 18 '21 11:08 randombit

What I ended up doing for now, which seems to work, is to ensure botan's static variables are always initialized before our own Application class is initialized - even if our own Application class is stored in a static variable (in our case, a static std::unique_ptr<Application>).

This is based on the idea behind the "nifty counter" idiom (see fex https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Nifty_Counter).

By adding this to the bottom of my Application.h:

static struct InitializeStaticState
{
    InitializeStaticState();
} initializeStaticState;

And adding this to Application.cpp:

InitializeStaticState::InitializeStaticState()
{
    static bool initialized = false;
    if(!initialized)
    {
        initialized = true;
        Botan::mlock_allocator::instance();
        Botan::system_rng();
    }
}

We can ensure that the initializeStaticState object is always initialized before our own Application. Even if our Application is stored in a static smart pointer type (like std::unique_ptr) which has proven to be the most tricky case for us.

This is because anyone who stores a Application in a static variable will need to include the Application header first, and the header contains a static variable. The initialization order of statics within a single translation unit is always in order of declaration.,, so initializeStaticState will always be initialized first.

InitializeStaticState::InitializeStaticState() will initialize Botan's mlock_allocator and system_rng singletons. Since the order of destruction is always opposite of construction, my Application will always be destroyed before Botan's singletons.

apartridge avatar Aug 18 '21 15:08 apartridge

Sounds as if you're dealing with one of the many variants of the static initialization order fiasco. It's a bit insidious, as you'll think you've got it solved, then in a subsequent compile, something will change slightly, typically in the link phase, and you're back to the same issue again.

There are a number of solutions to the problem; one, for example, is to create a single static variable of a class or struct that contains all the things that you'd normally declare as static. Control the initialization order using that class, knowing that the destruction order is always the reverse of initialization order, and often you're in good shape.

We tend to use a more modular, pseudo-static solution, where modules that we'd normally treat as static singletons are encapsulated into a namespace that exposes an initialization function; the function, when run, returns an optional termination function. We hand everything we then want to initialize, in order, to a manager class that we instantiate in main, via an initializer_list of the initialization functions. That manager class runs the initializers and stacks up the termination functions that it optionally receives from the initializers; destruction of the manager class deals with running the termination functions in order.

Either one of these should get you into a well-defined state, and that should address the lifetime issues that are getting you here.

bazineta avatar Nov 29 '21 16:11 bazineta