log4cplus icon indicating copy to clipboard operation
log4cplus copied to clipboard

Unable to fork properly when built in multi-threaded mode

Open fbatogo opened this issue 4 years ago • 19 comments

Describe your problem here.

When attempting to fork, any child processes that are created appear to be able to log, but when the child processes terminate, they hang. I attempted to use the code suggested at https://stackoverflow.com/questions/14657891/what-happens-with-log4cplus-after-a-fork , but the child processes still hang. If I build the library in single threaded mode, then forking works properly, but if the app also threads, then there are unexpected crashes due to double frees or memory corruption.

In addition to that, I am unable to delay initialization as suggested in issue #365 as at least one app that I need to use this in starts up, writes some log lines, and then forks children to handle socket connections.

The following is some sample code I threw together to help me try to figure out the issue using test code that is as simple as I can get it.

#include <log4cplus/logger.h>
#include <log4cplus/loggingmacros.h>
#include <log4cplus/configurator.h>
#include <log4cplus/initializer.h>

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <iostream>

int main()
{
    pid_t childPid;
    pid_t resultPid;
    int status;
    int seconds = 1;

    log4cplus::BasicConfigurator::doConfigure();

    log4cplus::Logger logger = log4cplus::Logger::getInstance(
        LOG4CPLUS_TEXT("main"));

    LOG4CPLUS_WARN(logger, LOG4CPLUS_TEXT("Forking..."));

    log4cplus::Logger::shutdown();

    childPid = fork();

    log4cplus::BasicConfigurator::doConfigure();

    if (childPid < 0)
    {
        std::cerr << "Cannot fork.\n";
    }
    else if (childPid == 0)
    {
        LOG4CPLUS_WARN(logger, "Sleeping for " << seconds << " seconds...");
        sleep(seconds);
        LOG4CPLUS_WARN(logger, LOG4CPLUS_TEXT("Child terminating..."));
        exit(0);
    }
    else
    {
        std::cerr << "Waiting for child " << childPid << " to terminate..\n";
        resultPid = waitpid(childPid, &status, 0);
        std::cerr << "Waitpid returned " << resultPid << "\n";
        LOG4CPLUS_WARN(logger, LOG4CPLUS_TEXT("Child terminated..."));
    }
}

In the code above, "Waitpid returned ..." never shows up on the console, nor does the "Child terminated.." log line. Instead, the program just hangs. strace shows the child thread is blocked on a futex.

Are there any code changes, or configuration options that would get me a build that could be used in a program that both forks and threads?

Thanks!

Also, provide the following information:

  • version: 2.0.5
  • operating system, CPU, bitness: linux 4.15.0-54-generic, x86_64, 64-bit mode
  • configure script, or CMake, etc., flags and settings: Just a "./configure", no extra options.
  • flags and settings used by your application: None
  • compiler and its version: g++-7 (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

fbatogo avatar Mar 10 '20 20:03 fbatogo

You must exec() right after fork() in the child.

wilx avatar Mar 10 '20 21:03 wilx

Unfortunately, this is legacy code that forks children to handle requests on sockets, among other things. So, calling exec() afterwards isn't an option. The code all currently works with a 1.2.x version of log4cplus, but I am attempting to upgrade it to a newer version that uses C++11.

Is there any way to build the single threaded version of the library so that it will work properly when my code is multi-threaded? Or some way to shut everything down and bring it back up after forking?

Thanks!

fbatogo avatar Mar 10 '20 21:03 fbatogo

You can compile log4cplus as single-threaded but then you cannot use it in multi-threaded process.

You can use multi-threaded log4cplus in forking process but you must exec(). Regardless of whether you use log4cplus or anything else, if your process spawns any other thread before you fork(), you must exec(). You can exec() self with custom parameters letting you know you have done just that and that the process should resume its function and handle connections.

wilx avatar Mar 10 '20 22:03 wilx

I am currently stuck with this same issue. I have found that the call to log4cplus::Logger::shutdown(); is not enough to shutdown the threads started by log4cplus. (leading to the deadlock later on on some mutex...) The log4cplus::Initializer class (which provides RAII de-initialization) additionnally calls shutdownThreadPool(); before Logger::shutdown ();

So I tried the following:

  1. disable the implicit initialization with ./configure --enable-implicit-initialization=no as described in #365
  2. Use the initializer class before and after the fork:
int main()
{
    {
    log4cplus::Initializer initializer;
    log4cplus::BasicConfigurator::doConfigure();

    log4cplus::Logger logger = log4cplus::Logger::getInstance(
        LOG4CPLUS_TEXT("main"));

    LOG4CPLUS_WARN(logger, LOG4CPLUS_TEXT("Forking..."));
    
    } // initializer destructor stops threads
    childPid = fork();
    {
    log4cplus::Initializer initializer;
    log4cplus::BasicConfigurator::doConfigure();

   // ........
   }
}

But sadly the Initializer class uses a singleton mechanism which prevents re-using it a second time to initialize. Could the Initializer be modified to allow for re-initializations after a de-initialization?

toesus avatar Jun 16 '20 12:06 toesus

I just found another related issue which contains an idea to make the ThreadPool restartable: If we use the proposed change from this comment: https://github.com/log4cplus/log4cplus/issues/423#issuecomment-506335182 and combine it with the following change in ~Initializer (commenting out the "destroy" mechanism) then the idea from my previous comment seems to work...

Initializer::~Initializer ()
{
    // bool destroy = false;
    {
        LOG4CPLUS_THREADED (
            std::unique_lock<std::mutex> guard (
                InitializerImpl::instance->mtx));
        --InitializerImpl::instance->count;
        if (InitializerImpl::instance->count == 0)
        {
            //destroy = true;
            deinitialize ();
        }
    }
    //if (destroy)
    //{
        //delete InitializerImpl::instance;
        //InitializerImpl::instance = nullptr;
    //}
}

Any thoughts? Are there some unitended consequences of this approach?

toesus avatar Jun 16 '20 12:06 toesus

As I have written elsewhere, you must exec after you have forked. There is no other way. You cannot just re-initialize log4cplus like if (fork()==0){ /* reinit */ }. If this is what you want to do.

wilx avatar Jun 18 '20 10:06 wilx

I have the same problem: the child hangs at exit. It is quite clear why: the destructor of ThreadPool notifies some condition variables and then waits on another condition variable until the worker pool is empty. However threads are not inherited in the child process so basically we have a vector of std::thread in the child but no real OS threads in it. There's no thread in the child blocked at the conditions that are notified. So notifying in ~ThreadPool does not trigger the emptying of the worker vector as it would in the parent. Thus, we block forever at the condition waiting for the thread vector to be empty.

I think that requiring that all applications that fork and want to use log4cplus must run exec after fork is a severe limitation of the library. It means that it is unusable in contexts in which we develop plugins that are dlopened in existing multiprocess applications, in which we have no control over fork and exec.

It appears to me that the 1.x version of log4cplus is usable in multiprocess multithreaded applications without needing to run exec after fork. In this sense the imposed limitation is a regression.

smanolache avatar Dec 01 '20 14:12 smanolache

This is not a limitation of the library. This is a limitation of the POSIX model of things.

wilx avatar Dec 01 '20 14:12 wilx

However 1.1.9 worked with the same POSIX model.

smanolache avatar Dec 01 '20 15:12 smanolache

Well, no. The only situation, when it worked better in 1.x with respect to the forking, is when you never create any threads in your application.

Version 2.x does start the thread pool and that does interfere very visibly with the forking. But even 1.x could start threads but much fewer. In 1.x, there is AsyncAppender if you use it, ConfigureAndWatchThread also starts a thread. If you use these and you just fork without exec you can get into same trouble.

You can still compile and run the library in trully non-threaded mode. The LOG4CPLUS_SINGLE_THREADED preprocessor symbols is still recognized and it is still used when you configure with --disable-threads.

wilx avatar Dec 01 '20 15:12 wilx

I am tempted to provide a setting that would disable the thread pool entirely, not even compile the source for it. But that does not solve anything with respoect to threading and forking. As long as you have any threads in your application anywhere, you simply must exec after fork. I cannot stress this enough.

wilx avatar Dec 01 '20 15:12 wilx

When you say "create/have any threads in your application" then I suppose you mean log4cplus threads. Because I can create any application threads I want, and fork after that and have no trouble if I know that I don't have the threads in my child.

Even log4cplus would not hang in the child process if it didn't wait on the thread pool to empty itself (which it cannot, because there are no threads). (But even if somehow log4cplus would avoid destructing the thread pool or change its logic in order to avoid waiting on the condition variable, I suppose log4cplus would simply not work in the child process because it would rely on the existence of the worker threads in the thread pool and they are simply not there.)

When I used the 1.1.x version, I had plenty of application threads but I had no log4cplus thread because I didn't use the AsyncAppender. The ConfigureAndWatchThread exists in the 1.2.0 code but it is not instantiated anywhere in the library. So if I don't use it in my application (which I didn't) it wouldn't create threads.

Questions: If I configure with --disable-threads can I still use the library in a multithreaded application, i.e. are its methods thread-safe? --disable-threads would remove only the creation of library-specific threads or would it also remove all syncrhonisation mechanisms from its functions? Could you separate these two issues, such that the library could be mono-threaded (would create no threads of its own) but the functions would be thread-safe such that the mono-threaded library can be used in a multi-threaded application?

smanolache avatar Dec 01 '20 16:12 smanolache

When you say "create/have any threads in your application" then I suppose you mean log4cplus threads. Because I can create any application threads I want, and fork after that and have no trouble if I know that I don't have the threads in my child.

No, I mean literally any threads. More then one thread in a process ⇒ only safe option is to fork and exec.

If I configure with --disable-threads can I still use the library in a multithreaded application, i.e. are its methods thread-safe?

No. This disables any locking as well.

wilx avatar Dec 01 '20 16:12 wilx

When you say "create/have any threads in your application" then I suppose you mean log4cplus threads. Because I can create any application threads I want, and fork after that and have no trouble if I know that I don't have the threads in my child.

No, I mean literally any threads. More then one thread in a process ⇒ only safe option is to fork and exec.

That is not true.

If I configure with --disable-threads can I still use the library in a multithreaded application, i.e. are its methods thread-safe?

No. This disables any locking as well.

Could you separate these two? log4cplus threads go, locking stays?

smanolache avatar Dec 01 '20 17:12 smanolache

That is not true.

It is actually true as far as I know.

Could you separate these two? log4cplus threads go, locking stays?

I have added an option to disable the thread pool. See PR #491.

wilx avatar Dec 01 '20 17:12 wilx

Hi, I am facing same issue of hang during child process terminate. I thought with log4cplus 2.0.6 complete threadpool will be disabled but still see hang part at threadpool destructor. So the thread hang during terminate is still there with log4cplus 2.0.6?

Harika-Bavana avatar Apr 16 '21 10:04 Harika-Bavana

@Harika-Bavana You have the option to disable the thread pool. It is still enabled by default.

wilx avatar Apr 16 '21 10:04 wilx

Would it be possibly to lazily initialize the thread pool? For distirbuting the package in Debian its a bit limiting to do that at build-time only, as I can hardly compile every flavour...

coldtobi avatar Aug 25 '21 18:08 coldtobi

As a last-ditch effort you can force the child process to terminate with raise(SIGKILL). It may not be kosher but it worked for me. Just don't forget #include <signal.h>

Atomic-Johnson avatar May 25 '22 18:05 Atomic-Johnson

I have implemented the thread pool on demand initialization.

wilx avatar Feb 09 '23 19:02 wilx