Is the EnableLogCleaner() interface thread safe?
I read the implementations of EnableLogCleaner, DisableLogCleaner, and LogCleaner::enabled().
I found that the private member enabled_ of LogCleaner is not guarded by a lock.
If one thread calls EnableLogCleaner while another thread is logging, could it lead to undefined behaviour?
Could you please point out the exact location where the problem occurs?
When a thread logs to a file, it eventually calls LogFileObject::Write(), which in turn calls log_cleaner.enabled(). That function reads enabled_ without holding a lock.
Meanwhile, another thread may change enabled_ by calling EnableLogCleaner or DisableLogCleaner, both of which write enabled_ without holding a lock.
So maybe it leads to undefined behaviour? I am just in doubt about this; maybe I am wrong.
Thanks for the response.
Is it possible that you mean a race condition and not undefined behavior? The former can occur because EnableLogCleaner and DisableLogCleaner are not atomic. However, I believe the race condition is tolerable because at most what can happen are missed cleanups which are made up for once EnableLogCleaner is called again.
cc @aesophor
Yes.
One thread is reading enabled_ while another is writing it. Can this race condition lead to undefined behavior?
Or can it not lead to undefined behavior because enabled_ is a bool?
I'm not sure what you are asking. Just because a boolean is not atomic, it will not immediately cause undefined behavior. Also, undefined behavior is generally not prescribed by a library but by the language standard.
OK. If enabled_ were a uint8_t, could the race condition cause undefined behavior?
Quoting from @sergiud:
I believe the race condition is tolerable because at most what can happen are missed cleanups which are made up for once EnableLogCleaner is called again.
@kimi20071025 We understand the possible race condition you've mentioned. Are you working on something that can trigger this race condition and is intolerable?
IMO, if we want to eliminate the race condition completely, just making enabled_ an std::atomic is probably not enough: imagine two threads calling LogCleaner::Enable(int days); the value of LogCleaner::overdue_days_ at the end can be indeterminate. However, if we use a mutex to protect the entire body of both LogCleaner::Enable() and LogCleaner::enabled(), it will add overhead for those who use this API from a single thread.
In this case, is there a better option other than a mutex?
What I am worried about is not a race condition but a data race.
The glog library tries to clean overdue logs every time it writes a log. I think this may cost significant performance if the LogCleaner is enabled all the time. So once a day, at night, I enable the log cleaner (EnableLogCleaner), do an error logging pass, and then disable the log cleaner (DisableLogCleaner).
When a thread is logging, it eventually calls LogFileObject::Write, which calls log_cleaner.enabled(),
which means the thread reads enabled_.
Meanwhile, another thread calls EnableLogCleaner or DisableLogCleaner, writing enabled_.
So what I am worried about is that this situation may cause a data race.
I had to actually look this up, but it seems that concurrent reading from and writing to a variable is indeed undefined behavior in C++.
However, at this point I would rather claim that your use case is not supported, in the sense that you should not be calling EnableLogCleaner and DisableLogCleaner from different threads. As @aesophor already mentioned, a mutex is necessary to solve the data race, which would however incur an overhead at little or no gain. We can certainly reconsider once we have the means for proper synchronization (i.e., C++11 <thread> functionality). But at this point, the effort of making these functions thread safe is not worth it.
I think it may cost much performance
How did you determine the performance penalty, or is this only a guess? If it's the latter, you should measure quantitatively first and let us know how strong the performance degradation is if the cleaner has been enabled. This is something worth improving.
About the log cleaner performance, I share the same opinion as issue #702: there is a significant performance cost when the log cleaner is enabled.
I also read the benchmark in #761. Could @aesophor help to make another benchmark comparing logging performance with the log cleaner enabled versus disabled?
If there is a large performance penalty compared to the cleaner being disabled, it means the LogCleaner cannot be enabled all the time.
Platform:
- Arch Linux
- CPU: Intel i7-10700 CPU @ 2.90GHz (16-core)
- Memory: 16 GB ...
When there are 100 files in the log directory, the execution time of LogCleaner::Run() is 0.157 ms on average.
Upon each call to LogFileObject::Write(), if the log cleaner is enabled and there are 100 files to check, it only takes an extra 0.157 ms to run.
About the log cleaner performance, I share the same opinion as issue #702: there is a significant performance cost when the log cleaner is enabled.
What exactly is "significant performance cost" in absolute numbers? Issue #702 did not provide any details either. Therefore I don't see any factual arguments.
Platform
- Arch: Linux
- CPU: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz (4 cores)
- Memory: 4 GB
Test code
#include "glog/logging.h"
#include <unistd.h>
#include <chrono>
#include <iostream>
using namespace std;
using namespace chrono;
int main(int argc, char* argv[]) {
  google::InitGoogleLogging("");
  FLAGS_logbufsecs = 0;
  FLAGS_max_log_size = 100;
  google::SetLogDestination(google::GLOG_ERROR, "/tmp/glog/ERROR_");
  google::SetLogDestination(google::GLOG_WARNING, "");
  google::SetLogDestination(google::GLOG_INFO, "");
  FLAGS_colorlogtostderr = false;
  FLAGS_alsologtostderr = false;
  google::SetStderrLogging(google::GLOG_FATAL);

  // Measure 50,000 error logs with the log cleaner enabled.
  google::EnableLogCleaner(1);
  auto enableLogCleanStart = system_clock::now();
  for (uint16_t i = 0; i < 50000; i++) {
    LOG(ERROR) << "glog error";
  }
  auto enableLogCleanEnd = system_clock::now();
  std::chrono::duration<double, std::milli> enableLogCleanDuration = enableLogCleanEnd - enableLogCleanStart;
  cout << "cost: " << enableLogCleanDuration.count() << " ms when enable log clean" << endl;

  // Measure the same workload with the log cleaner disabled.
  google::DisableLogCleaner();
  auto disableLogCleanStart = system_clock::now();
  for (uint16_t i = 0; i < 50000; i++) {
    LOG(ERROR) << "glog error";
  }
  auto disableLogCleanEnd = system_clock::now();
  std::chrono::duration<double, std::milli> disableLogCleanDuration = disableLogCleanEnd - disableLogCleanStart;
  cout << "cost: " << disableLogCleanDuration.count() << " ms when disable log clean" << endl;
  return 0;
}
Result
cost: 518.541 ms when enable log clean
cost: 225.845 ms when disable log clean
Conclusion
When there is 1 file in the log directory, logging with the log cleaner enabled takes roughly 2.3x as long as with it disabled (518.541 ms vs. 225.845 ms).
I believe that the overhead (518.541 ms) arises because:
- you write 50,000 times, and each write triggers the log cleaner.
- each time, the log cleaner takes approximately 0.01037 ms to run.
- 518.541 ms / 50000 = 0.01037 ms
If your application needs to write this frequently but is performance-critical at the same time, you can use another thread to perform cleanup occasionally. You are also welcome to submit a PR and improve it.
A possible optimization strategy: use CycleClock_Now() to maintain an additional timer such as next_cleanup_time_.
you can use another thread to occasionally perform cleanup
This is what I already do, but the data race on enabled_ should be avoided, as I said at the beginning of this issue.