Fixed race condition around `s_sysTraceThread`
There's a subtle race condition when threads read `s_sysTraceThread`.
Since both the main thread (executing `~Profiler()`) and the `Profiler::Worker()` thread call `StopSystemTracing()`, the condition:
```cpp
if (s_sysTraceThread) {
    ...
}
```
within `StopSystemTracing()` is unsynchronized, because the declaration of `s_sysTraceThread` is not atomic:
```cpp
static Thread* s_sysTraceThread = nullptr;
```
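For concreteness, here is a minimal, self-contained sketch of the atomic-exchange pattern this change relies on. It is not the actual patch: it uses `std::thread` instead of Tracy's internal `Thread` type, and the teardown is just a stand-in.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

// Sketch: the pointer is atomic, and exchange() lets exactly one caller
// perform the teardown even when the main thread and the worker both call
// the function concurrently.
static std::atomic<std::thread*> s_sysTraceThread { nullptr };

static void StopSystemTracing()
{
    // Claim the pointer; whichever caller loses the race sees nullptr and returns.
    std::thread* t = s_sysTraceThread.exchange( nullptr );
    if( !t ) return;
    t->join();
    delete t;
    std::printf( "system tracing stopped exactly once\n" );
}

int main()
{
    s_sysTraceThread = new std::thread( []{ /* stand-in for the tracing thread */ } );

    // Simulate the two shutdown paths racing: ~Profiler() on the main thread
    // and Profiler::Worker() on its own thread.
    std::thread worker( StopSystemTracing );
    StopSystemTracing();
    worker.join();
}
```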
Fixes issue https://github.com/wolfpld/tracy/issues/1114
Now, a more fundamental question:
This race only exists because two threads end up calling StopSystemTracing() in a relatively short span of time.
If instead we just called StopSystemTracing() from the main thread (in ~Profiler()), we would not need to make s_sysTraceThread atomic.
Is there a reason to call StopSystemTracing() during the shutdown logic of Profiler::Worker()?
The worker enters its shutdown stage in response to the main thread signaling it to shut down, and that signaling happens inside ~Profiler() as well.
The commit introducing the two calls is 46a5cc4, and the related PR is #1016.
Coding style needs to be fixed.
Hmm. Unless there's a way for Profiler::Worker() to break from its loop before the ~Profiler() destructor runs, I don't see why the StopSystemTracing() call in Profiler::Worker() would be necessary. System tracing would have already been stopped by the main thread by the time the worker thread calls it.
It is possible that ETW's ProcessTrace() may still be processing whatever events were left in the buffer when CloseTrace() was called, but that would not lead to events being posted to the Worker() queue indefinitely.
It looks like @6yry6e fixed a real issue with https://github.com/wolfpld/tracy/pull/1016, but a leftover from the investigation may have landed with the PR. @6yry6e any insights? (Asking because if the latter call to StopSystemTracing() is not necessary, we can remove the atomic logic and simplify things.)
When one requests a manual shutdown via RequestShutdown, the worker will break from the loop here https://github.com/wolfpld/tracy/blob/52bfac424346f00288a5460ccb06c4459204d9f0/public/client/TracyProfiler.cpp#L2085
while still being willing to send the rest of the data.
The Profiler won't be destroyed at that moment, as it's a static variable and will stay alive while the app is looping in
`while (!tracy::GetProfiler().HasShutdownFinished())`.
Then we need to stop system tracing in the worker thread manually, to stop filling the queue with new data.
But if the app doesn't wait, or is terminated for any reason, a crash may occur due to the race, that's right :(.
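For reference, a hedged sketch of the manual-shutdown sequence described above, using only the calls mentioned in this thread (`RequestShutdown()` and `HasShutdownFinished()`). The wrapper name `WaitForTracyShutdown` is hypothetical, and it assumes TRACY_ENABLE is defined and the usual client header is available; the exact include path depends on the integration.

```cpp
#include <chrono>
#include <thread>

#include "tracy/Tracy.hpp"

void WaitForTracyShutdown()
{
    // Ask the worker to wind down; it leaves its main loop but keeps running
    // until the remaining data has been sent.
    tracy::GetProfiler().RequestShutdown();

    // The Profiler is a static object and is not destroyed here; the app just
    // polls until the worker reports that it has finished.
    while( !tracy::GetProfiler().HasShutdownFinished() )
    {
        std::this_thread::sleep_for( std::chrono::milliseconds( 10 ) );
    }
}
```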
@6yry6e Thanks for the insight, I missed that detail.
It looks like we should consider revisiting/refactoring the shutdown protocol as a whole in the future. @wolfpld I'll add some comments to the code and then I think we're good to merge this PR.