trafficserver icon indicating copy to clipboard operation
trafficserver copied to clipboard

Replace ink_atomic with std::atomic

Open masaori335 opened this issue 2 years ago • 3 comments

Last few days, I've been running ThreadSanitizer. Some reports has the same pattern of data race, atomic write v.s. normal read.

The ink_atomic operations are used for writing variables but not for reading variables. It looks like this happens to almost all of the variables that are touched by ink_atomic operations. We need to change these variables with std::atomic to make reading safe.

Also, replacing with std::atomic provides an advantage. We can control memory order. This is not available with ink_atomic because it's using legacy gcc buit-in __sync functions, I think.

  • e.g.
WARNING: ThreadSanitizer: data race (pid=41543)
  Read of size 8 at 0x7b3000000390 by thread T21 (mutexes: write M0):
    #0 LogObjectManager::check_buffer_expiration(long) LogObject.cc:1146 (traffic_server:x86_64+0x28eeaf) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #1 Log::periodic_tasks(long) Log.cc:233 (traffic_server:x86_64+0x24a7e4) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #2 Log::flush_thread_main(void*) Log.cc:1390 (traffic_server:x86_64+0x255273) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 LoggingFlushContinuation::mainEvent(int, void*) Log.cc:282 (traffic_server:x86_64+0x255cf3) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #4 EThread::execute() UnixEThread.cc:333 (traffic_server:x86_64+0x520124) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #5 spawn_thread_internal(void*) Thread.cc:79 (traffic_server:x86_64+0x51d488) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)

  Previous atomic write of size 8 at 0x7b3000000390 by thread T11 (mutexes: write M1, write M2):
    #0 LogObject::_checkout_write(unsigned long*, unsigned long) LogObject.cc:417 (traffic_server:x86_64+0x28a54d) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #1 LogObject::log(LogAccess*, std::__1::basic_string_view<char, std::__1::char_traits<char> >) LogObject.cc:648 (traffic_server:x86_64+0x28afd4) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #2 LogObjectManager::log(LogAccess*) LogObject.cc:1350 (traffic_server:x86_64+0x292fee) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 Log::access(LogAccess*) Log.cc:1178 (traffic_server:x86_64+0x253683) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #4 HttpSM::kill_this() HttpSM.cc:7185 (traffic_server:x86_64+0xdc1b4) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #5 HttpSM::main_handler(int, void*) HttpSM.cc:2720 (traffic_server:x86_64+0xd712e) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #6 HttpTunnel::main_handler(int, void*) HttpTunnel.cc:1684 (traffic_server:x86_64+0x1a1437) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #7 CacheVC::calluser(int) P_CacheInternal.h:636 (traffic_server:x86_64+0x3b348a) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #8 CacheVC::openWriteMain(int, Event*) CacheWrite.cc:1441 (traffic_server:x86_64+0x3b2dc2) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #9 EThread::process_event(Event*, int) UnixEThread.cc:153 (traffic_server:x86_64+0x51e8fa) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #10 EThread::process_queue(Queue<Event, Event::Link_link>*, int*, int*) UnixEThread.cc:188 (traffic_server:x86_64+0x51f01f) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #11 EThread::execute_regular() UnixEThread.cc:244 (traffic_server:x86_64+0x51f831) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #12 EThread::execute() UnixEThread.cc:349 (traffic_server:x86_64+0x5202e9) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #13 spawn_thread_internal(void*) Thread.cc:79 (traffic_server:x86_64+0x51d488) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)

  Location is heap block of size 192 at 0x7b3000000300 allocated by main thread:
    #0 operator new(unsigned long) <null>:28420041 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x652ad) (BuildId: 065b6c81cd87343dafd65694f305ecb72400000010000000000a0a0000030c00)
    #1 YamlLogConfig::decodeLogObject(YAML::Node const&) YamlLogConfig.cc:225 (traffic_server:x86_64+0x29ff09) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #2 YamlLogConfig::loadLogConfig(char const*) YamlLogConfig.cc:100 (traffic_server:x86_64+0x29d4eb) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 YamlLogConfig::parse(char const*) YamlLogConfig.cc:39 (traffic_server:x86_64+0x29b8f0) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #4 LogConfig::evaluate_config() LogConfig.cc:856 (traffic_server:x86_64+0x26a72a) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #5 LogConfig::setup_log_objects() LogConfig.cc:427 (traffic_server:x86_64+0x269e7d) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #6 LogConfig::init(LogConfig*) LogConfig.cc:327 (traffic_server:x86_64+0x26711d) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #7 Log::init(int) Log.cc:1066 (traffic_server:x86_64+0x2525f3) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #8 main traffic_server.cc:2097 (traffic_server:x86_64+0x6d066) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
WARNING: ThreadSanitizer: data race (pid=41517)
  Atomic write of size 8 at 0x00000a05bb90 by main thread (mutexes: write M0, write M1, write M2, write M3, write M4, write M5, write M6, write M7, write M8, write M9, write M10, write M11, write M12, write M13):
    #0 ink_atomiclist_push ink_queue.cc:538 (libtscore.10.dylib:x86_64+0x4522d) (BuildId: ddb6086f2f64376aae6ba04a59161d1132000000200000000100000000000c00)
    #1 ProtectedQueue::enqueue(Event*) ProtectedQueue.cc:52 (traffic_server:x86_64+0x51c12d) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #2 EventProcessor::schedule(Event*, int) P_UnixEventProcessor.h:123 (traffic_server:x86_64+0x322d9d) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 RecProcessStart() RecProcess.cc:277 (traffic_server:x86_64+0x517e26) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #4 main traffic_server.cc:2072 (traffic_server:x86_64+0x6cef8) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)

  Previous read of size 8 at 0x00000a05bb90 by thread T18 (mutexes: write M14):
    #0 ProtectedQueue::wait(long long) ProtectedQueue.cc:94 (traffic_server:x86_64+0x51c435) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #1 EThread::DefaultTailHandler::waitForActivity(long long) I_EThread.h:369 (traffic_server:x86_64+0x521182) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #2 EThread::execute_regular() UnixEThread.cc:288 (traffic_server:x86_64+0x51fe17) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 EThread::execute() UnixEThread.cc:349 (traffic_server:x86_64+0x5202e9) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #4 spawn_thread_internal(void*) Thread.cc:79 (traffic_server:x86_64+0x51d488) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)

  Location is heap block of size 1101120 at 0x000009f5b000 allocated by main thread:
    #0 operator new(unsigned long) <null>:28420041 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x652ad) (BuildId: 065b6c81cd87343dafd65694f305ecb72400000010000000000a0a0000030c00)
    #1 EventProcessor::spawn_event_threads(int, int, unsigned long) UnixEventProcessor.cc:421 (traffic_server:x86_64+0x52494b) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #2 TasksProcessor::start(int, unsigned long) Tasks.cc:42 (traffic_server:x86_64+0x51cac0) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 main traffic_server.cc:2070 (traffic_server:x86_64+0x6cef3) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
WARNING: ThreadSanitizer: data race (pid=42272)
  Atomic write of size 8 at 0x000001768418 by thread T30:
    #0 RecIncrGlobalRawStatSum(RecRawStatBlock*, int, long long) RecRawStats.cc:439 (traffic_server:x86_64+0x514826) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #1 NetAccept::do_blocking_accept(EThread*) UnixNetAccept.cc:335 (traffic_server:x86_64+0x4d93d0) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #2 NetAccept::acceptLoopEvent(int, Event*) UnixNetAccept.cc:551 (traffic_server:x86_64+0x4d833a) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 EThread::execute() UnixEThread.cc:333 (traffic_server:x86_64+0x520124) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #4 spawn_thread_internal(void*) Thread.cc:79 (traffic_server:x86_64+0x51d488) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)

  Previous read of size 8 at 0x000001768418 by thread T19 (mutexes: write M0, write M1, write M2):
    #0 RecRawStatSyncSum(char const*, RecDataT, RecData*, RecRawStatBlock*, int) RecRawStats.cc:301 (traffic_server:x86_64+0x512bf8) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #1 RecExecRawStatSyncCbs() RecRawStats.cc:563 (traffic_server:x86_64+0x514f57) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #2 raw_stat_sync_cont::exec_callbacks(int, Event*) RecProcess.cc:146 (traffic_server:x86_64+0x518a57) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 EThread::process_event(Event*, int) UnixEThread.cc:153 (traffic_server:x86_64+0x51e8fa) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #4 EThread::execute_regular() UnixEThread.cc:258 (traffic_server:x86_64+0x51f8af) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #5 EThread::execute() UnixEThread.cc:349 (traffic_server:x86_64+0x5202e9) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #6 spawn_thread_internal(void*) Thread.cc:79 (traffic_server:x86_64+0x51d488) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)

  Location is heap block of size 520000 at 0x000001736000 allocated by main thread:
    #0 malloc <null>:28420041 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x3732c) (BuildId: 065b6c81cd87343dafd65694f305ecb72400000010000000000a0a0000030c00)
    #1 ats_malloc ink_memory.cc:64 (libtscore.10.dylib:x86_64+0x4381f) (BuildId: ddb6086f2f64376aae6ba04a59161d1132000000200000000100000000000c00)
    #2 RecCoreInit(RecModeT, Diags*) RecCore.cc:206 (traffic_server:x86_64+0x503e76) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #3 RecProcessInit(RecModeT, Diags*) RecProcess.cc:214 (traffic_server:x86_64+0x51742d) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)
    #4 main traffic_server.cc:1783 (traffic_server:x86_64+0x6afa3) (BuildId: 279497f9a0ac37109783e3c337bc1aa632000000200000000100000000000c00)

Actually, Phil tried to replace ink_atomic with std::atomic by PR #2633 5 years ago.

masaori335 avatar Nov 02 '22 05:11 masaori335

We should use std::atomic generally, because it's standard. The current atomic support dates from before any of this was standard.

SolidWallOfCode avatar Nov 08 '22 00:11 SolidWallOfCode

Just to record this: in discussing this in our ASF PR/issue meeting, we suggest that this be addressed piecemeal as patches are done involving the old ink_atomic code.

bneradt avatar Nov 08 '22 00:11 bneradt

This issue has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.

github-actions[bot] avatar Nov 08 '23 01:11 github-actions[bot]