dynamorio [perf] remove global lock from signal pending path

The thread_initexit_lock is a contention hog for signal handling:

Thread 685 (Thread 0x7f0ca95b4700 (LWP 95391)):
#0  0x00007f0cabc7a47e in syscall_ready ()
#1  0x00007f0cc4842280 in thread_in_DR_exclusion ()
#2  0x00007f0cabd17584 in ksynch_wait (futex=futex@entry=0x6, mustbe=mustbe@entry=1) at dynamorio/trunk/core/unix/ksynch_linux.c:120
#3  0x00007f0cabd24fed in mutex_wait_contended_lock (lock=0x6) at dynamorio/trunk/core/unix/os.c:8882
#4  0x00007f0cabd2f5e6 in mutex_lock (lock=<optimized out>) at dynamorio/trunk/core/utils.c:888
#5  0x00007f0cabd29415 in translate_sigcontext (dcontext=dcontext@entry=0x7f0c44962ac0, uc=uc@entry=0x7f0c44ca6ac0, avoid_failure=<optimized out>, f=f@entry=0x7f0c44c8d7e8)
    at dynamorio/trunk/core/unix/signal.c:2179
#6  0x00007f0cabd2a1db in record_pending_signal (dcontext=dcontext@entry=0x7f0c44962ac0, sig=sig@entry=27, ucxt=ucxt@entry=0x7f0c44ca6ac0, frame=frame@entry=0x7f0c44ca6ab8, 
    forged=forged@entry=0 '\000', access_address=0x0) at dynamorio/trunk/core/unix/signal.c:3650
#7  0x00007f0cabd2ba54 in master_signal_handler_C (sig=<optimized out>, siginfo=<optimized out>, ucxt=0x7f0c44ca6ac0, xsp=0x7f0c44ca6ab8 "\024\240ǫ\f\177")
    at dynamorio/trunk/core/unix/signal.c:4526

Dec 14 '16 19:12 derekbruening

One option is to try to use enter_couldbelinking instead of the thread_initexit_lock. However, it can end up flushing the interrupted fragment from underneath us. I think I can get that to work w/o a client, but it messes up the pre-xl8 cxt for clients.

Another question is why this alarm signal here was being delivered now, as this was with -code_api where there are no inlined syscalls.

Oct 18 '17 02:10 derekbruening

Another source of lock contention in delivering signals (lumping into this same issue) is all_memory_areas->lock when copying the frame to the stack:

#2  0x000055ddcc109af0 in memquery_iterator_next ()  <-- memory_info_buf_lock
#3  0x000055ddcc109d10 in memquery_from_os ()
#4  0x000055ddcc114dc7 in get_memory_info_from_os ()
#5  0x000055ddcc1090aa in memcache_query_memory ()   <-- all_memory_areas->lock
#6  0x000055ddcc111b2f in get_memory_info ()
#7  0x000055ddcc11f53f in copy_frame_to_stack ()
#8  0x000055ddcc11dd1a in receive_pending_signal ()
#9  0x000055ddcc0cbd52 in dispatch ()

Mar 01 '18 20:03 derekbruening

Did PR #2872 fix this issue? From a quick reading it seems there's still more work left for reducing lock contention.

Feb 08 '22 16:02 abhinav92003

I have a local branch for this which I renamed long ago to i2114-signal-lock-not-sure-worthwhile. My local commit message says:

    i#2114: reduce signal pending lock contention
    
    NOCHECKIN: unclear we need it; has risks; maybe use global lock for cases
    where can't fail, local o/w?
    
    NOCHECKIN: client pre-xl8 is main issue
    
    For pending signals arriving at syscalls, we translate immediately, using
    the global thread_initexit_lock.  This can be a point of contention for
    timer signals.  We swap to using enter_couldbelinking instead.
    
    Fixes #2114

Feb 08 '22 16:02 derekbruening

dynamorio dynamorio copied to clipboard

[perf] remove global lock from signal pending path

dynamorio
dynamorio copied to clipboard