dynamorio
handle sigtimedwait and sigwaitinfo
From [email protected] on June 13, 2013 11:10:46
xref issue #1138 two more signal syscalls to handle
Original issue: http://code.google.com/p/dynamorio/issues/detail?id=1188
Xref #92
Xref other signal syscalls we're missing support for: #1139, #2759
This came onto my radar while adding support for some new Linux system calls for #5131, of which rt_sigtimedwait_time64 is one.
sigwaitinfo and sigtimedwait wait for one of the given signals to become pending (not necessarily blocked), and when one does, they return it to the caller, along with a siginfo_t.
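For reference, here is a minimal sketch of how an app typically uses these calls. It is plain POSIX usage, not DR code, and the helper name `wait_for_pending_sigusr1` is made up for illustration. It assumes a single-threaded caller and leaves SIGUSR1 blocked on return.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: blocks SIGUSR1, makes it pending, then collects it
 * synchronously. Returns the signal number reported by sigtimedwait(), or -1
 * on error or timeout. SIGUSR1 remains blocked when this returns. */
static int
wait_for_pending_sigusr1(siginfo_t *info)
{
    sigset_t set;
    struct timespec timeout = { 1, 0 }; /* Give up after 1 second. */

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* Block the signal first so that it stays pending instead of being
     * delivered to a handler or triggering the default action. */
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;
    /* raise() sends SIGUSR1 to the calling thread; because it is blocked,
     * it just becomes pending. */
    if (raise(SIGUSR1) != 0)
        return -1;
    /* The pending signal is returned here, with its details in *info. */
    return sigtimedwait(&set, info, &timeout);
}
```

The return value is the signal number and `info->si_signo` matches it, which is the data DR would have to fill in itself if it intercepted the signal first.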
This can be potentially problematic for DR; if we don't handle this syscall, a signal intended for DR may be returned to the app in this manner. In this case, DR actually needs to invoke its own signal handler to process the signal, and then enter the syscall again so that we can wait for a signal that was meant for the app.
Handling of these syscalls is also complicated by the fact that they remove "the signal from the set of pending signals" (https://man7.org/linux/man-pages/man2/sigwaitinfo.2.html), which means that the signal handler is not called for the returned signal. So, if we already have a pending signal upon entering the syscall, we need to remove the pending signal from DR's records without calling the signal handler, and return that to the app.
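A small native test along these lines shows the behavior DR would have to mimic: the waited-for signal is consumed without its handler ever running. This is only a sketch with made-up names (`count_handler`, `wait_consumes_without_handler`), and it assumes a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t handler_calls = 0;

/* Counts invocations so we can tell whether the handler ever ran. */
static void
count_handler(int sig)
{
    (void)sig;
    handler_calls++;
}

/* Returns 1 if sigtimedwait() consumed SIGUSR1 without running its handler
 * and left it no longer pending, 0 if not, and -1 on a setup failure. */
static int
wait_consumes_without_handler(void)
{
    struct sigaction act;
    sigset_t set, pending;
    siginfo_t info;
    struct timespec timeout = { 1, 0 };

    memset(&act, 0, sizeof(act));
    act.sa_handler = count_handler;
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGUSR1, &act, NULL) != 0)
        return -1;

    /* Block SIGUSR1 and make it pending. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0 || raise(SIGUSR1) != 0)
        return -1;

    /* The wait returns the signal and removes it from the pending set. */
    if (sigtimedwait(&set, &info, &timeout) != SIGUSR1)
        return -1;

    if (sigpending(&pending) != 0)
        return -1;
    int still_pending = sigismember(&pending, SIGUSR1);

    /* Unblocking would deliver the signal if it were still pending. */
    if (sigprocmask(SIG_UNBLOCK, &set, NULL) != 0)
        return -1;
    return still_pending == 0 && handler_calls == 0;
}
```

Something like this could serve as the basis of a regression test once DR handles the syscalls.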
I'm surprised that not handling these hasn't caused problems yet. They seem important. @derekbruening are these syscalls not common?
> @derekbruening are these syscalls not common?
They don't seem to have caused problems. They do print a (debug-build) warning. The only place I know we've seen them is #2465.
They are very common in modern mobile applications and games, TikTok or something like that (#2465 fits into the same category).
For a signal number for which DR does not need to take any action, these syscalls should not cause any problems. I believe that's why we haven't seen any issues from them: the app would have to wait on SIGUSR2 (with older DR; now it's SIGSTKFLT) to cause a synchall problem, or on one of the 3 alarm signals if DR or a client has an active itimer, or on SIGSEGV or SIGBUS, but those would have to be sent with SIGKILL to the waiting thread and in such a case DR wouldn't need to intercede. So the chance of a problem is low. However, we're seeing this syscall in use in more apps we're looking at, such as mysql, even if they don't currently wait on problematic signals, so we should try to add handling to cover the corner cases just in case.
Syscall does not change the mask: caller would normally do that ahead of time (and in all other threads too).
If DR doesn't care about the signal: do nothing.
If DR does care:
- Turn into nop for that signal by removing from set (confirm syscall still waits if set is empty: it does on glaptop)
- If that signal arrives in that thread in DR's handler, it will have
interrupted the wait syscall for us.
- We can't inline the syscall: it must end a block, so we don't go run more code for an asynch signal: we need to pretend it wasn't interrupted.
- If we're going to deliver the signal, instead of constructing a signal frame and invoking the app's handler, we need to set the return value and fill in the siginfo.
- If we're not going to deliver the signal, the syscall should already have an EINTR return value: so we don't have to do anything as the app should loop on EINTR.
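
To make a few of the steps above concrete, here is a rough standalone sketch in C. It is not DR's implementation: `strip_reserved_signals`, `wait_retrying_on_eintr`, and `empty_set_wait_times_out` are hypothetical names, and the "reserved" set simply stands in for whichever signals DR cares about. It shows removing those signals from the app's wait set, the app-side EINTR retry loop the scheme relies on, and a check that an empty set still waits (here until the timeout expires).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Hypothetical: copy app_set into out, minus every signal in reserved. */
static void
strip_reserved_signals(const sigset_t *app_set, const sigset_t *reserved,
                       sigset_t *out)
{
    *out = *app_set;
    for (int sig = 1; sig <= SIGRTMAX; sig++) {
        if (sigismember(reserved, sig) == 1)
            sigdelset(out, sig);
    }
}

/* What an app is expected to do: retry when the wait is interrupted.
 * Note that each retry restarts with the full timeout. */
static int
wait_retrying_on_eintr(const sigset_t *set, siginfo_t *info,
                       const struct timespec *timeout)
{
    int sig;
    do {
        sig = sigtimedwait(set, info, timeout);
    } while (sig == -1 && errno == EINTR);
    return sig;
}

/* Returns 1 if waiting on an empty set blocks until the timeout and then
 * fails with EAGAIN, rather than returning immediately with an error. */
static int
empty_set_wait_times_out(void)
{
    sigset_t empty;
    siginfo_t info;
    struct timespec timeout = { 0, 10 * 1000 * 1000 }; /* 10 ms */

    sigemptyset(&empty);
    return wait_retrying_on_eintr(&empty, &info, &timeout) == -1 &&
        errno == EAGAIN;
}
```

The retry loop is the reason the "not going to deliver" case needs no extra work: the app sees EINTR and simply waits again.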
> So, if we already have a pending signal upon entering the syscall, we need to remove the pending signal from DR's records without calling the signal handler, and return that to the app.
I think this is saying the same thing as the delivery scheme in the prior comment.
I don't think DR needs to do anything special anywhere else: a pending signal that becomes deliverable will only happen on an unmasking syscall or an asynch signal where we waited to exit the cache.