dynamorio
add a client API for internally-triggered detach from dstack
Today our detach has a limitation: it can only be triggered from a non-DR stack, either an app stack via the start/stop API or an injected thread stack via a nudge on Windows.
This issue covers a feature where a client can call a new API routine dr_detach_process() from the dstack. We could do a raw mmap to make a stack, copy the current thread's app state there, switch to that stack, call detach_on_permanent_stack(), and then swap to the stored state. We'd just leak the mmap, I guess, and there would be no support for unloading the DR lib.
Xref #95
One way to do this today is for a client to use drwrap_replace_native() on some app function and invoke dr_app_stop_and_cleanup() from the replacement, which will be on the app stack.
There was a comment added here asking how to implement this feature (I received an email notification), to which I was going to reply, but it is no longer visible here, presumably because it was deleted?
Correct. I re-read your answers here and your answers to some posts on the Google group and figured out a solution. I'm leaving a simplified version of my implementation here for you to comment on and for future interested readers:
#define EXIT_FUNCTION "MyClass::MyExitFunc"

/* Replacement for EXIT_FUNCTION; runs natively on the app stack. */
static void
detach_client()
{
    dr_app_stop_and_cleanup();
}

static void
event_module_load(void* drcontext, const module_data_t* data, bool loaded)
{
    // ...
    app_pc to_replace = NULL;
    /* Try exported functions first; this returns an absolute address. */
    to_replace = (app_pc)dr_get_proc_address(data->handle, EXIT_FUNCTION);
    if (!to_replace) {
        /* Then query all symbols inside the module. drsym returns an offset
         * from the module base, so add data->start only on this path. */
        size_t offs = 0;
        drsym_init(0);
        drsym_error_t error = drsym_lookup_symbol(data->full_path, EXIT_FUNCTION, &offs, 0);
        drsym_exit();
        if (error == DRSYM_SUCCESS)
            to_replace = data->start + offs;
    }
    if (to_replace) {
        drwrap_replace_native(
            to_replace,
            (app_pc)detach_client,
            true,  /* at_entry */
            0,     /* stack_adjust */
            NULL,  /* user_data */
            true   /* override */
        );
    }
    // ...
}
Thank you for the post. The only thing I see is that the app function being replaced is skipped and not executed (and unfortunately DR does not provide an API to make it easy to call an app function to remedy the situation; xref #758).
Late follow-up on this: I initially thought that this worked, since it triggered the exit handler of my DR client and the behavior was therefore as expected from the client's perspective.
However, calling dr_app_stop_and_cleanup() like I've done it here will terminate the app process with a non-zero exit code. Here's an excerpt from the log where I traced a C++ unit test binary (drrun.exe -debug -loglevel 2 -c detach-client.dll -- C:\...\unittests.exe, happens without -debug as well):
DR client received detach, calling dr_app_stop_and_cleanup()
<Detaching from application C:\...\unittests.exe (11616)>
<Detaching from process, entering final cleanup>
DR client received exit
unknown file: error: SEH exception with code 0xc0000005 thrown in auxiliary test code (environments or event listeners).
If I understand correctly, the access violation is triggered after DR has lost control (I did not call drwrap_replace_native_fini(dr_get_current_drcontext()) in detach_client()). When running the app without DR everything works fine.
Do you have any ideas what could be the issue here or any suggestion on how I could further debug the issue?
Examine the 0xc0000005 in the debugger to get the basic info on where the crash is happening.
If we create a separate-stack mechanism here, we could use the same thing on a nudge to add nudge-based detach on UNIX (presumably under #95).