
Repetitive "failed to lookup page info" messages (ipt.cpp / memdump.cpp)

Open h4b4n3r0 opened this issue 4 years ago • 5 comments

I reuse some code of the ipt / memdump plugin (@icedevml) and encountered a strange problem when monitoring Firefox. To ensure this is not a problem in my code, I also tested this with the current state of the ipt plugin (22fe0b). I only looked at the methods mm_access_fault_hook_cb and mm_access_fault_return_hook_cb, which are responsible for the described behaviour:

The drakvuf version I am using is 61c72ce from 27/10/2020. I am going to port my plugin forward soon and try it with the latest version as well.

What happens: When I start firefox and use the -v drakvuf flag for debug output, at some point I see many consecutive lines of https://github.com/tklengyel/drakvuf/blob/af5f81ad434890f89024a23e5e510c8210a95ca8/src/plugins/ipt/ipt.cpp#L298

I had a closer look and noticed that each block of these messages contains one more line than the previous one, so at some point I was at 2000 consecutive messages. I was not able to start Firefox (I am also on my old, slow PC) since there were too many calls of mm_access_fault_return_hook_cb.

What should happen: This is just a guess, but in my opinion this message should be just printed once?

My current explanation: I had a closer look at mm_access_fault_return_hook_cb: https://github.com/tklengyel/drakvuf/blob/af5f81ad434890f89024a23e5e510c8210a95ca8/src/plugins/ipt/ipt.cpp#L285

When monitoring Firefox, vmi_pagetable_lookup_extended sometimes fails for whatever reason (if you have an idea why this lookup could fail, I would be interested in an answer). When that happens, the debug message mentioned above is printed and the callback returns. But the trap that led to the execution of mm_access_fault_return_hook_cb is not destroyed, as it would be if vmi_pagetable_lookup_extended had succeeded.

That means the trap is still there and somehow the loading of firefox leads to a repetitive call of mm_access_fault_return_hook_cb.

My current fix: In the case of a lookup fail, the trap should be deleted as well. I added plugin->destroy_trap(info->trap); before the return happens.

PRINT_DEBUG("[MEMDUMP] failed to lookup page info\n");
plugin->destroy_trap(info->trap);
return VMI_EVENT_RESPONSE_NONE;

With this modification I just saw the "failed to lookup page info" once at a time, and I was able to start firefox.
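
To illustrate why each block of messages can grow by one line, here is a toy model in plain C++. It does not use DRAKVUF or LibVMI; ToyPlugin, ReturnTrap and callbacks_in_last_round are made-up names. It assumes that every fault registers one return trap and that all live return traps fire on each return, which is a reading of the behaviour described above and not something confirmed in the DRAKVUF source.

```cpp
#include <cassert>
#include <vector>

// Toy model only: none of these names exist in DRAKVUF.
struct ReturnTrap
{
    bool alive = true;
};

struct ToyPlugin
{
    std::vector<ReturnTrap> traps;
    bool destroy_on_failure; // true models the proposed fix
    int failed_messages = 0; // counts "failed to lookup page info" lines

    explicit ToyPlugin(bool fix) : destroy_on_failure(fix) {}

    // Models mm_access_fault_hook_cb: each fault registers one return trap.
    void on_fault()
    {
        traps.push_back(ReturnTrap{});
    }

    // Models the return breakpoint being hit: every live trap runs its callback.
    // lookup_ok stands in for the result of the page-table lookup.
    int on_return(bool lookup_ok)
    {
        int fired = 0;
        for (auto& t : traps)
        {
            if (!t.alive)
                continue;
            ++fired;
            if (!lookup_ok)
            {
                ++failed_messages;
                if (destroy_on_failure)
                    t.alive = false; // the fix: destroy on the failure path too
                continue;
            }
            t.alive = false; // the success path always destroys the trap
        }
        return fired;
    }
};

// Runs `faults` fault/return rounds with a lookup that always fails and
// returns how many callbacks fired in the last round.
int callbacks_in_last_round(bool fix, int faults)
{
    ToyPlugin p(fix);
    int last = 0;
    for (int i = 0; i < faults; ++i)
    {
        p.on_fault();
        last = p.on_return(false);
    }
    return last;
}
```

In this model, callbacks_in_last_round(false, 5) gives 5 (one more callback per round), while callbacks_in_last_round(true, 5) gives 1.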

h4b4n3r0 avatar Jan 22 '21 19:01 h4b4n3r0

cc @icedevml as he's the wizard behind IPT, but it does seem a bit similar to libusermode, which I'm quite familiar with.

DRAKVUF can't access unmapped memory, hence this is the way current libusermode works:

  1. Parse the PE header and find the image export directory (create_dll_meta). Not readable? Page fault the export directory.
  2. Find the RVA of the export (internal_perform_hooking).
  3. The first instruction of the exported function is not accessible? Page fault.


Since the problem here is that vmi_pagetable_lookup_extended fails, maybe we should page fault the memory (example)?
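
The "lookup fails, so fault the page in and retry" idea can be sketched with a toy guest, assuming nothing about the real DRAKVUF or libusermode API. ToyGuest, inject_page_fault and lookup_with_fault_in are hypothetical names, and the toy maps the page synchronously, whereas a real injected fault only takes effect once the guest has run its fault handler.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy stand-in for guest page tables: virtual page number -> physical frame number.
struct ToyGuest
{
    std::unordered_map<uint64_t, uint64_t> page_table;
    uint64_t next_frame = 0x1000;

    // Stand-in for a page-table lookup: fails when the page is not mapped.
    bool lookup(uint64_t va, uint64_t& pa) const
    {
        auto it = page_table.find(va >> 12);
        if (it == page_table.end())
            return false;
        pa = (it->second << 12) | (va & 0xfff);
        return true;
    }

    // Stand-in for injecting a page fault so that the guest maps the page.
    void inject_page_fault(uint64_t va)
    {
        page_table.emplace(va >> 12, next_frame++);
    }
};

// Look up va; if that fails, fault the page in once and retry.
bool lookup_with_fault_in(ToyGuest& guest, uint64_t va, uint64_t& pa, int& faults_injected)
{
    if (guest.lookup(va, pa))
        return true;
    guest.inject_page_fault(va);
    ++faults_injected;
    return guest.lookup(va, pa);
}
```

A second lookup of the same page succeeds immediately in this model, so at most one fault is injected per page.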

BonusPlay avatar Jan 23 '21 00:01 BonusPlay

@Id3aFly do you know what memory region triggers those page faults? Is it code, heap, or maybe stack?

If this happens during application startup, this could be a result of stack probing (although I'm kind of guessing). What probably happens is that the program hits a guard page and Windows moves the guard page down, at the same time "lazily" mapping a new page. Thus, when kernel code is returning from the page fault handler, VMI cannot see the page because it's not there yet. In this case, as @BonusPlay said, triggering a fault manually should make this page appear. :thinking:

chivay avatar Jan 23 '21 10:01 chivay

@chivay I have no idea what triggers the page fault. How could I find this out? Or what information would you need to analyse it more?

h4b4n3r0 avatar Jan 23 '21 15:01 h4b4n3r0

Hello Klaus,

thanks for the report. Could you share a branch with your current code (but without the workaround you've mentioned)? I will run Firefox under it and try to debug why this happens.

icedevml avatar Jan 23 '21 21:01 icedevml

I am currently discussing with my supervisors whether I can release my code (via pull request or some other way), since it is part of my master's thesis.

But what I described also happens with the current code of the ipt branch. The following code is just copied from the ipt branch.

static event_response_t mm_access_fault_return_hook_cb(drakvuf_t drakvuf, drakvuf_trap_info_t* info)
{
    auto plugin = get_trap_plugin<ipt>(info);
    auto params = get_trap_params<access_fault_result_t>(info);

    if (!params->verify_result_call_params(drakvuf, info))
        return VMI_EVENT_RESPONSE_NONE;

    page_info_t p_info = {};
    {
        auto vmi = vmi_lock_guard(drakvuf);
        if (VMI_SUCCESS != vmi_pagetable_lookup_extended(vmi, info->regs->cr3, params->fault_va, &p_info))
        {
            PRINT_DEBUG("[MEMDUMP] failed to lookup page info\n");
            return VMI_EVENT_RESPONSE_NONE;
        }
    }

    jsonfmt::print("pagefault", drakvuf, info,
                   keyval("CR3", fmt::Xval(info->regs->cr3)),
                   keyval("VA", fmt::Xval(params->fault_va)),
                   keyval("PA", fmt::Xval(p_info.paddr))
                  );

    struct exec_fault_data* ef_data = (struct exec_fault_data*)malloc(sizeof(struct exec_fault_data));
    ef_data->plugin = plugin;
    ef_data->rip = ((params->fault_va >> 12) << 12);
    drakvuf_trap_t* exec_trap = (drakvuf_trap_t*)malloc(sizeof(drakvuf_trap_t));

    exec_trap->type = MEMACCESS;
    exec_trap->memaccess.gfn = p_info.paddr >> 12;
    exec_trap->memaccess.type = PRE;
    exec_trap->memaccess.access = VMI_MEMACCESS_X;
    exec_trap->data = ef_data; // FIXME memleak
    exec_trap->cb = execute_faulted_cb; //THIS METHOD can just be "empty"
    exec_trap->name = nullptr;

    drakvuf_add_trap(drakvuf, exec_trap);
    PRINT_DEBUG("[IPT] Trap X on GFN 0x%lx\n", p_info.paddr >> 12);

    plugin->destroy_trap(info->trap);

    return VMI_EVENT_RESPONSE_NONE;
}

static event_response_t mm_access_fault_hook_cb(drakvuf_t drakvuf, drakvuf_trap_info_t* info)
{
    addr_t fault_va = drakvuf_get_function_argument(drakvuf, info, 2);
    PRINT_DEBUG("[IPT] MmAccessFault(%d, %lx)\n", info->proc_data.pid, fault_va);

    if (fault_va & (1ULL << 63))
    {
        PRINT_DEBUG("[IPT] Don't trap in kernel %d %lx\n", info->proc_data.pid, fault_va);
        return VMI_EVENT_RESPONSE_NONE;
    }

    auto plugin = get_trap_plugin<ipt>(info);

    auto trap = plugin->register_trap<access_fault_result_t>(
                    info,
                    mm_access_fault_return_hook_cb,
                    breakpoint_by_pid_searcher());
    if (!trap)
        return VMI_EVENT_RESPONSE_NONE;

    auto params = get_trap_params<access_fault_result_t>(trap);
    params->set_result_call_params(info);
    params->fault_va = fault_va;

    return VMI_EVENT_RESPONSE_NONE;
}

And in the plugin constructor you have to register the first trap with the callback:

    breakpoint_in_system_process_searcher bp;

    if (!register_trap(nullptr, mm_access_fault_hook_cb, bp.for_syscall_name("MmAccessFault")))
    {
        throw -1;
    }
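
For reference, the address arithmetic used in the two callbacks can be isolated into small helpers. This is only a sketch: is_kernel_va, page_base and gfn_of are illustrative names that do not exist in DRAKVUF, and 4 KiB pages are assumed, as the shifts by 12 in the callbacks imply.

```cpp
#include <cassert>
#include <cstdint>

// 4 KiB pages, matching the ">> 12" used in the callbacks.
constexpr unsigned kPageShift = 12;

// Same test as "fault_va & (1ULL << 63)": the top bit is set for kernel addresses.
constexpr bool is_kernel_va(uint64_t va)
{
    return (va & (1ULL << 63)) != 0;
}

// Same as "(fault_va >> 12) << 12": clears the offset within the page.
constexpr uint64_t page_base(uint64_t va)
{
    return (va >> kPageShift) << kPageShift;
}

// Same as "p_info.paddr >> 12": the guest frame number the X trap is placed on.
constexpr uint64_t gfn_of(uint64_t paddr)
{
    return paddr >> kPageShift;
}
```

For example, page_base(0x7ff612345678) is 0x7ff612345000 and gfn_of(0x1a2b3c4d) is 0x1a2b3.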

h4b4n3r0 avatar Jan 26 '21 13:01 h4b4n3r0