pwndbg icon indicating copy to clipboard operation
pwndbg copied to clipboard

[bug] [question] instruction enhancement caching weirdness

Open k4lizen opened this issue 10 months ago • 2 comments

@OBarronCS

I'm confused about this: https://github.com/pwndbg/pwndbg/blob/e83dee14c24f5f4a7bd66d2e2d214f04debb4be8/pwndbg/aglib/disasm/disassembly.py#L55-L57

Does it makes sense? What if someone is in a loop and puts a conditional breakpoint a few instructions forward to let the loop iterate a few times wouldn't the annotation be incorrect?

However, I tried to create a reproducer ~~and didn't succeed :(~~ See comment.

source:

int main (){
  int awa = 0;
  awa += 20;
  awa += 20;
  awa += 20;
  awa += 20;
  awa += 20;
  awa += 20;
  awa += 20;
  awa += 20;
  awa += 20;
  awa += 20;

  for (int i = 0; i < 100; ++i) {
    int x = i;
    int y = x + 4;
    y += 0x44;
    x += 0x1337;
  }

  return 0;
}

Compiled with gcc -g main.c -o main

Expected:

pwndbg> start
pwndbg> b *0x0000555555555158 if i == 50
pwndbg> continue

 ► 0x555555555158 <main+63>    mov    dword ptr [rbp - 8], eax         [0x7fffffffe188] <= 0x32
   0x55555555515b <main+66>    mov    eax, dword ptr [rbp - 8]         EAX, [0x7fffffffe188] => 0x32
   0x55555555515e <main+69>    add    eax, 4                           EAX => 54 (0x32 + 0x4)
   0x555555555161 <main+72>    mov    dword ptr [rbp - 4], eax         [0x7fffffffe18c] <= 0x36
   0x555555555164 <main+75>    add    dword ptr [rbp - 4], 0x44        [0x7fffffffe18c] <= 122 (0x36 + 0x44)
   0x555555555168 <main+79>    add    dword ptr [rbp - 8], 0x1337      [0x7fffffffe188] <= 0x1369 (0x32 + 0x1337)
   0x55555555516f <main+86>    add    dword ptr [rbp - 0x10], 1        [0x7fffffffe180] <= 51 (0x32 + 0x1)

With cond bp:

pwndbg> start
pwndbg> b *0x0000555555555155
pwndbg> continue

 ► 0x555555555155 <main+60>    mov    eax, dword ptr [rbp - 0x10]      EAX, [0x7fffffffe180] => 0
   0x555555555158 <main+63>    mov    dword ptr [rbp - 8], eax         [0x7fffffffe188] <= 0
   0x55555555515b <main+66>    mov    eax, dword ptr [rbp - 8]         EAX, [0x7fffffffe188] => 0
   0x55555555515e <main+69>    add    eax, 4                           EAX => 4 (0 + 4)
   0x555555555161 <main+72>    mov    dword ptr [rbp - 4], eax         [0x7fffffffe18c] <= 4
   0x555555555164 <main+75>    add    dword ptr [rbp - 4], 0x44        [0x7fffffffe18c] <= 72 (0x4 + 0x44)
   0x555555555168 <main+79>    add    dword ptr [rbp - 8], 0x1337      [0x7fffffffe188] <= 0x1337 (0x0 + 0x1337)
   0x55555555516f <main+86>    add    dword ptr [rbp - 0x10], 1        [0x7fffffffe180] <= 1 (0 + 1)

pwndbg> delete 2
pwndbg> b *0x0000555555555158 if i == 50
pwndbg> continue

   0x555555555155 <main+60>    mov    eax, dword ptr [rbp - 0x10]      EAX, [0x7fffffffe180] => 1
 ► 0x555555555158 <main+63>    mov    dword ptr [rbp - 8], eax         [0x7fffffffe188] <= 0x32
   0x55555555515b <main+66>    mov    eax, dword ptr [rbp - 8]         EAX, [0x7fffffffe188] => 0x32
   0x55555555515e <main+69>    add    eax, 4                           EAX => 54 (0x32 + 0x4)
   0x555555555161 <main+72>    mov    dword ptr [rbp - 4], eax         [0x7fffffffe18c] <= 0x36
   0x555555555164 <main+75>    add    dword ptr [rbp - 4], 0x44        [0x7fffffffe18c] <= 122 (0x36 + 0x44)
   0x555555555168 <main+79>    add    dword ptr [rbp - 8], 0x1337      [0x7fffffffe188] <= 0x1369 (0x32 + 0x1337)

It is correct, but I expected it to be wrong, for instance I expected

0x555555555164 <main+75>    add    dword ptr [rbp - 4], 0x44

to be wrongly annotated with [0x7fffffffe18c] <= 72 (0x4 + 0x44) because it is cached.

Doing a bit of print debugging I can see the cache is not invalidated, but get_one_instruction() is called twice on 0x555555555164 the first time it has from_cache=True and fetches the stale instruction, the second time it has from_cache=False and I guess it overwrites it with the correct values.

Why is the same address being fetched twice? Is the cache working correctly?

I might be misunderstanding something, but I was reading https://pwndbg.re/pwndbg/dev/contributing/improving-annotations/#caching-annotations and got a bit confused .

k4lizen avatar May 22 '25 14:05 k4lizen

Oh the instruction cache is only for instructions behind the PC?

Yeah it seems to be bugged:

pwndbg> start
pwndbg> b *0x0000555555555155
pwndbg> continue

 ► 0x555555555155 <main+60>    mov    eax, dword ptr [rbp - 0x10]      EAX, [0x7fffffffe180] => 0
   0x555555555158 <main+63>    mov    dword ptr [rbp - 8], eax         [0x7fffffffe188] <= 0
   0x55555555515b <main+66>    mov    eax, dword ptr [rbp - 8]         EAX, [0x7fffffffe188] => 0
   0x55555555515e <main+69>    add    eax, 4                           EAX => 4 (0 + 4)
   0x555555555161 <main+72>    mov    dword ptr [rbp - 4], eax         [0x7fffffffe18c] <= 4
   0x555555555164 <main+75>    add    dword ptr [rbp - 4], 0x44        [0x7fffffffe18c] <= 72 (0x4 + 0x44)
   0x555555555168 <main+79>    add    dword ptr [rbp - 8], 0x1337      [0x7fffffffe188] <= 0x1337 (0x0 + 0x1337)
   0x55555555516f <main+86>    add    dword ptr [rbp - 0x10], 1        [0x7fffffffe180] <= 1 (0 + 1)

pwndbg> delete 2
pwndbg> b *0x55555555516f if i == 50
pwndbg> continue

   0x55555555515b <main+66>    mov    eax, dword ptr [rbp - 8]         EAX, [0x7fffffffe148] => 0
   0x55555555515e <main+69>    add    eax, 4                           EAX => 4 (0 + 4)
   0x555555555161 <main+72>    mov    dword ptr [rbp - 4], eax         [0x7fffffffe14c] <= 4
   0x555555555164 <main+75>    add    dword ptr [rbp - 4], 0x44        [0x7fffffffe14c] <= 72 (0x4 + 0x44)
   0x555555555168 <main+79>    add    dword ptr [rbp - 8], 0x1337      [0x7fffffffe148] <= 0x1337 (0x0 + 0x1337)
 ► 0x55555555516f <main+86>    add    dword ptr [rbp - 0x10], 1        [0x7fffffffe140] <= 51 (0x32 + 0x1)
   0x555555555173 <main+90>    cmp    dword ptr [rbp - 0x10], 0x63     0x33 - 0x63     EFLAGS => 0x283 [ CF pf af zf SF IF df of ac ]
   0x555555555177 <main+94>  ✔ jle    main+60                     <main+60>
    ↓
   0x555555555155 <main+60>    mov    eax, dword ptr [rbp - 0x10]      EAX, [0x7fffffffe140] => 0x33
   0x555555555158 <main+63>    mov    dword ptr [rbp - 8], eax         [0x7fffffffe148] <= 0x33
   0x55555555515b <main+66>    mov    eax, dword ptr [rbp - 8]         EAX, [0x7fffffffe148] => 0x33

We can see the

0x555555555164 <main+75>    add    dword ptr [rbp - 4], 0x44        [0x7fffffffe14c] <= 72 (0x4 + 0x44)

line is stale.

k4lizen avatar May 22 '25 14:05 k4lizen

Does it makes sense? What if someone is in a loop and puts a conditional breakpoint a few instructions forward to let the loop iterate a few times wouldn't the annotation be incorrect?

Good catch, this behavior only works for unconditional breakpoints. If the breakpoint you are going to is within the few of the context screen, and you continue to it, the instruction in-between will indeed use the "cached" annotations. However, since we used emulation to create these in the first place, and we only step over them once to the breakpoint (they were the only instructions that were executed), then they still should be up to date, since no other mutations should have taken effect to make them stale.

Conditional breakpoints do unfortunately break this, since we don't guarantee we only executed those instructions once. I can look into a way to handle this behavior.

Oh the instruction cache is only for instructions behind the PC?

This is in order to make sure we handle self-modifying code, however given the edge case behavior found in #2979, I've been rethinking this, seeing if there is a better approach. EDIT: on second thought, self-modifying code could equally mutate previous instructions that are in the cache...

OBarronCS avatar May 22 '25 16:05 OBarronCS