dynamorio icon indicating copy to clipboard operation
dynamorio copied to clipboard

drcachesim post-processing fails on app containing annotations

Open derekbruening opened this issue 4 years ago • 2 comments

I ran Dr. Memory's app_suite_tests just as a sample app for drcachesim's -record_heap and hit an error post-processing it:

$ bin64/drrun -t drcachesim -offline -record_heap -- ~/drmemory/git/build_x64_dbg/tests/app_suite_tests 
$ clients/bin64/drraw2trace -indir drmemtrace.app_suite_tests.223666.3423.dir/ -out /tmp/foo
ERROR: Conversion failed: Failed to process file for thread 223666: memref entry found outside of bb
[drmemtrace]: Thread 223666 timestamp 0x002efe2c8e808cf9
<...>
  0x00007f3db001a7bf leave  %rbp %rsp (%rbp)[8byte] -> %rsp %rbp
[drmemtrace]: Appended memref type 0 size 8 to 0x7ffd76fffbd0
  0x00007f3db001a7c0 ret    %rsp (%rsp)[8byte] -> %rsp
[drmemtrace]: Appended memref type 0 size 8 to 0x7ffd76fffbd8
[drmemtrace]: Appending delayed branch for thread 0
[drmemtrace]: Appending 2 instrs in bb 0x7f46a666b82f in mod 5 +0x5282f = /home/bruening/drmemory/git/build_x64_dbg/tests/app_suite_tests
  0x00007f3db001d82f cmp    %ebx $0x00000001
  0x00007f3db001d832 jnz    $0x00007f3db001d910
[drmemtrace]: Appending delayed branch for thread 0
[drmemtrace]: Appending 16 instrs in bb 0x7f46a666b838 in mod 5 +0x52838 = /home/bruening/drmemory/git/build_x64_dbg/tests/app_suite_tests
  0x00007f3db001d838 mov    $0x000000004d430006 -> 0xffffff70(%rbp)[8byte]
[drmemtrace]: Appended memref type 1 size 8 to 0x7ffd76fffbf0
  0x00007f3db001d843 mov    $0x0000000000000000 -> 0xffffff78(%rbp)[8byte]
[drmemtrace]: Filling in elided addr with remembered rbp: 0x7ffd76fffc80
[drmemtrace]: Appended memref type 1 size 8 to 0x7ffd76fffbf8
  0x00007f3db001d84e mov    $0x0000000000000000 -> 0xffffff80(%rbp)[8byte]
[drmemtrace]: Filling in elided addr with remembered rbp: 0x7ffd76fffc80
[drmemtrace]: Appended memref type 1 size 8 to 0x7ffd76fffc00
  0x00007f3db001d856 mov    $0x0000000000000000 -> 0xffffff88(%rbp)[8byte]
[drmemtrace]: Filling in elided addr with remembered rbp: 0x7ffd76fffc80
[drmemtrace]: Appended memref type 1 size 8 to 0x7ffd76fffc08
  0x00007f3db001d85e mov    $0x0000000000000000 -> 0xffffff90(%rbp)[8byte]
[drmemtrace]: Filling in elided addr with remembered rbp: 0x7ffd76fffc80
[drmemtrace]: Appended memref type 1 size 8 to 0x7ffd76fffc10
  0x00007f3db001d866 mov    $0x0000000000000000 -> 0xffffff98(%rbp)[8byte]
[drmemtrace]: Filling in elided addr with remembered rbp: 0x7ffd76fffc80
[drmemtrace]: Appended memref type 1 size 8 to 0x7ffd76fffc18
  0x00007f3db001d86e lea    0xffffff70(%rbp) -> %rax
  0x00007f3db001d875 mov    $0x00000000 -> %ecx
  0x00007f3db001d87a mov    %ecx -> %edx
  0x00007f3db001d87c rol    $0x0000000000000003 %rdi -> %rdi
  0x00007f3db001d880 rol    $0x000000000000000d %rdi -> %rdi
  0x00007f3db001d884 rol    $0x000000000000003d %rdi -> %rdi
  0x00007f3db001d888 rol    $0x0000000000000033 %rdi -> %rdi
  0x00007f3db001d88c xchg   %rbx %rbx -> %rbx %rbx
  0x00007f3db001d88f mov    %rdx -> %rax
  0x00007f3db001d892 mov    %rax -> 0xffffffa8(%rbp)[8byte]
[drmemtrace]: Filling in elided addr with remembered rbp: 0x7ffd76fffc80
[drmemtrace]: Appended memref type 1 size 8 to 0x7ffd76fffc28
[drmemtrace]: Worker 0 hit error Failed to process file for thread 223666: memref entry found outside of bb on trace thread 0
ERROR: Conversion failed: Failed to process file for thread 223666: memref entry found outside of bb
e86890 00007ffd76fffbd0 20040024000f5290
e868a0 20080024000f5255 802efe2c8e808cf9
e868b0 c203000000000001 20040024000f5262
e868c0 20060024000f526a 00007ffd76fffbc8
<...>
e86b30 00007ffd76fffbd8 2004000a0005282f
e86b40 2020000a00052838 00007ffd76fffc80  <-- 16 instrs +0x52838; 2 memrefs
e86b50 00007ffd76fffbe0 2002000a0004e3a0
e86b60 20180024000c9750 00007ffd76fffbd8

So we expect just 1 memref. No rbp write; lea reads rbp. But the 2nd stored value is not rbp: it's rsp, since it matches the rsp value for the RET plus 8. Why was that written into the trace?

Why does the bb end there, anyway?

Here's the app code:

   52832:       0f 85 d8 00 00 00       jne    52910 <_ZN21MmapTests_SigBus_Test8TestBodyEv+0x244>
   52838:       48 c7 85 70 ff ff ff    movq   $0x4d430006,-0x90(%rbp)
   5283f:       06 00 43 4d 
   52843:       48 c7 85 78 ff ff ff    movq   $0x0,-0x88(%rbp)
   5284a:       00 00 00 00 
   5284e:       48 c7 45 80 00 00 00    movq   $0x0,-0x80(%rbp)
   52855:       00 
   52856:       48 c7 45 88 00 00 00    movq   $0x0,-0x78(%rbp)
   5285d:       00 
   5285e:       48 c7 45 90 00 00 00    movq   $0x0,-0x70(%rbp)
   52865:       00 
   52866:       48 c7 45 98 00 00 00    movq   $0x0,-0x68(%rbp)
   5286d:       00 
   5286e:       48 8d 85 70 ff ff ff    lea    -0x90(%rbp),%rax
   52875:       b9 00 00 00 00          mov    $0x0,%ecx
   5287a:       89 ca                   mov    %ecx,%edx
   5287c:       48 c1 c7 03             rol    $0x3,%rdi
   52880:       48 c1 c7 0d             rol    $0xd,%rdi
   52884:       48 c1 c7 3d             rol    $0x3d,%rdi
   52888:       48 c1 c7 33             rol    $0x33,%rdi
   5288c:       48 87 db                xchg   %rbx,%rbx
   5288f:       48 89 d0                mov    %rdx,%rax
   52892:       48 89 45 a8             mov    %rax,-0x58(%rbp)
   52896:       48 8b 45 a8             mov    -0x58(%rbp),%rax
   5289a:       48 89 45 e0             mov    %rax,-0x20(%rbp)
   5289e:       bf 1e 00 00 00          mov    $0x1e,%edi
   528a3:       e8 f8 ba ff ff          callq  4e3a0 <sysconf@plt>

Huh, the next tag in the raw trace is 4e3a0 for the call: so it looks like this is the rsp for the memref of the call's push. So the bug is that the instr count is 16 instead of 24.

Aha, here's the answer:

interp: start_pc = 0x00007f1a8864d838
  0x00007f1a8864d838  48 c7 85 70 ff ff ff mov    $0x000000004d430006 -> 0xffffff70(%rbp)[8byte]
                      06 00 43 4d
  0x00007f1a8864d843  48 c7 85 78 ff ff ff mov    $0x0000000000000000 -> 0xffffff78(%rbp)[8byte]
                      00 00 00 00
  0x00007f1a8864d84e  48 c7 45 80 00 00 00 mov    $0x0000000000000000 -> 0xffffff80(%rbp)[8byte]
                      00
  0x00007f1a8864d856  48 c7 45 88 00 00 00 mov    $0x0000000000000000 -> 0xffffff88(%rbp)[8byte]
                      00
  0x00007f1a8864d85e  48 c7 45 90 00 00 00 mov    $0x0000000000000000 -> 0xffffff90(%rbp)[8byte]
                      00
  0x00007f1a8864d866  48 c7 45 98 00 00 00 mov    $0x0000000000000000 -> 0xffffff98(%rbp)[8byte]
                      00
  0x00007f1a8864d86e  48 8d 85 70 ff ff ff lea    0xffffff70(%rbp) -> %rax
  0x00007f1a8864d875  b9 00 00 00 00       mov    $0x00000000 -> %ecx
  0x00007f1a8864d87a  89 ca                mov    %ecx -> %edx
  0x00007f1a8864d87c  48 c1 c7 03          rol    $0x0000000000000003 %rdi -> %rdi
        wrote overflow flag before reading it!
  0x00007f1a8864d880  48 c1 c7 0d          rol    $0x000000000000000d %rdi -> %rdi
        wrote overflow flag before reading it!
  0x00007f1a8864d884  48 c1 c7 3d          rol    $0x000000000000003d %rdi -> %rdi
        wrote overflow flag before reading it!
  0x00007f1a8864d888  48 c1 c7 33          rol    $0x0000000000000033 %rdi -> %rdi
        wrote overflow flag before reading it!
  0x00007f1a8864d88c  48 87 db             xchg   %rbx %rbx -> %rbx %rbx
Matched valgrind client request pattern at 0x00007f1a8864d88c
  0x00007f1a8864d88f  48 89 d0             mov    %rdx -> %rax
  0x00007f1a8864d892  48 89 45 a8          mov    %rax -> 0xffffffa8(%rbp)[8byte]
  0x00007f1a8864d896  48 8b 45 a8          mov    0xffffffa8(%rbp)[8byte] -> %rax
  0x00007f1a8864d89a  48 89 45 e0          mov    %rax -> 0xffffffe0(%rbp)[8byte]
  0x00007f1a8864d89e  bf 1e 00 00 00       mov    $0x0000001e -> %edi
  0x00007f1a8864d8a3  e8 f8 ba ff ff       call   $0x00007f1a886493a0 %rsp -> %rsp 0xfffffff8(%rsp)[8byte]
interp: direct call at 0x00007f1a8864d8a3
        NOT following direct call from 0x00007f1a8864d8a3 to 0x00007f1a886493a0
end_pc = 0x00007f1a8864d8a8


instrument_basic_block ******************

before instrumentation:
TAG  0x00007f1a8864d838
 +0    L3 @0x00007f19c92162b8  48 c7 85 70 ff ff ff mov    $0x000000004d430006 -> 0xffffff70(%rbp)[8byte]
                               06 00 43 4d
 +11   L3 @0x00007f19c91d8c58  48 c7 85 78 ff ff ff mov    $0x0000000000000000 -> 0xffffff78(%rbp)[8byte]
                               00 00 00 00
 +22   L3 @0x00007f19c922faa8  48 c7 45 80 00 00 00 mov    $0x0000000000000000 -> 0xffffff80(%rbp)[8byte]
                               00
 +30   L3 @0x00007f19c9c44d30  48 c7 45 88 00 00 00 mov    $0x0000000000000000 -> 0xffffff88(%rbp)[8byte]
                               00
 +38   L3 @0x00007f19c9c4abd0  48 c7 45 90 00 00 00 mov    $0x0000000000000000 -> 0xffffff90(%rbp)[8byte]
                               00
 +46   L3 @0x00007f19c9c32658  48 c7 45 98 00 00 00 mov    $0x0000000000000000 -> 0xffffff98(%rbp)[8byte]
                               00
 +54   L3 @0x00007f19c9c43cc8  48 8d 85 70 ff ff ff lea    0xffffff70(%rbp) -> %rax
 +61   L3 @0x00007f19c9229740  b9 00 00 00 00       mov    $0x00000000 -> %ecx
 +66   L3 @0x00007f19c921d630  89 ca                mov    %ecx -> %edx
 +68   m4 @0x00007f19c921a1c0                       <label>
 +68   L4 @0x00007f19c9c325d8  48 c7 c2 00 00 00 00 mov    $0x0000000000000000 -> %rdx
 +75   L3 @0x00007f19c9c454d8  48 89 d0             mov    %rdx -> %rax
 +78   L3 @0x00007f19c91dcae8  48 89 45 a8          mov    %rax -> 0xffffffa8(%rbp)[8byte]
 +82   L3 @0x00007f19c9c36320  48 8b 45 a8          mov    0xffffffa8(%rbp)[8byte] -> %rax
 +86   L3 @0x00007f19c9c442b0  48 89 45 e0          mov    %rax -> 0xffffffe0(%rbp)[8byte]
 +90   L3 @0x00007f19c9c3d7e8  bf 1e 00 00 00       mov    $0x0000001e -> %edi
 +95   L3 @0x00007f19c9c352d8  e8 f8 ba ff ff       call   $0x00007f1a886493a0 %rsp -> %rsp 0xfffffff8(%rsp)[8byte]
END 0x00007f1a8864d838

So raw2trace needs a way to skip binary annotations.

derekbruening avatar Feb 28 '20 00:02 derekbruening

#2397 had already hit a case of this where dynamorio_annotate_running_on_dynamorio is transformed and so what raw2trace sees doesn't match what the tracer sees, similarly to the anntoation above.

derekbruening avatar Nov 16 '21 20:11 derekbruening

One way to solve this is to use the new encodings file for blocks with annotations, if we can detect them during live tracing: maybe by the gap in the app pc sequence?

derekbruening avatar Sep 15 '22 17:09 derekbruening