
triage sok function recall

Open williballenthin opened this issue 5 years ago • 2 comments

see https://github.com/williballenthin/lancelot/blob/master/resources/evaluation/SoK/analyze-sok.ipynb


pick a testcase:

(env) user@hostname ~/c/l/r/e/SoK> python benchmark.py tee    
lancelot vs SoK test suite
  functions:
    precision: 0.971
    recall:    0.529
  basic blocks:
    precision: 0.990
    recall:    0.813
  instructions:
    precision: 0.998
    recall:    0.812

worst performing test cases:
--------  -----------------------------------
0.319658  SoK-windows-testsuite/cl_O2/tee
0.319658  SoK-windows-testsuite/cl_Ox/tee
0.320613  SoK-windows-testsuite/cl_O1/tee
0.32097   SoK-windows-testsuite/cl_Od/tee
0.737123  SoK-windows-testsuite/cl_m32_O2/tee
0.737123  SoK-windows-testsuite/cl_m32_Ox/tee
0.738281  SoK-windows-testsuite/cl_m32_O1/tee
0.738636  SoK-windows-testsuite/cl_m32_Od/tee
--------  -----------------------------------

dump the functions:

python dump_ground_truth_report.py SoK-windows-testsuite/cl_O2/tee/tee.gt.json.gz | grep function | sort > /tmp/gt-functions.txt
python dump_lancelot_report.py SoK-windows-testsuite/cl_O2/tee/tee.exe | grep function | sort > /tmp/lan-functions.txt

diff:

diff /tmp/gt-functions.txt /tmp/lan-functions.txt | head -n 30
2d1
< function: 0x140001010
5,7d3
< function: 0x140001400
< function: 0x140001460
< function: 0x140001484
13,18d8
< function: 0x14000161c
< function: 0x140001630
< function: 0x1400017a8
< function: 0x1400017bc
< function: 0x1400017c4
< function: 0x1400017cc
21d10
< function: 0x14000184c
24d12
< function: 0x1400019d0
27,32d14
< function: 0x140001bc4
< function: 0x140001bdc
< function: 0x140001bfc
< function: 0x140001c08
< function: 0x140001c54
< function: 0x140001c84
34,40d15
< function: 0x140001ccc
< function: 0x140001d00
< function: 0x140001d18
< function: 0x140001d40
< function: 0x140001d58
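
Every `<` line in the diff is a ground-truth function that lancelot did not report, i.e. a recall miss. As a rough sketch of that comparison done with sets rather than `diff`, assuming each report line looks like `function: 0x140001010` as in the output above (the helper names `parse_functions` and `compare` are hypothetical, not part of the lancelot scripts, and this is not necessarily how `benchmark.py` computes its scores):

```python
def parse_functions(lines):
    """Collect addresses from lines shaped like 'function: 0x140001010'."""
    prefix = "function: "
    addrs = set()
    for line in lines:
        line = line.strip()
        if line.startswith(prefix):
            addrs.add(int(line[len(prefix):], 16))
    return addrs


def compare(gt, found):
    """Return (precision, recall, missed, extra) for two address sets."""
    true_pos = gt & found
    precision = len(true_pos) / len(found) if found else 0.0
    recall = len(true_pos) / len(gt) if gt else 0.0
    # missed: in ground truth only ('<' lines); extra: reported only ('>' lines)
    return precision, recall, sorted(gt - found), sorted(found - gt)


# Made-up addresses for illustration only, not real benchmark data.
gt = parse_functions([
    "function: 0x1000",
    "function: 0x1010",
    "function: 0x1400",
    "function: 0x1460",
])
found = parse_functions([
    "function: 0x1000",
    "function: 0x1400",
    "function: 0x2000",
])

precision, recall, missed, extra = compare(gt, found)
print(f"precision: {precision:.3f}  recall: {recall:.3f}")
print("missed:", [hex(a) for a in missed])
print("extra: ", [hex(a) for a in extra])
```

To run it on the real dumps, pass `open("/tmp/gt-functions.txt")` and `open("/tmp/lan-functions.txt")` to `parse_functions` in place of the inline lists.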

williballenthin, Sep 07 '20 18:09

v0.3.6 9bac44dc7d789a87900e5dcaf17615b37eaf8903

lancelot vs SoK test suite
  functions:
    precision: 0.892
    recall:    0.746
  basic blocks:
    precision: 0.989
    recall:    0.801
  instructions:
    precision: 0.996
    recall:    0.804

worst performing test cases:
--------  ------------------------------------
0.319658  SoK-windows-testsuite/cl_O2/tee
0.319658  SoK-windows-testsuite/cl_Ox/tee
0.320613  SoK-windows-testsuite/cl_O1/tee
0.32097   SoK-windows-testsuite/cl_Od/tee
0.322366  SoK-windows-testsuite/cl_O2/xxd
0.322366  SoK-windows-testsuite/cl_Ox/xxd
0.322727  SoK-windows-testsuite/cl_O1/xxd
0.323776  SoK-windows-testsuite/cl_Od/xxd
0.368915  SoK-windows-testsuite/cl_O2/pageant
0.370444  SoK-windows-testsuite/cl_Ox/pageant
0.373436  SoK-windows-testsuite/cl_O1/pageant
0.374484  SoK-windows-testsuite/cl_O2/puttygen
0.376033  SoK-windows-testsuite/cl_Ox/puttygen
0.382467  SoK-windows-testsuite/cl_O1/puttygen
0.392491  SoK-windows-testsuite/cl_Od/pageant
0.403837  SoK-windows-testsuite/cl_Od/puttygen
0.408605  SoK-windows-testsuite/cl_O2/puttytel
0.409165  SoK-windows-testsuite/cl_Ox/puttytel
0.414866  SoK-windows-testsuite/cl_O1/puttytel
0.430197  SoK-windows-testsuite/cl_Od/puttytel
--------  ------------------------------------

williballenthin, Sep 07 '20 19:09

v0.4.2 7a9793979e9e129e95a77785e70179076887d54e

lancelot vs SoK test suite
  functions:
    precision: 0.871 (-0.02)
    recall:    0.850 (+0.11)
  basic blocks:
    precision: 0.987 (no change)
    recall:    0.885 (+0.08)
  instructions:
    precision: 0.995 (no change)
    recall:    0.903 (+0.10)

worst performing function recall:
--------  ------------------------------------
0.540136  SoK-windows-testsuite/cl_O2/tee
0.540136  SoK-windows-testsuite/cl_Ox/tee
0.5403    SoK-windows-testsuite/cl_O1/tee
0.5403    SoK-windows-testsuite/cl_Od/tee
0.544627  SoK-windows-testsuite/cl_O2/xxd
0.544627  SoK-windows-testsuite/cl_Ox/xxd
0.545105  SoK-windows-testsuite/cl_O1/xxd
--------  ------------------------------------

worst performing function precision:
--------  ---------------------------------------
0.454656  SoK-windows-testsuite/cl_Ox/libxml2
0.456754  SoK-windows-testsuite/cl_O2/libxml2
0.517874  SoK-windows-testsuite/cl_O2/tiffcrop
0.520982  SoK-windows-testsuite/cl_O2/vim
0.522531  SoK-windows-testsuite/cl_Ox/tiffcrop
0.541377  SoK-windows-testsuite/cl_Ox/vim

williballenthin, Sep 08 '20 20:09