evaluate profiling results from running capa in Ghidra

Open mike-hunhoff opened this issue 1 year ago • 4 comments

Here is a profiling snippet from running capa on mimikatz.exe_ in Ghidra. Let's review and see if there are opportunities to reduce the cumulative times for Ghidra-related functions:

         273564231 function calls (273198325 primitive calls) in 204.308 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006  204.326  204.326 main.py:1346(ghidra_main)
        1    0.140    0.140  202.915  202.915 main.py:248(find_capabilities)
     2115    0.563    0.000  158.903    0.075 main.py:189(find_code_capabilities)
    29154    1.346    0.000  121.782    0.004 main.py:149(find_basic_block_capabilities)
   187775    1.357    0.000   99.137    0.001 __init__.py:1375(match)
   375550    4.601    0.000   95.863    0.000 engine.py:290(match)
  3896087    4.826    0.000   89.836    0.000 __init__.py:768(evaluate)
1299570/1069305    5.880    0.000   54.602    0.000 engine.py:138(evaluate)
   156505    1.819    0.000   51.868    0.000 main.py:122(find_instruction_capabilities)
4208856/4130070    9.631    0.000   51.802    0.000 engine.py:105(evaluate)
        1    0.020    0.020   43.212   43.212 main.py:227(find_file_capabilities)
    11246    0.004    0.000   40.881    0.004 extractor.py:34(extract_file_features)
    11246    0.004    0.000   40.877    0.004 file.py:171(extract_features)
   515436   29.252    0.000   33.562    0.000 jepwrappers.py:102(wrapped)
   405942    0.327    0.000   33.242    0.000 extractor.py:64(extract_insn_features)
   405942    1.173    0.000   32.915    0.000 insn.py:410(extract_features)
        1    0.000    0.000   26.628   26.628 file.py:75(extract_file_embedded_pe)
        1    0.005    0.005   26.628   26.628 file.py:26(check_segment_for_pe)
      513    0.013    0.000   26.617    0.052 helpers.py:30(find_byte_sequence)
  6596961   12.473    0.000   23.514    0.000 common.py:169(evaluate)
 49141101    9.370    0.000   21.505    0.000 {built-in method builtins.isinstance}
  1783263    6.563    0.000   21.033    0.000 common.py:387(evaluate)
     9777    0.077    0.000   13.813    0.001 file.py:121(extract_file_strings)
        7   13.215    1.888   13.620    1.946 helpers.py:60(get_block_bytes)
 40463184    5.810    0.000   12.150    0.000 abc.py:117(__instancecheck__)
   284189    2.918    0.000   11.424    0.000 common.py:302(evaluate)
   185659    0.190    0.000   10.457    0.000 extractor.py:59(get_instructions)
   185659    4.195    0.000   10.267    0.000 helpers.py:92(get_insn_in_range)
    95834    9.859    0.000    9.914    0.000 helpers.py:207(check_addr_for_api)
   162176    0.253    0.000    9.744    0.000 insn.py:82(extract_insn_api_features)
 14778790    8.681    0.000    8.681    0.000 common.py:78(__init__)
    68626    0.968    0.000    7.793    0.000 insn.py:32(check_for_api_call)
 16627263    4.487    0.000    6.359    0.000 common.py:123(__hash__)
 40463184    6.338    0.000    6.340    0.000 {built-in method _abc._abc_instancecheck}
  9803117    2.691    0.000    5.676    0.000 {method 'get' of 'dict' objects}
   156647    0.688    0.000    5.401    0.000 insn.py:269(extract_insn_cross_section_cflow)
   614276    0.257    0.000    5.092    0.000 jepwrappers.py:52(get_script)
   620466    0.552    0.000    4.906    0.000 jepwrappers.py:44(get_state)
    13694    0.011    0.000    4.743    0.000 extractor.py:48(extract_function_features)
    13694    0.019    0.000    4.732    0.000 function.py:52(extract_features)
   139179    1.285    0.000    4.550    0.000 common.py:210(evaluate)
    58594    0.070    0.000    4.359    0.000 extractor.py:56(extract_basic_block_features)
    58594    0.090    0.000    4.289    0.000 basicblock.py:121(extract_features)
     2716    2.815    0.001    3.860    0.001 function.py:28(extract_function_loop)
   620466    3.804    0.000    3.804    0.000 jepwrappers.py:32(get_java_thread_id)
   192765    1.274    0.000    3.732    0.000 insn.py:139(extract_insn_offset_features)
   284189    0.344    0.000    3.699    0.000 common.py:356(__init__)
   801181    0.458    0.000    3.515    0.000 {built-in method builtins.any}
   284189    0.943    0.000    3.355    0.000 common.py:284(__init__)
   626020    0.562    0.000    3.311    0.000 helpers.py:234(is_call_or_jmp)
   161807    0.527    0.000    3.126    0.000 engine.py:188(evaluate)
   189149    1.141    0.000    3.081    0.000 insn.py:99(extract_insn_number_features)
   265558    1.161    0.000    3.080    0.000 common.py:437(evaluate)
    31269    0.042    0.000    2.913    0.000 extractor.py:51(get_basic_blocks)
    31269    2.685    0.000    2.871    0.000 helpers.py:84(get_function_blocks)
   156565    2.019    0.000    2.690    0.000 insn.py:161(extract_insn_bytes_features)
    29159    0.024    0.000    2.429    0.000 basicblock.py:99(extract_bb_stackstring)
    29154    1.296    0.000    2.405    0.000 basicblock.py:73(bb_contains_stackstring)
  1824852    2.390    0.000    2.390    0.000 helpers.py:235(<genexpr>)
   159545    2.101    0.000    2.168    0.000 insn.py:188(extract_insn_string_features)
 18060589    2.021    0.000    2.021    0.000 {built-in method builtins.hash}
   154382    1.755    0.000    1.755    0.000 helpers.py:238(is_sp_modified)
    29435    0.030    0.000    1.672    0.000 basicblock.py:107(extract_bb_tight_loop)
    29154    1.068    0.000    1.641    0.000 basicblock.py:87(_bb_has_tight_loop)
   156905    0.885    0.000    1.480    0.000 insn.py:217(extract_insn_obfs_call_plus_5_characteristic_features)
 14771099    1.427    0.000    1.427    0.000 common.py:96(__bool__)
   160217    0.776    0.000    1.307    0.000 helpers.py:245(is_stack_referenced)
    62991    0.295    0.000    1.190    0.000 jepwrappers.py:310(wrapped_currentProgram)
   758654    0.990    0.000    1.122    0.000 common.py:107(__init__)
[...]
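
For anyone who wants to reproduce a listing in this format, here is a minimal sketch using the standard library's `cProfile` and `pstats`. The `profile_call` helper and the `workload` function are hypothetical stand-ins, not part of capa. In practice the profiled callable would be the real entry point (`ghidra_main` in the listing above).

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=25, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Render the statistics into a string instead of printing to stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(top)
    return result, buffer.getvalue()


def workload():
    # Hypothetical stand-in for the code being profiled.
    return sum(i * i for i in range(10_000))


result, report = profile_call(workload)
print(report)
```

Sorting by `SortKey.CUMULATIVE` gives the "Ordered by: cumulative time" view shown above. Switching to `SortKey.TIME` sorts by `tottime` instead, which makes functions that are expensive in their own right easier to spot (for example `jepwrappers.py:102(wrapped)` and `helpers.py:60(get_block_bytes)`).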

mike-hunhoff, Aug 17 '23 18:08

        1    0.000    0.000   26.628   26.628 file.py:75(extract_file_embedded_pe)
        1    0.005    0.005   26.628   26.628 file.py:26(check_segment_for_pe)

these stand out. 26 seconds to scan the file bytes for the MZ header? seems slow to me.
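
for comparison, a rough sketch of what a plain in-memory scan looks like. this is not capa's implementation. it assumes the block's bytes have already been read once into a Python `bytes` object, and `find_all` and `block` are made-up names for illustration:

```python
def find_all(haystack: bytes, needle: bytes):
    """Yield every offset at which needle occurs in haystack."""
    offset = haystack.find(needle)
    while offset != -1:
        yield offset
        # Resume one byte past the previous hit so overlapping matches are kept.
        offset = haystack.find(needle, offset + 1)


# Hypothetical buffer standing in for a memory block's bytes, read once up front.
block = b"\x00" * 16 + b"MZ\x90\x00" + b"\x00" * 32 + b"MZ"
mz_offsets = list(find_all(block, b"MZ"))
print(mz_offsets)
```

`bytes.find` runs in C over the whole buffer, so a scan like this should normally be quick even for multi-megabyte inputs. whether the time here is actually going into the search itself or into fetching the bytes from Ghidra is something the profile would need to confirm.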

williballenthin, Aug 17 '23 18:08

        7   13.215    1.888   13.620    1.946 helpers.py:60(get_block_bytes)

this one too

williballenthin, Aug 17 '23 18:08

        7   13.215    1.888   13.620    1.946 helpers.py:60(get_block_bytes)

this one too

see https://github.com/mandiant/capa/pull/1761

mike-hunhoff, Aug 24 '23 22:08

        1    0.000    0.000   26.628   26.628 file.py:75(extract_file_embedded_pe)
        1    0.005    0.005   26.628   26.628 file.py:26(check_segment_for_pe)

these stand out. 26 seconds to scan the file bytes for the MZ header? seems slow to me.

see https://github.com/mandiant/capa/pull/1767#issuecomment-1694039208

mike-hunhoff, Aug 25 '23 23:08