
C# code to features

Open mr-tz opened this issue 2 years ago • 2 comments

Most analysts will read decompiled C# code. Can we A) create a utility that parses code segments into features (e.g. using portions of capa-scripts), or B) even better, allow verbatim C# code in rules?

A)

Given code lines such as

HttpWebRequest r = System.Net.WebRequest.Create(<url>);
r.Method = "Get";

We extract the respective features. EDIT: This would be a separate script (or show-features) used as part of rule writing.
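A minimal sketch of what such a helper script could look like, assuming a purely regex-based approach. This is not an existing capa script: `extract_features`, `CALL_RE`, and `STRING_RE` are hypothetical names, and a real tool would need a proper C# parser. The sketch only picks up fully qualified calls and string literals, and writes API names in the `Namespace.Class::Method` form that capa uses for .NET. Property accesses such as `r.Method` are not handled, because resolving the type of `r` needs more than pattern matching.

```python
import re

# Hypothetical patterns: a rough approximation of C# syntax, not a parser.
# CALL_RE matches dotted, capitalized names followed by "(".
CALL_RE = re.compile(r"\b((?:[A-Z]\w*\.)+[A-Z]\w*)\s*\(")
# STRING_RE matches double-quoted literals, allowing backslash escapes.
STRING_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def extract_features(source: str) -> list[tuple[str, str]]:
    """Return (feature type, value) pairs found in a C# snippet."""
    features = []
    for line in source.splitlines():
        # Blank out string literals so their contents are not read as calls.
        code_only = STRING_RE.sub('""', line)
        for match in CALL_RE.finditer(code_only):
            # Split "Namespace.Class.Method" into "Namespace.Class::Method".
            type_name, _, method = match.group(1).rpartition(".")
            features.append(("api", f"{type_name}::{method}"))
        for match in STRING_RE.finditer(line):
            features.append(("string", match.group(1)))
    return features


if __name__ == "__main__":
    snippet = (
        "HttpWebRequest r = System.Net.WebRequest.Create(<url>);\n"
        'r.Method = "Get";'
    )
    for kind, value in extract_features(snippet):
        print(f"{kind}: {value}")
    # prints:
    # api: System.Net.WebRequest::Create
    # string: Get
```

The output is meant as a starting point for rule writing; the analyst would still review and prune the extracted features by hand.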

B)

- features:
  - code: >
         HttpWebRequest r = System.Net.WebRequest.Create(<url>);
         r.Method = "Get";

Would this be worth the effort?

Ref

Idea came to me from here:

Ok, I'm having trouble following this. Can you include a comment of an example code snippet?

Originally posted by @mr-tz in https://github.com/mandiant/capa-rules/pull/601#discussion_r942190106

mr-tz, Aug 10 '22 08:08

Do you imagine we treat option B as:

  1. a regular expression to run over source code
  2. source code from which capa extracts features for matching (essentially option A integrated directly with capa)

mike-hunhoff, Aug 16 '22 19:08

Option 2 (I added a clarification to the original post). The code feature would require ad hoc feature extraction...

Con:

  • breaks existing paradigm
  • extracted features not easy to see

Pro:

  • easy to write and read by humans

Currently, I'm leaning towards a separate step (option A): a helper script that takes source code as input and outputs the extracted features. We can then add the code sequence to the rule as a comment.
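As a rough illustration of that last step, the sketch below renders already-extracted features into a rule fragment with the original code sequence attached as YAML comments. `render_rule_fragment` is a hypothetical helper, not part of capa, and the feature list passed to it is written by hand here. Joining everything under a single `and:` is also an assumption; a rule author would choose the logic that fits.

```python
import json


def render_rule_fragment(source: str, features: list[tuple[str, str]]) -> str:
    """Render features as a YAML fragment, with the code kept as comments."""
    lines = ["features:", "  - and:", "    # code sequence:"]
    for code_line in source.splitlines():
        if code_line.strip():
            lines.append(f"    #   {code_line.strip()}")
    for kind, value in features:
        # Quote string values so special characters stay valid YAML.
        rendered = json.dumps(value) if kind == "string" else value
        lines.append(f"    - {kind}: {rendered}")
    return "\n".join(lines)


if __name__ == "__main__":
    snippet = (
        "HttpWebRequest r = System.Net.WebRequest.Create(<url>);\n"
        'r.Method = "Get";'
    )
    # Hand-written feature list, standing in for the helper script's output.
    features = [("api", "System.Net.WebRequest::Create"), ("string", "Get")]
    print(render_rule_fragment(snippet, features))
```

Keeping the code in comments leaves the rule within the existing feature paradigm while still showing readers which source pattern the features came from.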

mr-tz, Aug 17 '22 07:08