capa
add support for analysis of source code/scripted languages
This enhancement extends capa's functionality to the analysis of potentially malicious scripts and source code. A tree-sitter backend was added to parse the source files into a lightweight AST. Features akin to those capa extracts from PE files with its vivisect backend are then extracted:
File-level:
- trivial: language, file format
- global string literals
- global integer literals
- namespaces
- globally-instantiated imported classes
- globally-called imported functions
Function-level:
- string literals
- integer literals
- imported classes
- imported functions
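The split between file-level and function-level features can be illustrated with a minimal sketch. It does not use tree-sitter or capa's extractor: as a stand-in, it walks a Python snippet with the standard library's `ast` module, and the names `extract_features` and `sample` are hypothetical. It only shows how literals outside any function land in a file-level bucket while literals inside a function are attributed to that function.

```python
import ast


def extract_features(source: str) -> dict:
    """Collect string and integer literals, grouped by scope.

    Stand-in for a tree-sitter walk: uses Python's own `ast` module,
    so it only handles Python source. Names here are illustrative.
    """
    tree = ast.parse(source)
    features = {"file": {"strings": [], "integers": []}, "functions": {}}

    def collect(node, bucket):
        # Recurse manually so each function body gets its own bucket.
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                inner = features["functions"].setdefault(
                    child.name, {"strings": [], "integers": []}
                )
                collect(child, inner)
            elif isinstance(child, ast.Constant):
                if isinstance(child.value, str):
                    bucket["strings"].append(child.value)
                # bool is a subclass of int, so exclude it explicitly.
                elif isinstance(child.value, int) and not isinstance(child.value, bool):
                    bucket["integers"].append(child.value)
            else:
                collect(child, bucket)

    collect(tree, features["file"])
    return features


sample = '''
URL = "http://example.com"
PORT = 8080

def connect():
    retries = 3
    return "connecting"
'''

features = extract_features(sample)
```

Here `URL` and `PORT` contribute file-level literals, while `3` and `"connecting"` are recorded under the `connect` function.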
To install Tree-sitter:
- Pip-install Tree-sitter:
pip3 install tree-sitter
- Install bindings:
mkdir vendor build
cd vendor
git clone git@github.com:tree-sitter/tree-sitter-c-sharp.git
git clone git@github.com:tree-sitter/tree-sitter-embedded-template.git
git clone git@github.com:tree-sitter/tree-sitter-html.git
git clone git@github.com:tree-sitter/tree-sitter-javascript.git
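Since the grammars are cloned by hand, a small sanity check before running the tests may help. The following is a hypothetical helper, not part of capa or tree-sitter: it assumes the four grammar repositories were cloned under their default directory names inside `vendor`, and simply reports which ones are missing.

```python
from pathlib import Path

# Grammar repositories named in the clone steps above
# (assumed to keep their default checkout directory names).
GRAMMARS = [
    "tree-sitter-c-sharp",
    "tree-sitter-embedded-template",
    "tree-sitter-html",
    "tree-sitter-javascript",
]


def missing_grammars(vendor_dir, names=GRAMMARS):
    """Return the grammar names with no checkout directory under vendor_dir."""
    vendor = Path(vendor_dir)
    return [name for name in names if not (vendor / name).is_dir()]
```

Calling `missing_grammars("vendor")` from the repository root returns an empty list once all four clones are in place.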
Checklist
- [ ] No CHANGELOG update needed
- [ ] No new tests needed
- [ ] No documentation update needed
i think it would be worthwhile to get the tests running (and passing) in CI. this means:
- add the example files to capa-testfiles and get those merged, and
- update the github actions workflows to install the TS bindings (temporarily, until we have a better solution)
> - add the example files to capa-testfiles and get those merged, and
Just submitted the pull request.
> - update the github actions workflows to install the TS bindings (temporarily, until we have a better solution)
On it.