capa
integrate capa with ghidra
lots of people use ghidra, which is free and open source. we should recommend a way of integrating capa results into ghidra.
@psifertex would be interested to hear your feedback on our plan here: http://www.williballenthin.com/post/2020-05-20-sketch-modern-python-in-ghidra/
Hmm, I'm probably not the right person to provide feedback as I have very little JVM experience. One concern might be multithreading with other native plugins, in particular the decompiler. That said, if you limited your usage to automated processing, where you let analysis run to completion and then use the recovered CFG, plus a separate plugin to display results, it seems doable to me.
As a general solution it sounds very risky, but for the approach you're taking maybe it's not too bad? Again my experience is limited enough that I wouldn't count it for much.
Another alternative might be to use angr, which can integrate much more cleanly (and is already written in Python), though I don't have any intuition as to its accuracy relative to viv, if that's the reason for wanting another alternative.
You could build a pure Java plugin to consume results and display annotations in Ghidra, but not try to leverage the analysis component on the back end, and instead use angr as an alternative to vivisect.
Caveat: I am of course horribly biased and therefore assume everyone will eventually come to know the best solution is BN. 🤣
> if that's the reason for wanting another alternative.
i think the primary motivation is that lots of people use ghidra, so if we support ghidra, then more people get access to capa. also, it does provide a solid UI framework through which we can provide neat interfaces, like we've done with IDA. i hate requiring expensive tools to be able to benefit from these research projects (no zing intended, i promise!).
Your front-end is separate from your back-end, no? It might be easier to implement a Ghidra front-end in Java, so it's more native so to speak, and just give it the ability to integrate results from capa through a much more clearly defined set of APIs.
All I'm proposing is that depending on your goals you might be better served by considering angr as another alternative for back-end processing (which should integrate more easily if you want a viv alternative), but build out the ghidra front-end separately in a way that doesn't risk as much technical debt.
EDIT: Also, no offense perceived! I was merely trying to understand whether the motivator was accuracy, cost + accuracy, audience size, etc., since I couldn't speak to the analysis quality between those three as accurately as I'd like.
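To make the front-end/back-end split a bit more concrete, here is a minimal sketch of the "clearly defined API" idea: the back-end emits a results document, and the front-end only needs a function that turns it into per-address annotations. Everything here is an assumption for illustration. The JSON layout (`rules`, `namespace`, `addresses`) is a simplified, hypothetical format and not capa's actual output schema, and `build_annotations` / `format_comment` are made-up names.

```python
import json
from collections import defaultdict

# Hypothetical, simplified results document. capa's real JSON schema differs;
# this layout only exists to illustrate the front-end/back-end boundary.
SAMPLE_RESULTS = """
{
  "rules": {
    "create TCP socket": {
      "namespace": "communication/socket/tcp",
      "addresses": [4198400, 4202496]
    },
    "encrypt data using RC4": {
      "namespace": "data-manipulation/encryption/rc4",
      "addresses": [4198400]
    }
  }
}
"""


def build_annotations(results_json):
    """Map each matched address to the sorted list of rule labels found there."""
    doc = json.loads(results_json)
    by_address = defaultdict(list)
    for rule_name, rule in doc["rules"].items():
        label = "{} ({})".format(rule_name, rule["namespace"])
        for address in rule["addresses"]:
            by_address[address].append(label)
    return {addr: sorted(labels) for addr, labels in by_address.items()}


def format_comment(labels):
    """Render the labels for one address as a multi-line comment string."""
    return "capa:\n" + "\n".join("  - " + label for label in labels)


if __name__ == "__main__":
    # A front-end (Ghidra, IDA, ...) would attach each comment at its address;
    # here we just print them.
    for address, labels in sorted(build_annotations(SAMPLE_RESULTS).items()):
        print(hex(address))
        print(format_comment(labels))
```

With a boundary like this, the display side never needs to know which engine (vivisect, angr, or something else) produced the matches; it only consumes the results document.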
> You could build a pure java plugin to consume results and display annotations in ghidra

Done, but in Python, in a cleaner and simpler form.
https://github.com/reb311ion/CapaExplorer

Wow, this looks really cool, @reb311ion!
@reb311ion, doing a capa explorer UI all in Ghidra is possible now.