
Feature Request: More library friendly

Open jpsnyder opened this issue 2 years ago • 2 comments

When trying to use diaphora as a library for my own tool, I discovered a few things that could be improved: (Note: I haven't pulled some of your latest changes yet, so some of this might be fixed already.)

  • Logs are printed directly to stdout https://github.com/joxeankoret/diaphora/blob/master/diaphora.py#L109 which spams my own tool's output. I have to redirect stdout to get them into a proper logger. It would be nice for diaphora to use Python's standard logging library.
  • Getting results is tightly coupled to the sqlite databases: I have to let diaphora write the diff results to the database, only to immediately open the database and read the results back for my own use. It would be nice if the diff results were exposed through functions before they are written out.
    • For example, I had to reimplement the similarity calculation in my own code since the result is only printed out and not returned.
  • While I understand that this is an IDA-only tool, it would be nice if diaphora was set up as a package to make it pip installable and discoverable on PyPI. I've found it easy enough to quickly install an IDA plugin with pip install ida_plugin --target=%IDA_INSTALL_DIR%\python\3 if it is pip installable. Plus, this would save me from having to vendor diaphora in my own code.
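
For readers hitting the same stdout problem, here is a minimal sketch of the redirection workaround described in the first point. It is not diaphora code: `LoggerWriter`, `call_quietly`, and `noisy_library_call` are hypothetical names, and the stand-in function only imitates a library that prints instead of logging.

```python
import contextlib
import logging


class LoggerWriter:
    """Minimal file-like object that forwards each complete line to a logger."""

    def __init__(self, logger, level=logging.INFO):
        self._logger = logger
        self._level = level
        self._buffer = ""

    def write(self, text):
        self._buffer += text
        # Emit every complete line; keep any trailing partial line buffered.
        *lines, self._buffer = self._buffer.split("\n")
        for line in lines:
            if line:
                self._logger.log(self._level, line)
        return len(text)

    def flush(self):
        # Emit whatever is left when the caller is done.
        if self._buffer:
            self._logger.log(self._level, self._buffer)
            self._buffer = ""


def noisy_library_call():
    # Hypothetical stand-in for a library function that prints to stdout.
    print("[stand-in] exporting function 1")
    print("[stand-in] exporting function 2")


def call_quietly(func, logger):
    """Run func with everything it prints redirected into logger."""
    writer = LoggerWriter(logger)
    with contextlib.redirect_stdout(writer):
        result = func()
    writer.flush()
    return result
```

Calling `call_quietly(noisy_library_call, logging.getLogger("mytool.diaphora"))` sends both lines to that logger instead of the terminal. It only catches output written through `sys.stdout`, which is part of why logging inside the library itself would be the cleaner fix.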

Obviously, I'm not really using diaphora as it was initially designed, and I may not be the target audience, but it would be nice if diaphora was more Python API friendly.

jpsnyder avatar Feb 17 '23 15:02 jpsnyder

Everything you are mentioning makes sense to me. The 2nd point you mention is problematic, but I believe I can do something to make it easier too.

I'm making some heavy changes for Diaphora 3.0 (see #193); I will add these tasks.

joxeankoret avatar Feb 18 '23 09:02 joxeankoret

Sounds good. Glad you can incorporate some of these suggestions. I think with some strategic tweaks, diaphora can be a powerful library for users.

If you are curious, I'm using diaphora for batch diffing files, so being able to call diaphora programmatically is key. I use the following IDA script to export the database, and then diff the locally created databases using CBinDiff and the reimplemented similarity calculation.

"""
IDA script used to process a file and generate its SQLite database.
"""

import os

def main():
    import idc
    from autodiff.vendor.diaphora import diaphora_ida
    
    os.environ["DIAPHORA_AUTO"] = "1"
    os.environ["DIAPHORA_USE_DECOMPILER"] = "1"
    os.environ["DIAPHORA_EXPORT_FILE"] = idc.ARGV[1]
    
    diaphora_ida.main()


if __name__ == "__main__":
    main()

(It would also be nice to have an export() function that took parameters to do this programmatically, instead of my hacky method here.)
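
As a rough sketch of what such a wrapper could look like on the caller's side: the `export()` signature and the `temporary_environ` helper below are hypothetical, not part of diaphora. Only the three `DIAPHORA_*` variable names come from the script above, and the sketch assumes the entry point reads them each time it is called and that leaving `DIAPHORA_USE_DECOMPILER` unset disables the decompiler.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_environ(overrides):
    """Apply overrides to os.environ (None means unset) and restore on exit."""
    saved = {name: os.environ.get(name) for name in overrides}

    def apply(values):
        for name, value in values.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value

    apply(overrides)
    try:
        yield
    finally:
        apply(saved)


def export(entry_point, export_file, use_decompiler=True):
    """Hypothetical wrapper: call entry_point with the DIAPHORA_* variables set.

    entry_point is whatever callable starts the export; in the script
    above that would be diaphora_ida.main.
    """
    overrides = {
        "DIAPHORA_AUTO": "1",
        # Assumption: an unset variable means "do not use the decompiler".
        "DIAPHORA_USE_DECOMPILER": "1" if use_decompiler else None,
        "DIAPHORA_EXPORT_FILE": str(export_file),
    }
    with temporary_environ(overrides):
        return entry_point()
```

Taking the entry point as an argument keeps the wrapper importable outside IDA and restores the environment afterwards, so repeated calls in a batch run do not leak settings into each other.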

jpsnyder avatar Mar 06 '23 19:03 jpsnyder