mwdb-core

Use pyspamsum instead of python-ssdeep

Open psrok1 opened this issue 3 years ago • 1 comments

Feature Category

  • [ ] Correctness
  • [ ] User Interface / User Experience
  • [ ] Performance
  • [x] Other (please explain)

Describe the problem

  • python-ssdeep requires libfuzzy-dev (preinstalled or built by setup.py), which complicates the build process and adds requirements that may be difficult to satisfy in some environments. ssdeep is a minor feature with no performance requirements; we don't even use the library to compare samples.
  • https://github.com/CERT-Polska/mwdb-core/issues/411

Describe the solution you'd like

  • Use a more self-contained library such as pyspamsum (https://github.com/freakboy3742/pyspamsum). The drawback is that it doesn't support a stream interface, so we need to load the whole file into memory during upload, but we don't use MWDB for huge file uploads anyway.
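
A minimal sketch of the "load the whole file into memory" trade-off, not a proposed patch. The names `read_all`, `fuzzy_hash` and the 50 MiB cap are hypothetical; the hashing function is passed in, so the sketch doesn't depend on any particular library. With pyspamsum it would presumably be `spamsum.spamsum`, assuming that function accepts the raw bytes.

```python
import io


def read_all(stream, max_size=50 * 1024 * 1024):
    """Read the whole upload into memory, refusing anything above max_size.

    max_size is an illustrative cap, not an MWDB setting.
    """
    # Read one byte past the cap so an oversized upload can be detected.
    data = stream.read(max_size + 1)
    if len(data) > max_size:
        raise ValueError("file too large to hash in memory")
    return data


def fuzzy_hash(stream, hasher, max_size=50 * 1024 * 1024):
    """Hash a file-like object with a non-streaming hasher(data) callable."""
    return hasher(read_all(stream, max_size))


# Usage (assumption: pyspamsum is installed and spamsum.spamsum takes bytes):
#   import spamsum
#   with open(path, "rb") as fd:
#       digest = fuzzy_hash(fd, spamsum.spamsum)
```

The explicit cap keeps an unexpectedly large upload from exhausting memory, which is the main risk of dropping the stream interface.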

Describe alternatives you've considered

  • Find a different self-contained library that supports hashing from an fd/IO stream
  • Write our own implementation :seat:
  • Make ssdeep evaluation optional
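
The last alternative could look roughly like the sketch below: import python-ssdeep if it is available and skip the hash otherwise. `calc_ssdeep` is a hypothetical helper name, and returning `None` for "not evaluated" is an assumption about how the caller would want to handle a missing hash.

```python
try:
    import ssdeep  # python-ssdeep; needs libfuzzy at build or run time
except ImportError:
    ssdeep = None  # library not installed: ssdeep evaluation is skipped


def calc_ssdeep(data: bytes):
    """Return the ssdeep hash of data, or None if python-ssdeep is missing."""
    if ssdeep is None:
        return None
    return ssdeep.hash(data)
```

This keeps libfuzzy out of the hard requirements, at the cost of objects uploaded on such installations having no ssdeep value.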

psrok1 avatar Jul 09 '21 15:07 psrok1

https://github.com/elceef/ppdeep looks even simpler.

jvoisin avatar Sep 17 '22 20:09 jvoisin