mwdb-core
Use pyspamsum instead of python-ssdeep
Feature Category
- [ ] Correctness
- [ ] User Interface / User Experience
- [ ] Performance
- [x] Other (please explain)
Describe the problem
- python-ssdeep requires libfuzzy-dev (preinstalled or built by setup.py), which complicates the build process and adds requirements that might be difficult to fulfill in some environments. It's a minor feature with no performance requirements, and we don't even use that library for comparison between samples.
- https://github.com/CERT-Polska/mwdb-core/issues/411
Describe the solution you'd like
- Use a more self-contained library like pyspamsum (https://github.com/freakboy3742/pyspamsum). The drawback is that it doesn't support a stream interface, so we need to load the whole file into memory during upload, but we don't use MWDB for huge file uploads anyway.
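A rough sketch of what "load the whole file into memory" could look like, with a size cap so that an unexpectedly large upload does not exhaust memory. The helper name `fuzzy_hash_stream` and the `MAX_FUZZY_HASH_SIZE` limit are hypothetical and not part of mwdb-core. The hashing function is passed in as a parameter; with pyspamsum it would presumably be `spamsum.spamsum`, but treat that as an assumption to check against the library's README.

```python
import io
from typing import BinaryIO, Callable

# Hypothetical cap on how much data we are willing to buffer for hashing.
MAX_FUZZY_HASH_SIZE = 64 * 1024 * 1024


def fuzzy_hash_stream(
    stream: BinaryIO,
    hash_bytes: Callable[[bytes], str],
    max_size: int = MAX_FUZZY_HASH_SIZE,
) -> str:
    """Read the whole stream into memory and hash it in one call.

    `hash_bytes` is any function that maps bytes to a fuzzy-hash string
    (assumed here: spamsum.spamsum from pyspamsum).
    """
    # Read one byte more than the limit so oversized input can be detected
    # without buffering the entire stream first.
    data = stream.read(max_size + 1)
    if len(data) > max_size:
        raise ValueError(f"input exceeds {max_size} bytes, refusing to buffer")
    return hash_bytes(data)


# Possible usage, assuming pyspamsum is installed:
#   import spamsum
#   with open("sample.bin", "rb") as f:
#       digest = fuzzy_hash_stream(f, spamsum.spamsum)
```

Passing the hasher as an argument keeps the buffering logic independent of whichever library is eventually chosen.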
Describe alternatives you've considered
- Find a different self-contained library that supports hashing from an fd/IO stream
- Write our own implementation :seat:
- Make ssdeep evaluation optional
https://github.com/elceef/ppdeep looks even simpler.
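The "make ssdeep evaluation optional" alternative could be combined with either library. Below is a minimal sketch, assuming the hash is simply skipped when no backend can be imported. The function names `load_fuzzy_hasher` and `calc_ssdeep` are hypothetical, and the backend entry points (`spamsum.spamsum` for pyspamsum, `ppdeep.hash` for ppdeep) are assumptions that should be verified against each project's documentation.

```python
import importlib
from typing import Callable, Optional, Sequence, Tuple

# (module name, function name) pairs to try in order.
# Assumed entry points; check each library's README before relying on them.
DEFAULT_BACKENDS: Sequence[Tuple[str, str]] = (
    ("spamsum", "spamsum"),
    ("ppdeep", "hash"),
)


def load_fuzzy_hasher(
    candidates: Sequence[Tuple[str, str]] = DEFAULT_BACKENDS,
) -> Optional[Callable[[bytes], str]]:
    """Return the first importable hashing function, or None if none is installed."""
    for module_name, func_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # backend not installed, try the next one
        func = getattr(module, func_name, None)
        if callable(func):
            return func
    return None


def calc_ssdeep(
    data: bytes, hasher: Optional[Callable[[bytes], str]]
) -> Optional[str]:
    """Compute the fuzzy hash, or return None when no backend is available."""
    if hasher is None:
        return None
    return hasher(data)
```

With this shape, an installation without any fuzzy-hashing library would store no ssdeep value for uploaded samples instead of failing at build time.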