Karol Trociński
This can yield better results than minhashing the whole disassembly, because it checks whether individual functions are similar to functions already in the database.
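A minimal sketch of per-function minhashing, using only the standard library. The function names, the 3-instruction shingle size, the 64-value signature length and the 0.8 threshold are all assumptions made for illustration; they are not taken from the existing kartons.

```python
import hashlib

NUM_HASHES = 64  # signature length; an arbitrary choice for this sketch


def shingles(instructions, k=3):
    """Return the set of k-grams (joined as strings) over an instruction list."""
    if not instructions:
        return set()
    if len(instructions) < k:
        return {" ".join(instructions)}
    return {" ".join(instructions[i:i + k]) for i in range(len(instructions) - k + 1)}


def _hash(seed, item):
    """Deterministic 64-bit hash of `item`, salted with `seed`."""
    digest = hashlib.blake2b(
        item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash(items):
    """MinHash signature: for each seed, the minimum hash over all items."""
    if not items:
        return None
    return tuple(min(_hash(seed, it) for it in items) for seed in range(NUM_HASHES))


def estimate_jaccard(sig_a, sig_b):
    """The fraction of equal signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def similar_functions(sample, database, threshold=0.8):
    """Return (sample_func, db_func, score) for pairs scoring at least `threshold`.

    Both arguments map a function name to its list of instructions (hypothetical layout).
    """
    db_sigs = {name: minhash(shingles(body)) for name, body in database.items()}
    hits = []
    for name, body in sample.items():
        sig = minhash(shingles(body))
        if sig is None:
            continue
        for db_name, db_sig in db_sigs.items():
            if db_sig is None:
                continue
            score = estimate_jaccard(sig, db_sig)
            if score >= threshold:
                hits.append((name, db_name, score))
    return hits
```

Comparing every sample function against every stored signature is quadratic; a real database would probably need an index (for example LSH banding) on top of this.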
A malware similarity profile should be standardized into a clean, understandable summary of a malware sample. Something like this: ```json { "profile": { "filename": "filename", "md5": "md5", "sha1": "sha1", "sha256":...
References: * https://neo4j.com/developer/graph-database/
Add a maldoc similarity karton based on embedded images and/or other characteristics of a document. References: https://github.com/jstrosch/graph-maldoc-similar-images
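One possible starting point, sketched with the standard library only. It assumes OOXML documents (zip archives that keep embedded files under a `media/` folder) and treats two images as the same only when they are byte-identical. Both are assumptions of this sketch, not necessarily what the referenced project does, and legacy OLE documents would need a different extractor. The helper names are hypothetical.

```python
import hashlib
import io
import zipfile


def embedded_image_hashes(document):
    """Return SHA-256 hashes of every file under a `media/` folder of an OOXML document.

    `document` may be a path or a binary file-like object.
    """
    hashes = set()
    with zipfile.ZipFile(document) as archive:
        for name in archive.namelist():
            if "/media/" in name and not name.endswith("/"):
                hashes.add(hashlib.sha256(archive.read(name)).hexdigest())
    return hashes


def shared_images(doc_a, doc_b):
    """Hashes of embedded images that appear in both documents."""
    return embedded_image_hashes(doc_a) & embedded_image_hashes(doc_b)


def make_demo_doc(images):
    """Build a tiny in-memory archive that mimics the OOXML media layout (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        for index, data in enumerate(images):
            archive.writestr(f"word/media/image{index}.png", data)
    buffer.seek(0)
    return buffer


doc_a = make_demo_doc([b"lure-image", b"logo-a"])
doc_b = make_demo_doc([b"lure-image", b"logo-b"])
common = shared_images(doc_a, doc_b)
```

Each shared hash could then become a relationship between the two samples. Exact hashing misses resized or recompressed images, so a perceptual hash would be the natural next step.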
Add a single karton for communicating with aurora. This allows the other kartons to be unit tested.
Add a karton that adds relationships between samples with similar strings. Optimization ideas: * Store the string length in the db and compare only strings whose lengths differ by a small factor. References: https://github.com/seatgeek/fuzzywuzzy
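A rough sketch of the length pre-filter. To stay dependency-free it uses `difflib.SequenceMatcher` from the standard library in place of fuzzywuzzy, and the function names and the 0.85 threshold are made up for illustration. For `SequenceMatcher.ratio()`, which is `2*M/T`, the number of matching characters `M` cannot exceed the shorter length, so the score is bounded by `2 / (1 + factor)`. That bound gives the largest length factor still worth comparing.

```python
from difflib import SequenceMatcher


def max_length_factor(threshold):
    """Largest longer/shorter length ratio at which ratio() can still reach `threshold`."""
    return 2.0 / threshold - 1.0


def length_compatible(a, b, factor):
    """True if the longer string is at most `factor` times as long as the shorter one."""
    shorter, longer = sorted((len(a), len(b)))
    return shorter > 0 and longer <= shorter * factor


def similar_strings(needle, candidates, threshold=0.85):
    """Return (candidate, score) pairs scoring at least `threshold`, best first."""
    factor = max_length_factor(threshold)
    results = []
    for candidate in candidates:
        if not length_compatible(needle, candidate, factor):
            continue  # cheap length check, skips the expensive comparison
        score = SequenceMatcher(None, needle, candidate).ratio()
        if score >= threshold:
            results.append((candidate, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

With the length stored in the db, the same bound turns into a range condition on the query (length between `len / factor` and `len * factor`), so most candidates are never fetched at all.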
* Minhashing functions instead of whole code.
Luqum will allow for using a Lucene-like query syntax instead of this cursed abomination I'm currently using. References: * https://github.com/jurismarches/luqum
Leave this dirty Jinja templating alone