cuckooml

Resolving abbreviated malware names

Open So-Cool opened this issue 9 years ago • 2 comments

Right now, the first mapping found, which is the longest string matched, is used. To improve labelling, all possible matches need to be considered, and the most probable abbreviation combination, i.e. the one in which every sub-string of the name maps to a known abbreviation, should be chosen. For example, "adload" is currently split into "a" and "dload", with the latter mapped to downloader. A better split would be "ad" (adware) and "load" (downloader).
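One way the proposed behaviour could look is sketched below. It is not the current cuckooml code: the `ABBREVIATIONS` table, `full_splits` and `resolve` are hypothetical names, and only the "ad", "load" and "dload" entries come from the example above. The sketch enumerates every split of a name into known abbreviations that leaves nothing over, then picks the split with the fewest pieces. That tie-break is an assumption, since "most probable" is not defined here.

```python
from functools import lru_cache

# Hypothetical abbreviation table; only these three entries appear in the issue.
ABBREVIATIONS = {
    "ad": "adware",
    "load": "downloader",
    "dload": "downloader",
}


def full_splits(name, table=ABBREVIATIONS):
    """Return every split of `name` into keys of `table` with no characters left over."""

    @lru_cache(maxsize=None)
    def splits_from(start):
        # Reaching the end of the name means the pieces so far cover it completely.
        if start == len(name):
            return [[]]
        results = []
        for end in range(start + 1, len(name) + 1):
            piece = name[start:end]
            if piece in table:
                for rest in splits_from(end):
                    results.append([piece] + rest)
        return results

    return splits_from(0)


def resolve(name, table=ABBREVIATIONS):
    """Map `name` to (abbreviation, meaning) pairs, or None if no full split exists."""
    candidates = full_splits(name, table)
    if not candidates:
        return None
    # Assumed tie-break: prefer the split with the fewest pieces.
    best = min(candidates, key=len)
    return [(piece, table[piece]) for piece in best]


print(resolve("adload"))
```

With this table, "adload" has exactly one full split, "ad" + "load", because "a" is not a known abbreviation and so the "a" + "dload" split is rejected. A name with no full split returns `None`, at which point a caller could fall back to the existing longest-match behaviour.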

So-Cool, Jun 09 '16 15:06

How often does this occur? If there are not too many cases, such mappings can be added manually.

hgascon, Jun 10 '16 09:06

Not too often in the samples that I have, to be honest. Nevertheless, as there are quite a number of possible combinations, this could be quite useful in general. Let's see what happens with the labels when we're at the clustering stage.

So-Cool, Jun 10 '16 12:06