cuckooml

Store multiple clustering results in the malware analysis JSON

Open So-Cool opened this issue 9 years ago • 2 comments

At the moment, at most one parameter setting and its clustering result can be stored per clustering algorithm. This should be extended to allow storing clustering results for multiple parameter settings. See the TODO tags in commit 1727bb00f64719084e4a9eb8972d82c95f5846c3.
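To make the request concrete, here is a minimal sketch of one possible layout, not the project's actual JSON schema. The field names `params` and `labels`, the helper `find_run`, and the sample values are all hypothetical; the only idea taken from the issue is that each algorithm should hold several runs rather than one.

```python
import json

# Hypothetical layout: each algorithm maps to a list of runs instead of a
# single result, and every run carries the parameters that produced it.
clustering = {
    "dbscan": [
        {"params": {"eps": 0.5, "min_samples": 5}, "labels": [0, 0, 1, -1]},
        {"params": {"eps": 1.0, "min_samples": 3}, "labels": [0, 0, 0, 1]},
    ],
    "hdbscan": [
        {"params": {"min_samples": 5, "min_cluster_size": 10},
         "labels": [0, 1, 1, -1]},
    ],
}


def find_run(results, algorithm, **params):
    """Return the stored run whose parameters match exactly, or None."""
    for run in results.get(algorithm, []):
        if run["params"] == params:
            return run
    return None


# The structure survives a JSON round trip unchanged.
serialised = json.dumps(clustering, indent=2)
restored = json.loads(serialised)
```

With this shape, `find_run(restored, "dbscan", eps=1.0, min_samples=3)` looks up a run by its parameters, and adding another parameter setting is just an append to the list.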

Aug 01 '16 15:08 So-Cool

Hey @So-Cool,

I would like to work on this enhancement.

So basically we would want a good, collision-free hash function that uses the parameters (`eps` and `min_samples` in the case of DBSCAN, and `min_samples` and `min_cluster_size` in the case of HDBSCAN) to generate the hash? Am I correct?

Jan 23 '17 21:01 greninja

Hi @greninja, it's great that you're willing to work on this. To avoid any kind of mess with your PRs, could you please first finalise the other two issues that you are working on?

A hash function is not really necessary, especially since it would need to be bidirectional. Storing the results is one problem, but retrieving them is another: we want users to be able to understand which parameters were used to get particular results.
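As a rough illustration of a readable, reversible alternative to a hash, the parameters themselves could be serialised into a canonical string and used as the key. This is only a sketch under assumptions: the function names `params_to_key`, `key_to_params`, `store_result` and `load_result` are hypothetical and do not exist in the code base.

```python
import json


def params_to_key(params):
    """Encode parameters as a canonical, human-readable JSON string.

    Sorting the keys makes the result independent of argument order.
    """
    return json.dumps(params, sort_keys=True, separators=(",", ":"))


def key_to_params(key):
    """Decode a key back into the parameter dictionary it was built from."""
    return json.loads(key)


def store_result(results, algorithm, params, labels):
    """Store labels under the algorithm name and the readable parameter key."""
    results.setdefault(algorithm, {})[params_to_key(params)] = labels


def load_result(results, algorithm, params):
    """Return the labels stored for these parameters, or None if absent."""
    return results.get(algorithm, {}).get(params_to_key(params))


results = {}
store_result(results, "dbscan", {"min_samples": 5, "eps": 0.5}, [0, 0, 1, -1])
store_result(results, "dbscan", {"eps": 1.0, "min_samples": 3}, [0, 0, 0, 1])
```

Unlike a hash, the key can be read directly and decoded with `key_to_params`, so users can see which settings produced a result. One caveat of this sketch: `3` and `3.0` serialise differently, so parameter types would need to be normalised before building the key.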

Jan 26 '17 09:01 So-Cool